
Asymptotic analysis of ML-covariance parameter estimators based on covariance approximations

Reinhard Furrer, Department of Mathematics, Winterthurerstrasse 190, 8057 Zurich, Switzerland
Michael Hediger, Department of Mathematics, Winterthurerstrasse 190, 8057 Zurich, Switzerland
Abstract

Given a zero-mean Gaussian random field with a covariance function that belongs to a parametric family of covariance functions, we introduce a new notion of likelihood approximations, termed truncated-likelihood functions. Truncated-likelihood functions are based on direct functional approximations of the presumed family of covariance functions. For compactly supported covariance functions, within an increasing-domain asymptotic framework, we provide sufficient conditions under which consistency and asymptotic normality of estimators based on truncated-likelihood functions are preserved. We apply our result to the family of generalized Wendland covariance functions and discuss several examples of Wendland approximations. For families of covariance functions that are not compactly supported, we combine our results with the covariance tapering approach and show that ML estimators, based on truncated-tapered likelihood functions, asymptotically minimize the Kullback-Leibler divergence, when the taper range is fixed.
Keywords: Gaussian random fields, compactly supported covariance functions, likelihood approximations, consistency, asymptotic normality, covariance tapering.

Email addresses: reinhard.furrer@math.uzh.ch, michael.hediger@math.uzh.ch

1   Introduction

1.1 On infill- and increasing-domain asymptotics

Maximum likelihood (ML) estimators for covariance parameters are highly popular in inference for random fields. Aiming towards asymptotic properties of such estimators, one needs to specify how the observation points and the associated sampling domain behave as the number of observation points increases. Two well-studied asymptotic frameworks are referred to as infill-domain asymptotics (also termed fixed-domain asymptotics) and increasing-domain asymptotics (see [13], p. 100, for an introduction of terms). In infill-domain asymptotics, observation points are sampled within a bounded sampling domain, whereas in increasing-domain asymptotics, the sampling domain grows as the number of observation points increases. When referring to infill- and increasing-domain asymptotics, one often places additional assumptions on the minimum distance between any two distinct observation points. In increasing-domain asymptotics, the latter distance is often assumed to be bounded away from zero, while in infill-domain asymptotics, one frequently assumes that distinct observation points can be sampled arbitrarily close to each other (see for example [37]). There is a fair amount of literature demonstrating that asymptotic properties of ML estimators for covariance parameters can be quite different under the two mentioned asymptotic frameworks (see [37] or, more recently, [6]). For example, it is known that some covariance parameters cannot be estimated consistently under an infill-domain asymptotic framework ([34], [36]), whereas they can be estimated consistently, under given regularity conditions, within an increasing-domain asymptotic framework ([25], [4]). It is worth noting that in infill-domain asymptotics, these results can depend on the dimension $d$ of the Euclidean space $\mathbb{R}^{d}$, where the random field is assumed to be observed. For example, when the true covariance function belongs to the Matérn family ([26]), and smoothness parameters are given, it is shown in [36] that for $d=1,2,3$, the scale and variance parameters cannot be estimated consistently via an ML approach in an infill-domain asymptotic framework. The case $d=4$ is still open, but for $d\geq 5$, it is shown in [2] that under infill-domain asymptotics, all covariance parameters of the Matérn family can be estimated consistently using an ML approach.

1.2 Compactly supported covariance functions

In recent years, dataset sizes have steadily increased, so that statistical analyses of random fields can become quite expensive in terms of computational resources (see for example [15] for a recent discussion). One prominent issue with large datasets is the large size of covariance matrices, constructed upon applying an underlying covariance function to given data. However, in certain fields of application, observed correlations are assumed to vanish beyond a certain cut-off distance (see [18], pp. 750-751, and references therein, for an example in meteorology, or also [10] and [19]). On the other hand, in the context of real valued random fields, it is common practice to multiply a presumed covariance function with a known positive-definite and compactly supported covariance function, called the covariance taper. The resulting compactly supported covariance function is referred to as the tapered covariance function. For an introduction to covariance tapering we refer to [17]. The use of compactly supported covariance functions can thus be of great importance for some fields of application. Not only do they potentially reflect the nature of the underlying covariance structure, but their application can also lead to sparse covariance matrices. The latter are helpful in view of the high computational costs associated with large datasets. An excellent introduction to the construction of compactly supported covariance functions, associated with stationary and isotropic Gaussian random fields, is given in [21]. Additional results are available in [35], [28] and [11].

1.3 Motivation

The parametric family of generalized Wendland covariance functions represents one example of a family of compactly supported covariance functions which allows, similar to the Matérn family, for a continuous parametrization of the smoothness (in the mean square sense) of the underlying random field. Its origin is due to Wendland ([32]) and an early adaptation for statistical applications was given by Gneiting ([20]). In its general form (see [21] and [28] for special cases), the generalized Wendland covariance function with smoothness parameters $\nu$ and $\kappa$, variance parameter $\sigma^{2}$ and range parameter $\beta$ is given by

\phi(t)\coloneqq\frac{\sigma^{2}}{\operatorname{B}(2\kappa,\nu+1)\beta^{2\kappa+\nu}}\int_{t}^{\beta}w(w^{2}-t^{2})^{\kappa-1}(\beta-w)^{\nu}\,dw, (1)

if $t\in[0,\beta)$, and is zero otherwise. In the above display, $\operatorname{B}$ is the beta function. For technical details about valid parameter values, we refer to [9] or Section 6 of the present article. Clearly, in comparison with closed-form covariance functions, computing (1) is cumbersome, as it involves numerical integration. Depending on the support $\beta$ and a set of locations $s_{1},\dotsc,s_{n}\in\mathbb{R}^{d}$, the $n\times n$ covariance matrix $\Sigma_{i,j}=\phi(\lVert s_{i}-s_{j}\rVert)$ requires at most $n(n-1)/2$ evaluations of (1). One strategy, which facilitates computing $\Sigma$, is to reduce the number of times (1) must be calculated. As an illustration, we give three examples which involve approximations $\tilde{\phi}_{i}$, $i=1,2,3$, of $\phi$ (respectively approximations $\widetilde{\Sigma}_{i}$ of $\Sigma$):

  1. ($\tilde{\phi}_{1}$) Truncation of the support

  2. ($\tilde{\phi}_{2}$) Linear interpolation

  3. ($\tilde{\phi}_{3}$) Addition of a nugget effect

For $\tilde{\phi}_{1}$, we truncate $\phi$ to obtain an approximation $\tilde{\phi}_{1}$ whose support is smaller than that of $\phi$. This becomes especially interesting when the original function $\phi$ tails off slowly (high degree of differentiability at the origin). As a result, $\widetilde{\Sigma}_{1}$ will be sparser than $\Sigma$. The idea behind $\tilde{\phi}_{2}$ is to predefine the points at which (1) is calculated. This is achieved by introducing a partition $0<t_{1}<\dotsc<t_{N}=\beta$ of the support of $\phi$ and interpolating linearly in between. Then, $\tilde{\phi}_{2}$ requires only $N$ evaluations of (1) and defines a closed-form approximation of $\phi$. Notice that $t_{1},\dotsc,t_{N}$ do not need to be equispaced. Finally, $\tilde{\phi}_{3}$ can be interpreted as a tuning option for a given approximation $\tilde{\phi}_{*}$ of $\phi$:

\tilde{\phi}_{3}(t)\coloneqq\begin{cases}\tilde{\phi}_{*}(t)+\delta,&t=0,\\ \tilde{\phi}_{*}(t),&t\neq 0,\end{cases}\qquad\delta\geq 0.

With regard to practical usage, this form of approximation increases numerical stability. Further, it allows for more flexibility in practice, where the number of observations $n$ is given and $\widetilde{\Sigma}_{*}$, based on $\tilde{\phi}_{*}$, might not be positive-definite.
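To make the three approximations concrete, the following Python sketch evaluates (1) by numerical quadrature and implements $\tilde{\phi}_{1}$, $\tilde{\phi}_{2}$ and $\tilde{\phi}_{3}$. It is an illustration only: the helper names and default parameter values are our own, and we assume SciPy is available (scipy.integrate.quad for the integral in (1)).

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import beta as beta_fn

def gw_phi(t, sigma2=1.0, beta=1.0, nu=6.0, kappa=4.5):
    """Generalized Wendland covariance (1), computed by numerical quadrature.
    Defaults satisfy nu >= (d + 1)/2 + kappa for d = 2."""
    if t >= beta:
        return 0.0
    val, _ = quad(lambda w: w * (w**2 - t**2)**(kappa - 1) * (beta - w)**nu,
                  t, beta)
    return sigma2 * val / (beta_fn(2 * kappa, nu + 1) * beta**(2 * kappa + nu))

def phi_1(t, C_m, **kw):
    """(phi~_1) Truncation of the support to [0, C_m]."""
    return gw_phi(t, **kw) if t <= C_m else 0.0

def phi_2(t, knots, **kw):
    """(phi~_2) Linear interpolation: N evaluations of (1) at 0 < t_1 < ... < t_N = beta."""
    vals = np.array([gw_phi(tk, **kw) for tk in knots])
    return float(np.interp(t, knots, vals))

def phi_3(t, delta, approx=gw_phi, **kw):
    """(phi~_3) Addition of a nugget effect delta >= 0 at the origin."""
    return approx(t, **kw) + (delta if t == 0.0 else 0.0)
```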

Following up on the above examples, we picture an approximation $\tilde{\phi}$ of $\phi$ (respectively an approximation $\widetilde{\Sigma}$ of $\Sigma$). Several questions arise:

  • What are conditions on $\tilde{\phi}$ which ensure that $\widetilde{\Sigma}$ is asymptotically (as $n\to\infty$) equivalent to $\Sigma$ and eventually (for $n$ large enough) remains positive-definite?

  • In terms of ML estimators for covariance parameters, how shall a log-likelihood approximation based on $\tilde{\phi}$ be defined?

  • Under which conditions on $\tilde{\phi}$ are ML estimators based on $\tilde{\phi}$ consistent and asymptotically normal?

In the more general setting of a given parametric family of covariance functions, the present study provides a concrete context in which these questions are answered by introducing the notion of truncated-ML estimators.

1.4 Framework and contribution

Truncated-ML estimators for covariance parameters are based on truncated-likelihood functions. The latter are defined upon parametric families of sequences of functions, which approximate a presumed family of covariance functions on a common domain. Colloquially, we will call these parametric sequences of functions covariance approximations. The respective matrices, constructed upon applying covariance approximations to a given collection of observation points, will be termed covariance matrix approximations. We will allow for covariance matrix approximations that are not necessarily positive semi-definite. Therefore, truncated-likelihood functions are more general than existing likelihood approximation methods such as low-rank, Vecchia, or covariance tapering approaches (see [22] for a summary of commonly used methods).

We work in an increasing-domain asymptotic framework, where collections of observation points are realizations of finite collections of a randomly perturbed regular grid (see also [4]). We consider a stationary Gaussian random field with zero-mean function and a true, unknown covariance function that belongs to a given parametric family of covariance functions. If the presumed family of covariance functions is compactly supported, we provide sufficient conditions under which truncated-ML estimators and (regular) ML estimators for covariance parameters are consistent and asymptotically normal. Some conditions imposed on families of covariance functions are identical to conditions already considered in [4]. The main difference is that we work with compactly supported covariance functions, which makes it possible to simplify some of the conditions set up in [4]. Regarding statistical applications, we apply these results to the family of generalized Wendland covariance functions. In contrast to the infill-domain asymptotic framework considered in [9], we show that, under the studied increasing-domain asymptotic framework and some conditions on the parameter space, (regular) ML estimators for variance and range parameters are consistent and asymptotically normal. Further, we show that the same asymptotic results are recovered for truncated-ML estimators based on various generalized Wendland approximations, such as truncations, linear interpolations and added nugget effects.

Additionally, we provide an extension to families of covariance functions which are not compactly supported. We combine our results with the covariance tapering approach. That is, we study covariance taper approximations and their asymptotic influence on the conditional Kullback-Leibler divergence of the misspecified distribution from the true distribution (see also [5]). We show that the latter divergence is minimized by truncated-tapered ML estimators.

1.5 Structure of the article

The rest of the article is organized as follows. Section 2 establishes the context. We introduce some primary notation, define the sampling domain and the random field itself. In Section 3 we introduce regularity conditions on covariance functions and approximations. In Section 4 we present intermediate asymptotic results on covariance matrices and approximations. Section 5 contains our main results: We introduce truncated-ML estimators and present results on consistency and asymptotic normality. In Section 6, we apply our results to the family of generalized Wendland covariance functions and discuss several examples of generalized Wendland approximations. Then, in the context of non-compactly supported covariance functions, Section 7 contains results on the asymptotic influence of taper approximations on the Kullback-Leibler divergence. Section 8 gives an outlook and some final comments. The Appendix is split into three parts. Covariance approximations for isotropic random fields are discussed in Appendix A. Appendix B contains additional supporting results, whereas all the proofs are left for Appendix C.

2   Context

2.1 Primary notation

The sets $\mathbb{N}_{+}$ and $\mathbb{R}_{+}$ denote the positive integers and the non-negative real numbers, respectively. For $d\in\mathbb{N}_{+}$, we use the notation $\mathrm{B}(x;r)$ ($\mathrm{B}[x;r]$) for the open (closed) ball of radius $r>0$ with center $x\in\mathbb{R}^{d}$. Given $n\in\mathbb{N}_{+}$, for some set $A\subset\mathbb{R}^{n}$, we write $\mathfrak{B}(A)$ for the Borel $\sigma$-algebra on $A$.

For a vector $w=(w_{1},\dotsc,w_{d})\in\mathbb{R}^{d}$, we write $\lVert w\rVert=\big(w_{1}^{2}+\cdots+w_{d}^{2}\big)^{1/2}$ for the Euclidean norm of $w$ on $\mathbb{R}^{d}$. In the case $d=1$ we use the notation $\lvert\cdot\rvert$ for the Euclidean norm. For two vectors $w,w'\in\mathbb{R}^{d}$, $\langle w,w'\rangle=w^{\mathrm{t}}w'=\sum_{i=1}^{d}w_{i}w'_{i}$ represents the inner product that induces $\lVert\cdot\rVert$ on $\mathbb{R}^{d}$. Given $D\subset\mathbb{R}^{d}$, we write $\mathcal{B}_{\text{C}}(D;S)$ for the space of real valued, uniformly bounded functions on $D$, having compact support $S\subset D$. If $f\in\mathcal{B}_{\text{C}}(D;S)$ and $f$ is also continuous, we use the notation $\mathcal{C}_{\text{C}}(D;S)$ instead of $\mathcal{B}_{\text{C}}(D;S)$. For $f\in\mathcal{C}_{\text{C}}(D;S)$ we write $\lVert f\rVert_{\infty}=\sup\{\lvert f(h)\rvert\colon h\in D\}$ for the uniform norm on $\mathcal{C}_{\text{C}}(D;S)$. For vectors $w\in\mathbb{R}^{d}$, $\lvert w\rvert_{\infty}=\max_{i=1,\dotsc,d}\lvert w_{i}\rvert$ denotes the uniform norm on $\mathbb{R}^{d}$.

For a real $n\times n$ matrix $A$, $\lVert A\rVert_{2}=\max_{\{z\colon z^{\mathrm{t}}z=1\}}\langle z,A^{\mathrm{t}}Az\rangle^{1/2}$ denotes the spectral norm of $A$. We write $A\succ 0$ ($A\prec 0$) to indicate that $A$ is positive-definite (negative-definite). Further, $\lambda_{1}(A)\geq\cdots\geq\lambda_{n}(A)$ denote the $n$ real eigenvalues of a matrix $A\in\mathrm{S}_{n\times n}(\mathbb{R})$, where $\mathrm{S}_{n\times n}(\mathbb{R})$ represents the space of real symmetric $n\times n$ matrices.

We use the notation $\nabla f(x)=\big(\frac{\partial f}{\partial x_{1}}(x),\dotsc,\frac{\partial f}{\partial x_{p}}(x)\big)$ for the gradient of $f$ at $x$, where $x\mapsto f(x)$ is any differentiable, real valued function defined on some $E\subset\mathbb{R}^{p}$. Further, for a vector valued, differentiable function $g(x)=(g_{1}(x),\dotsc,g_{m}(x))$, with values in $\mathbb{R}^{m}$, defined on some $U\subset\mathbb{R}^{p}$, we write $J_{g}(x)_{l,k}=\frac{\partial g_{l}}{\partial x_{k}}(x)$, $1\leq l\leq m$, $1\leq k\leq p$, for the Jacobi-matrix of $g$ at $x$.

A mapping $Y$ from a probability space $(\Omega,\mathcal{F},\mathbb{P})$ to a measure space $(E,\mathcal{A})$ will be called a random element if it is $\mathcal{F}/\mathcal{A}$ measurable. If we write that $Y\colon(\Omega,\mathcal{F})\to(E,\mathcal{A})$ is measurable, we mean that it is $\mathcal{F}/\mathcal{A}$ measurable. If $(Y_{n})_{n\in\mathbb{N}_{+}}$ denotes a sequence of random elements, where for any $n\in\mathbb{N}_{+}$, $Y_{n}$ is a mapping from a probability space $(\Omega,\mathcal{F},\mathbb{P})$ to a measure space $(E,\mathcal{A})$, we use the notation

Y_{n}\xrightarrow[n\to\infty]{\mathbb{P}}Y\quad\text{and}\quad Y_{n}\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{L},

to indicate convergence of $(Y_{n})_{n\in\mathbb{N}_{+}}$ to a random element $Y$ in probability and in distribution, respectively. Note that for convergence in distribution, the introduced notation indicates that the limit $Y$ has law $\mathcal{L}$ on $(E,\mathcal{A})$. A sequence of estimators $(\hat{\theta}_{n})_{n\in\mathbb{N}_{+}}$ for $\theta_{0}\in\mathbb{R}^{p}$ will be referred to as consistent if it converges in probability to $\theta_{0}$. Finally, $\mathcal{N}(\mu,\Sigma)$ indicates a multivariate normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$.

2.2 Random sampling scheme

On a probability space $(\Omega,\mathcal{F},\mathbb{P})$, we consider a real valued Gaussian random function $Z$ which has sample functions on $\mathbb{R}^{d}$. We assume that $Z$ is stationary (homogeneous) with zero-mean function and covariance function $c_{\theta_{0}}(s)$, $s\in\mathbb{R}^{d}$, where $\theta_{0}\in\Theta$, with $\Theta\subset\mathbb{R}^{p}$ compact and convex. Thus, we consider a real valued random field $\{Z_{s}\colon s\in\mathbb{R}^{d}\}$ whose true and unknown covariance function $c_{\theta_{0}}$ belongs to a family of covariance functions $\{c_{\theta}\colon\theta\in\Theta\}$.

Let $\mathcal{Q}\coloneqq[-1,1]^{d}$ and let $X\colon\Omega\to\mathcal{Q}^{\mathbb{N}_{+}}$ be a stochastic process, defined on the same probability space $(\Omega,\mathcal{F},\mathbb{P})$, but independent of $Z$. We assume that $(X_{i})_{i\in\mathbb{N}_{+}}$ is a sequence of independent random vectors with common law on $\mathcal{Q}$, which has a strictly positive probability density function on $\mathcal{Q}$ (see also Remark 2.1). Given $\tau\in[0,1/2)$ and a sequence of deterministic points $(v_{i})_{i\in\mathbb{N}_{+}}$, with $v_{i}\in\mathbb{N}_{+}^{d}$, we define a randomly perturbed regular grid $S$ as the process

\big\{S_{i}\coloneqq v_{i}+\tau X_{i}\colon i\in\mathbb{N}_{+}\big\}, (2)

where we assume that for all $I\in\mathbb{N}_{+}$, $\{v_{i},1\leq i\leq I^{d}\}=\{1,\dotsc,I\}^{d}$. Therefore, for any $\omega\in\Omega$, $S(\omega)$ is a sequence on $\mathbb{N}_{+}$, with image $S[\mathbb{N}_{+}](\omega)\subset\prod_{i=1}^{\infty}\big(v_{i}+\tau\mathcal{Q}\big)\eqqcolon\mathcal{G}$, and any first $I^{d}$ coordinates are in $\{1,\dotsc,I\}^{d}+\tau\mathcal{Q}$ (see also Figure 1). At this point we remark that, if nothing is mentioned, the parameter $\tau\in[0,1/2)$ and the sequence $(v_{i})_{i\in\mathbb{N}_{+}}$ shall be fixed. Let $X_{(n)}\coloneqq(X_{1},\dotsc,X_{n})$ and $S_{(n)}\coloneqq(S_{1},\dotsc,S_{n})$ denote finite collections of $X$ and $S$, respectively. We use the notation $x_{(n)}\coloneqq(x_{1},\dotsc,x_{n})$ for a vector that contains the first $n$ entries of a given sequence in $\mathcal{Q}^{\mathbb{N}_{+}}$. Correspondingly, given $\tau\in[0,1/2)$, $v_{1},\dotsc,v_{n}$ and $x_{(n)}\in\mathcal{Q}^{n}$, we write $s_{(n)}\coloneqq(s_{1},\dotsc,s_{n})$, $s_{i}=v_{i}+\tau x_{i}$, for $n$ perturbed grid locations in $\mathcal{G}_{n}\coloneqq\prod_{i=1}^{n}(v_{i}+\tau\mathcal{Q})$.
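For illustration, the following minimal NumPy sketch (with arbitrary illustrative values for $I$, $d$ and $\tau$) draws a realization of (2) and checks the minimum-spacing bound $1-2\tau$ discussed in Remark 2.1 below.

```python
import numpy as np
from scipy.spatial.distance import pdist

def perturbed_grid(I, d=2, tau=0.4, rng=None):
    """Realization of S_i = v_i + tau * X_i, {v_i} = {1, ..., I}^d, cf. (2).
    The X_i are i.i.d. uniform on Q = [-1, 1]^d (strictly positive density on Q)."""
    rng = np.random.default_rng() if rng is None else rng
    axes = [np.arange(1, I + 1)] * d
    v = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, d)
    return v + tau * rng.uniform(-1.0, 1.0, size=v.shape)

s_n = perturbed_grid(I=10, tau=0.4)     # n = 100 locations in [1, 10]^2 + 0.4*Q
assert pdist(s_n).min() >= 1 - 2 * 0.4  # minimum spacing Delta_tau, cf. (4)
```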

On $(\Omega,\mathcal{F},\mathbb{P})$, we define the random vector

\omega\mapsto Z_{(n)}(\omega)\coloneqq\big(Z_{S_{1}(\omega)}(\omega),\dotsc,Z_{S_{n}(\omega)}(\omega)\big)=(z_{s_{1}},\dotsc,z_{s_{n}})\eqqcolon z_{(n)}, (3)

which denotes $Z$ observed at a finite collection of $S$. The situation where a Gaussian random field is assumed to be observed at a randomly perturbed regular grid, with parameter $\tau$ and deterministic points $(v_{i})_{i\in\mathbb{N}_{+}}$, as introduced above, is also considered in [4].

Figure 1: For $\tau=0.4$ and $I\in\mathbb{N}_{+}$, a random field $Z$ is observed at two realizations $s_{i}$ and $s_{j}$ of $S_{i}=v_{i}+\tau X_{i}$ and $S_{j}=v_{j}+\tau X_{j}$, $i\neq j$, $1\leq i,j\leq I^{2}$. Dotted and dashed lines mark the borders of the ranges of $S_{i}$ and $S_{j}$, respectively.

Given $\theta\in\Theta$, we let $\Sigma_{\theta}(s_{(n)})\coloneqq[c_{\theta}(s_{i}-s_{j})]_{1\leq i,j\leq n}$ denote the non-random $n\times n$ covariance matrix based on an arbitrary $s_{(n)}\in\mathcal{G}_{n}$. On $(\Omega,\mathcal{F},\mathbb{P})$, we write

\omega\mapsto\Sigma_{n,\theta}(\omega)\coloneqq\Sigma_{\theta}(S_{(n)}(\omega)),\quad\theta\in\Theta,

for the $n\times n$ random covariance matrix based on a finite collection $S_{(n)}$ of $S$.

Remark 2.1.

Some technical remarks are worth pointing out. We assume that the random function $Z(s,\omega)\coloneqq Z_{s}(\omega)$, $s\in\mathbb{R}^{d}$, is measurable as a function from the measure space $(\mathbb{R}^{d}\times\Omega,\mathfrak{B}(\mathbb{R}^{d})\otimes\mathcal{F})$ to $(\mathbb{R},\mathfrak{B}(\mathbb{R}))$. That is to say, $Z$ is (jointly) measurable. This condition ensures that the components $\omega\mapsto Z_{S_{i}(\omega)}(\omega)=Z(S_{i}(\omega),\omega)$, $i=1,\dotsc,n$, of (3) are $\mathcal{F}/\mathfrak{B}(\mathbb{R})$ measurable as the composition of the measurable functions $\omega\mapsto(S_{i}(\omega),\omega)$ and $(s,\omega)\mapsto Z(s,\omega)$. Thus, the random vector $Z_{(n)}$ is well defined. Since $Z$ and $S$ are independent, it is readily seen that the conditional distribution of $Z_{(n)}$ given $S_{(n)}=s_{(n)}$ is Gaussian, with characteristic function $\operatorname{exp}(-(1/2)a^{\mathrm{t}}\Sigma_{n,\theta_{0}}(\omega)a)$, $a\in\mathbb{R}^{n}$. In addition, we note that for fixed $\omega\in\Omega$, $S[\mathbb{N}_{+}](\omega)$ is not bounded, and if we define $\Delta_{\tau}\coloneqq 1-2\tau$, we are given some fixed $\Delta_{\tau}>0$, independent of $n\in\mathbb{N}_{+}$ and $\theta\in\Theta$, such that

\inf_{n\in\mathbb{N}_{+}}\inf_{\substack{1\leq i,j\leq n\\ i\neq j}}\lVert s_{i}-s_{j}\rVert\geq\Delta_{\tau}. (4)

Hence, we are in an increasing-domain asymptotic framework where the minimum distance between any two distinct observation points is bounded away from zero. The assumption that, for any given $i\in\mathbb{N}_{+}$, $X_{i}$ has a strictly positive probability density function on $\mathcal{Q}$ is purely technical (see also the proof of Theorem 5.2). As can be seen from the mentioned proof, if $\tau=0$, the assumption becomes redundant.

3   Regularity conditions on covariance functions and covariance approximations

3.1 Regularity conditions on the family of covariance functions

Assumption 3.1 (Regularity conditions on $c_{\theta}$).

  1. There exist real constants $C$, $L<\infty$, which are independent of $\theta\in\Theta$, such that $c_{\theta}\in\mathcal{B}_{\text{C}}(\mathbb{R}^{d};S_{\theta})$, with $S_{\theta}\subset\mathrm{B}[0;C]$ and $\lVert c_{\theta}\rVert_{\infty}\leq L$.

  2. For any $s\in\mathbb{R}^{d}$, the first, second and third order partial derivatives of $\theta\mapsto c_{\theta}(s)$ exist. In addition, for any $q=1,2,3$, $i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}$, $\frac{\partial^{q}c_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\in\mathcal{B}_{\text{C}}\big(\mathbb{R}^{d};S_{\theta}(i_{1},\dotsc,i_{q})\big)$, where there exist constants $C'$, $L'<\infty$, independent of $\theta\in\Theta$, such that $S_{\theta}(i_{1},\dotsc,i_{q})\subset\mathrm{B}[0;C']$ and $\big\lVert\frac{\partial^{q}c_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\big\rVert_{\infty}\leq L'$.

  3. Fourier inversion holds, that is, for any $\theta\in\Theta$,

    c_{\theta}(s)=\int_{\mathbb{R}^{d}}\hat{c}_{\theta}(f)\operatorname{e}^{\mathrm{i}\langle f,s\rangle}\,\mathrm{d}f,

    with $\Theta\times\mathbb{R}^{d}\ni(\theta,f)\mapsto\hat{c}_{\theta}(f)$ continuous and strictly positive.

Remark 3.1.

Note that (1) and (2) of Assumption 3.1 differ from the conditions assumed in [4] (compare also to Condition 3.2 imposed in [5], or Condition 4 stated in [7]). In [4] it is assumed that a given covariance function $k_{\theta}$ is not only bounded on $\mathbb{R}^{d}$, but also decays sufficiently fast in the Euclidean norm on $\mathbb{R}^{d}$. Explicitly, it is assumed in Condition 2.1 of [4] that there exists a finite constant $A$, independent of $\theta\in\Theta$, such that for any $s\in\mathbb{R}^{d}$, $\lvert k_{\theta}(s)\rvert\leq A/(1+\lVert s\rVert^{d+1})$. This polynomial decay condition on $k_{\theta}$ can be interpreted as a summability condition on the entries of the respective covariance matrices $K_{\theta}(s_{(n)})_{i,j}\coloneqq k_{\theta}(s_{i}-s_{j})$, which guarantees that the maximal eigenvalues of $K_{\theta}(s_{(n)})$ are uniformly bounded in $n\in\mathbb{N}_{+}$, $s_{(n)}\in\mathcal{G}_{n}$ and $\theta\in\Theta$ (see Lemmas D.1 and D.5 in [4]). Note that the exponent $d+1$ can be replaced by $d+\alpha$, with $\alpha>0$ some fixed constant (see also (6) in [6]). In the present study we show that, under the assumption of a minimal spacing between any two distinct observation points, if $c_{\theta}$ has compact support on $\mathbb{R}^{d}$, the number of possible observation points which are covered by the support of $c_{\theta}$ must be bounded uniformly in $n\in\mathbb{N}_{+}$, $s_{(n)}\in\mathcal{G}_{n}$ and $\theta\in\Theta$ (see Lemma B.1). This, together with the condition that $c_{\theta}$ is also uniformly bounded on $\Theta$ and $\mathbb{R}^{d}$, will be sufficient to conclude that the maximal eigenvalues of $\Sigma_{\theta}(s_{(n)})$ are uniformly bounded in $n\in\mathbb{N}_{+}$, $s_{(n)}\in\mathcal{G}_{n}$ and $\theta\in\Theta$ (see Lemmas 4.1 and B.3). Similar remarks can be made with regard to the conditions imposed on the partial derivatives of $c_{\theta}$ with respect to $\theta$ (see Lemma B.5). In addition, (3) of Assumption 3.1 is also imposed in [4] (compare also to [8] and [7]). It guarantees that the minimal eigenvalues of $\Sigma_{\theta}(s_{(n)})$ are bounded from below, uniformly in $n\in\mathbb{N}_{+}$, $s_{(n)}\in\mathcal{G}_{n}$ and $\theta\in\Theta$ (see Lemmas 4.1 and B.3). Finally, we remark that within the framework of compactly supported covariance functions, the given conditions are very minimal and can be considered classical in the context of ML estimation, especially if one is not interested in the asymptotic distribution and rather seeks conditions under which ML estimators are consistent (with regard to a concrete example, we refer to Remark 6.2).

3.2 Regularity conditions on the family of covariance approximations

Given $\theta\in\Theta$, we let $(\tilde{c}_{m,\theta})_{m\in\mathbb{N}_{+}}$ denote a sequence of real valued functions defined on $\mathbb{R}^{d}$. The families $\big\{(\tilde{c}_{m,\theta})_{m\in\mathbb{N}_{+}}\colon\theta\in\Theta\big\}$ can be put under the following assumption.

Assumption 3.2 (Regularity conditions on $\tilde{c}_{m,\theta}$).

  1. For any $\theta\in\Theta$ and $m\in\mathbb{N}_{+}$, the function $\tilde{c}_{m,\theta}\colon\big(\mathbb{R}^{d},\mathfrak{B}(\mathbb{R}^{d})\big)\to\big(\mathbb{R},\mathfrak{B}(\mathbb{R})\big)$ is measurable and such that $\tilde{c}_{m,\theta}(s)=\tilde{c}_{m,\theta}(-s)$ for any $s\in\mathbb{R}^{d}$.

  2. For any $m\in\mathbb{N}_{+}$, $\tilde{c}_{m,\theta}$ satisfies (1) of Assumption 3.1, where the respective constants $\widetilde{C}$ and $\widetilde{L}$ can further be chosen independently of $m\in\mathbb{N}_{+}$.

  3. $\sup_{\theta\in\Theta}\lVert\tilde{c}_{m,\theta}-c_{\theta}\rVert_{\infty}\xrightarrow[]{m\to\infty}0$.

  4. For any $m\in\mathbb{N}_{+}$, $\tilde{c}_{m,\theta}$ satisfies (2) of Assumption 3.1, where the respective constants $\widetilde{C}'$ and $\widetilde{L}'$ can further be chosen independently of $m\in\mathbb{N}_{+}$.

  5. For any $q=1,2,3$, $i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}$, we have that

    \sup_{\theta\in\Theta}\left\lVert\frac{\partial^{q}\tilde{c}_{m,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}-\frac{\partial^{q}c_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\right\rVert_{\infty}\xrightarrow[]{m\to\infty}0.

To ease notation, we write $(\tilde{c}_{m,\theta})\coloneqq(\tilde{c}_{m,\theta})_{m\in\mathbb{N}_{+}}$. In the following, we formally introduce covariance matrix approximations (random and non-random versions). To do so, let $r\colon\mathbb{N}_{+}\to\mathbb{N}_{+}$ be such that $r(n)\to\infty$ as $n\to\infty$. Given $s_{(n)}\in\mathcal{G}_{n}$, we let $\widetilde{\Sigma}_{\theta}(s_{(n)})\coloneqq[\tilde{c}_{r(n),\theta}(s_{i}-s_{j})]_{1\leq i,j\leq n}$ denote the non-random $n\times n$ matrix based on a given family $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$. Then, on $(\Omega,\mathcal{F},\mathbb{P})$, if $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ is a family of Borel measurable sequences of functions, we write

\omega\mapsto\widetilde{\Sigma}_{n,\theta}(\omega)\coloneqq\widetilde{\Sigma}_{\theta}(S_{(n)}(\omega)),

for the $n\times n$ random matrix based on a finite collection $S_{(n)}$ of $S$. Colloquially, we will use the term covariance approximation when we refer to a given family $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ which can approximate a family of covariance functions $\{c_{\theta}\colon\theta\in\Theta\}$ in the sense of Assumption 3.2. In these terms, $\{c_{\theta}\colon\theta\in\Theta\}$ itself is a covariance approximation. The expression covariance matrix approximation will be used for both $\widetilde{\Sigma}_{\theta}(s_{(n)})$ and its random version $\widetilde{\Sigma}_{n,\theta}$. Similarly, we use the expression covariance matrix for both $\Sigma_{\theta}(s_{(n)})$ and $\Sigma_{n,\theta}$.
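As an illustration (a sketch under the notation above; c_theta and c_tilde_theta stand for user-supplied Python functions of the lag vector implementing $c_{\theta}$ and $\tilde{c}_{r(n),\theta}$), the matrices $\Sigma_{\theta}(s_{(n)})$ and $\widetilde{\Sigma}_{\theta}(s_{(n)})$ can be assembled from all pairwise lags:

```python
import numpy as np

def cov_matrix(s_n, cov_fun):
    """n x n matrix [cov_fun(s_i - s_j)]_{1 <= i, j <= n}, s_n of shape (n, d)."""
    lags = s_n[:, None, :] - s_n[None, :, :]   # all pairwise lags, shape (n, n, d)
    return np.apply_along_axis(cov_fun, -1, lags)

# Sigma       = cov_matrix(s_n, c_theta)        # covariance matrix
# Sigma_tilde = cov_matrix(s_n, c_tilde_theta)  # covariance matrix approximation
# Lemma 4.1 below states that, uniformly in theta, the spectral distance
# np.linalg.norm(Sigma - Sigma_tilde, ord=2) tends to zero as n grows.
```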

Remark 3.2.

(1), (2) and (4) of Assumption 3.2 are natural extensions of (1) and (2) of Assumption 3.1. Notice that the measurability condition imposed in (1) of Assumption 3.2 ensures that $\widetilde{\Sigma}_{n,\theta}$ is $\mathcal{F}/\mathfrak{B}(\mathbb{R}^{n^{2}})$ measurable. Condition (3) of Assumption 3.2 specifies in which sense a family $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ approximates the family $\{c_{\theta}\colon\theta\in\Theta\}$. We require that $(\tilde{c}_{m,\theta})$ converges uniformly on $\mathbb{R}^{d}$ to $c_{\theta}$, where the convergence is also uniform on the parameter space $\Theta$. In fact, we will show (see Lemmas B.3 and 4.1) that the uniform convergence of $(\tilde{c}_{m,\theta})$ to $c_{\theta}$, together with the condition that the families $\{c_{\theta}\colon\theta\in\Theta\}$ and $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ have uniformly bounded compact support, are, among others, sufficient criteria to prove that the matrices $\Sigma_{\theta}(s_{(n)})$ and $\widetilde{\Sigma}_{\theta}(s_{(n)})$ are asymptotically (as $n\to\infty$) equivalent, uniformly on $\Theta$ and $\mathcal{G}$. Condition (5) of Assumption 3.2 will allow us to conclude that a similar result holds true for the first, second and third order partial derivatives (with respect to $\theta$) of $\widetilde{\Sigma}_{\theta}(s_{(n)})$ and $\Sigma_{\theta}(s_{(n)})$. For concrete examples of covariance approximations, where the conditions of Assumption 3.2 are verified, we refer to Section 6.

4   Uniform asymptotic equivalence of covariance matrices and covariance matrix approximations

This section presents intermediate results on covariance matrices and approximations. In particular, Lemma 4.1 gives precise conditions under which $\widetilde{\Sigma}_{n,\theta}$ eventually (for $n$ large enough) remains positive-definite with $\mathbb{P}$ probability one.

Lemma 4.1.

Assume that the family $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies (1) and (3) of Assumption 3.1. Consider $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ that satisfies (1), (2) and (3) of Assumption 3.2. Then, we have that $\mathbb{P}$ a.s.

\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\big\lVert\Sigma_{n,\theta}\big\rVert_{2}<\infty\quad\text{and}\quad\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\big\lVert\widetilde{\Sigma}_{n,\theta}\big\rVert_{2}<\infty.

In particular, we can conclude that $\mathbb{P}$ a.s.

\sup_{\theta\in\Theta}\big\lVert\Sigma_{n,\theta}-\widetilde{\Sigma}_{n,\theta}\big\rVert_{2}\xrightarrow[]{n\to\infty}0.

Further, it is true that $\mathbb{P}$ a.s.

\inf_{n\in\mathbb{N}_{+}}\inf_{\theta\in\Theta}\lambda_{n}\big(\Sigma_{n,\theta}\big)>0,

and there exists $N\in\mathbb{N}_{+}$ such that $\mathbb{P}$ a.s.

\inf_{n\geq N}\inf_{\theta\in\Theta}\lambda_{n}\big(\widetilde{\Sigma}_{n,\theta}\big)>0.

5   Truncated-ML estimators

Given a square matrix $A$, we define $\operatorname{det}_{+}(A)$ to be the product of the strictly positive eigenvalues of $A$; if all of the eigenvalues are less than or equal to zero, $\operatorname{det}_{+}(A)=1$. Further, we use the notation $A^{+}$ for the pseudoinverse of $A$ (sometimes called the Moore-Penrose inverse). For the given collection $\{c_{\theta}\colon\theta\in\Theta\}$, we define, on $(\Omega,\mathcal{F},\mathbb{P})$, for any $n\in\mathbb{N}_{+}$ and $\theta\in\Theta$, the random variable

l_{n}(\theta)\coloneqq\frac{1}{n}\operatorname{log}\big(\operatorname{det}_{+}(\Sigma_{n,\theta})\big)+\frac{1}{n}\big\langle Z_{(n)},\Sigma_{n,\theta}^{+}Z_{(n)}\big\rangle. (5)

Given $\omega\in\Omega$, $\theta\mapsto l_{n}(\theta)(\omega)$ shall be called the truncated-modified log-likelihood function based on $\{c_{\theta}\colon\theta\in\Theta\}$. A sequence of estimators $\big(\hat{\theta}_{n}(c)\big)_{n\in\mathbb{N}_{+}}$, defined on $(\Omega,\mathcal{F},\mathbb{P})$, will be called a sequence of truncated-ML estimators for $\theta_{0}$ based on $\{c_{\theta}\colon\theta\in\Theta\}$ if, for any $n\in\mathbb{N}_{+}$,

\hat{\theta}_{n}(c)\in\operatorname*{argmin}_{\theta\in\Theta}l_{n}(\theta).

Similarly, on $(\Omega,\mathcal{F},\mathbb{P})$, for a given collection of sequences of real valued functions $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$, we introduce, for any $n\in\mathbb{N}_{+}$ and $\theta\in\Theta$, the random variable

\tilde{l}_{n}(\theta)\coloneqq\frac{1}{n}\operatorname{log}\big(\operatorname{det}_{+}(\widetilde{\Sigma}_{n,\theta})\big)+\frac{1}{n}\big\langle Z_{(n)},\widetilde{\Sigma}_{n,\theta}^{+}Z_{(n)}\big\rangle. (6)

Then, for $\omega\in\Omega$, the function $\theta\mapsto\tilde{l}_{n}(\theta)(\omega)$ denotes the truncated-modified log-likelihood function based on $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$. A sequence of estimators $\big(\hat{\theta}_{n}(\tilde{c})\big)_{n\in\mathbb{N}_{+}}$, defined on $(\Omega,\mathcal{F},\mathbb{P})$, will be called a sequence of truncated-ML estimators for $\theta_{0}$ based on $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ if, for any $n\in\mathbb{N}_{+}$,

\hat{\theta}_{n}(\tilde{c})\in\operatorname*{argmin}_{\theta\in\Theta}\tilde{l}_{n}(\theta). (7)
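The following sketch shows how (6) and the minimization in (7) can be evaluated numerically (NumPy/SciPy; Sigma_tilde_fun, Z_n and theta_init are assumed inputs, and the eigenvalue tolerance is a practical choice, not part of the theory): $\operatorname{det}_{+}$ keeps only the strictly positive eigenvalues, while the Moore-Penrose pseudoinverse inverts all nonzero ones.

```python
import numpy as np
from scipy.optimize import minimize

def truncated_loglik(theta, Z_n, Sigma_tilde_fun, tol=1e-10):
    """Truncated-modified log-likelihood l~_n(theta) of (6)."""
    n = Z_n.shape[0]
    lam, U = np.linalg.eigh(Sigma_tilde_fun(theta))  # symmetric eigendecomposition
    pos, nonzero = lam > tol, np.abs(lam) > tol
    # log det_+: product over strictly positive eigenvalues (det_+ := 1 if none)
    logdet_plus = np.log(lam[pos]).sum() if pos.any() else 0.0
    # <Z_(n), Sigma^+ Z_(n)> via the Moore-Penrose pseudoinverse
    w = U[:, nonzero].T @ Z_n
    quad_form = np.sum(w**2 / lam[nonzero])
    return (logdet_plus + quad_form) / n

# theta_hat = minimize(truncated_loglik, theta_init,
#                      args=(Z_n, Sigma_tilde_fun), method="Nelder-Mead").x  # cf. (7)
```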

At this point it is important to note that for a given $\omega\in\Omega$, it is in general not true that $l_{n}(\theta)(\omega)$ and $\tilde{l}_{n}(\theta)(\omega)$ are continuous in $\theta$ for any $n\in\mathbb{N}_{+}$. Nevertheless, a consequence of Lemma 4.1 is the following proposition:

Proposition 5.1.

Assume that the family $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies (1) and (3) of Assumption 3.1. Consider $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ that satisfies (1), (2) and (3) of Assumption 3.2. Then, we have that for any $n\in\mathbb{N}_{+}$, $\mathbb{P}$ a.s.,

l_{n}(\theta)=\frac{1}{n}\operatorname{log}\big(\operatorname{det}(\Sigma_{n,\theta})\big)+\frac{1}{n}\big\langle Z_{(n)},\Sigma_{n,\theta}^{-1}Z_{(n)}\big\rangle.

Further, there exists $N\in\mathbb{N}_{+}$ such that for any $n\geq N$, $\mathbb{P}$ a.s.,

\tilde{l}_{n}(\theta)=\frac{1}{n}\operatorname{log}\big(\operatorname{det}(\widetilde{\Sigma}_{n,\theta})\big)+\frac{1}{n}\big\langle Z_{(n)},\widetilde{\Sigma}_{n,\theta}^{-1}Z_{(n)}\big\rangle,

and we have that

\sup_{\theta\in\Theta}\big\lvert l_{n}(\theta)-\tilde{l}_{n}(\theta)\big\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0.

Using Proposition 5.1, we notice that if, for any $s\in\mathbb{R}^{d}$ and $m\in\mathbb{N}_{+}$, both $\theta\mapsto c_{\theta}(s)$ and $\theta\mapsto\tilde{c}_{m,\theta}(s)$ are $k$ times differentiable, then $\theta\mapsto l_{n}(\theta)(\omega)$ and $\theta\mapsto\tilde{l}_{n}(\theta)(\omega)$ are $k$ times differentiable for $n$ large enough.

For the rest of the article, if we refer to truncated-ML estimators (without mentioning whether the estimators are based on families of covariance functions or approximations), we refer to both truncated-ML estimators based on families of covariance functions and those based on approximations. The same applies to the notion of truncated-modified log-likelihood functions based on either covariance functions or approximations. However, if $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies the assumptions of Proposition 5.1, a sequence of truncated-ML estimators $\big(\hat{\theta}_{n}(c)\big)_{n\in\mathbb{N}_{+}}$ shall simply be called a sequence of ML estimators for $\theta_{0}$. Similarly, we will simply refer to a modified log-likelihood function when the given family $\{c_{\theta}\colon\theta\in\Theta\}$ is under the assumptions of Proposition 5.1.

Remark 5.1.

The introduction of truncated-modified log-likelihood functions is not standard. Modified refers to the fact that the log-likelihood for the Gaussian density function of a random vector $(Z_{s_{1}},\dotsc,Z_{s_{n}})$ is scaled by $-2/n$. This is common practice in the literature on ML estimators for covariance parameters under an increasing-domain asymptotic framework (see for instance [4], [5] and also [7]). The matrices $\Sigma_{n,\theta}(\omega)$ and $\widetilde{\Sigma}_{n,\theta}(\omega)$ are not necessarily positive-definite. In particular, $\widetilde{\Sigma}_{n,\theta}(\omega)$ can be negative-definite. If the matrices $\Sigma_{n,\theta}(\omega)$ and $\widetilde{\Sigma}_{n,\theta}(\omega)$ are not positive-definite, we truncate the log-likelihood by a pseudo-determinant and pseudoinverse to obtain the functions $\theta\mapsto l_{n}(\theta)(\omega)$ and $\theta\mapsto\tilde{l}_{n}(\theta)(\omega)$. Hence the use of the expression "truncated".

Remark 5.2.

As was mentioned in Remark 2.2 of [5], for given $\omega\in\Omega$, we allow the functions $\theta\mapsto l_{n}(\theta)(\omega)$ and $\theta\mapsto\tilde{l}_{n}(\theta)(\omega)$ to have more than one minimizer, in which case the asymptotic results given in Section 5.1 hold true for any given sequence of truncated-ML estimators. With regard to the existence of a minimizer, we refer to Remark 2.1 in [4].

5.1 Consistency and asymptotic normality of truncated-ML estimators

The main results of this section are that under suitable conditions on the families of covariance functions and approximations, truncated-ML estimators for covariance parameters are not only consistent (Theorem 5.2 and Corollary 5.3) but also asymptotically normal (Theorem 5.4 and Corollary 5.5). In particular, we will make use of the conditions presented in Assumptions 3.1 and 3.2. However, in the context of random fields that are observed at randomly perturbed regular grid locations as defined in (2), we will further make use of the following two technical conditions that were also imposed in [4]. Associated with the common range $\mathcal{Q}$ of the process $X$, we define the set $D_{\tau}\coloneqq\bigcup_{z\in\mathbb{Z}^{d}\setminus\{0\}}(z+\tau U_{\mathcal{Q}})$, where $U_{\mathcal{Q}}\coloneqq\{u_{1}-u_{2}\colon u_{1}\in\mathcal{Q},u_{2}\in\mathcal{Q}\}$ denotes the set of differences between two points in $\mathcal{Q}$.

Assumption 5.1 (Asymptotic identifiability around $\theta_{0}$).

For $\tau=0$, there does not exist $\theta\neq\theta_{0}$ such that $c_{\theta}(z)=c_{\theta_{0}}(z)$ for all $z\in\mathbb{Z}^{d}$. If $\tau\neq 0$, there does not exist $\theta\neq\theta_{0}$ such that $s\mapsto c_{\theta}(s)-c_{\theta_{0}}(s)$ is zero a.e. with respect to the Lebesgue measure on $D_{\tau}$ and $c_{\theta}(0)=c_{\theta_{0}}(0)$.

Assumption 5.2 (Local identifiability around $\theta_{0}$).

For $\tau=0$, there does not exist $\alpha=(\alpha_{1},\dotsc,\alpha_{p})\in\mathbb{R}^{p}\setminus\{0\}$ such that $\sum_{k=1}^{p}\alpha_{k}\frac{\partial c_{\theta_{0}}}{\partial\theta_{k}}(z)=0$ for all $z\in\mathbb{Z}^{d}$. For $\tau\neq 0$, there does not exist $\alpha=(\alpha_{1},\dotsc,\alpha_{p})\in\mathbb{R}^{p}\setminus\{0\}$ such that $s\mapsto\sum_{k=1}^{p}\alpha_{k}\frac{\partial c_{\theta_{0}}}{\partial\theta_{k}}(s)$ is zero a.e. with respect to the Lebesgue measure on $D_{\tau}$ and $\sum_{k=1}^{p}\alpha_{k}\frac{\partial c_{\theta_{0}}}{\partial\theta_{k}}(0)=0$.

Theorem 5.2.

Let $\big(\hat{\theta}_{n}(\tilde{c})\big)_{n\in\mathbb{N}_{+}}$ be a sequence of truncated-ML estimators for $\theta_{0}$ based on $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$. Assume that $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies Assumption 3.1 (regarding (2), $q=1$ and the continuity of first order partial derivatives is sufficient) and Assumption 5.1. Suppose further that $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ satisfies Assumption 3.2 (regarding (4) and (5), $q=1$ and the continuity of first order partial derivatives is sufficient). Then, we have that

\hat{\theta}_{n}(\tilde{c})\xrightarrow[n\to\infty]{\mathbb{P}}\theta_{0}.

The following corollary is immediate.

Corollary 5.3.

Suppose that $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies Assumption 3.1 (regarding (2), $q=1$ and the continuity of first order partial derivatives is sufficient) and Assumption 5.1. Then, we can conclude that a sequence of ML estimators $\big(\hat{\theta}_{n}(c)\big)_{n\in\mathbb{N}_{+}}$ for $\theta_{0}$ is consistent.

Before we present the results about asymptotic normality, it is helpful to introduce some additional notation. Let $K\in\mathbb{N}_{+}$ be such that for any $\omega\in\Omega$, the sequences of functions

\big(l_{n,K}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}}\coloneqq\big(l_{n+K-1}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}} (8)

and

\big(\tilde{l}_{n,K}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}}\coloneqq\big(\tilde{l}_{n+K-1}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}} (9)

are differentiable with respect to $\theta$. Note that if $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies Assumption 3.1 and the collection $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ satisfies Assumption 3.2, then the existence of such a $K$ follows from Proposition 5.1. For the given $K\in\mathbb{N}_{+}$, on $(\Omega,\mathcal{F},\mathbb{P})$, we introduce the sequence of random functions

\big\{\big(\omega\mapsto G_{n,K}(\omega,\theta)\eqqcolon G_{n,K}(\theta)\big)_{n\in\mathbb{N}_{+}}\colon\theta\in\Theta\big\},

where, for $n\in\mathbb{N}_{+}$ and $\theta\in\Theta$, the random vector $G_{n,K}(\theta)$ has components $G_{j,n,K}(\theta)$, $j=1,\dotsc,p$, with

G_{j,n,K}(\theta)=\frac{\partial l_{n,K}}{\partial\theta_{j}}(\theta)-\mathbb{E}\bigg[\frac{\partial l_{n,K}}{\partial\theta_{j}}(\theta)\;\bigg|\;S_{(n)}\bigg],

and thus

G_{n,K}(\theta)=\nabla l_{n,K}(\theta)-\mathbb{E}\big[\nabla l_{n,K}(\theta)\;\big|\;S_{(n)}\big]. (10)

Similarly, on $(\Omega,\mathcal{F},\mathbb{P})$, we introduce the sequence of random functions

\big\{\big(\omega\mapsto\widetilde{G}_{n,K}(\omega,\theta)\eqqcolon\widetilde{G}_{n,K}(\theta)\big)_{n\in\mathbb{N}_{+}}\colon\theta\in\Theta\big\},

where, for any $n\in\mathbb{N}_{+}$ and $\theta\in\Theta$, the components of $\widetilde{G}_{n,K}(\theta)$ are given by

\widetilde{G}_{j,n,K}(\theta)=\frac{\partial\tilde{l}_{n,K}}{\partial\theta_{j}}(\theta)-\mathbb{E}\bigg[\frac{\partial\tilde{l}_{n,K}}{\partial\theta_{j}}(\theta)\;\bigg|\;S_{(n)}\bigg],\quad j=1,\dotsc,p,

and thus

\widetilde{G}_{n,K}(\theta)=\nabla\tilde{l}_{n,K}(\theta)-\mathbb{E}\big[\nabla\tilde{l}_{n,K}(\theta)\;\big|\;S_{(n)}\big]. (11)

If the collection $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies Assumption 3.1, we simply write, for any $n\in\mathbb{N}_{+}$,

J_{G_{n}}(\theta_{0})\coloneqq J_{G_{n,1}}(\theta_{0}),

for the random Jacobi-matrix of $\theta\mapsto G_{n,1}(\theta)$ evaluated at $\theta_{0}$.

Theorem 5.4.

Let $\big(\hat{\theta}_{n}(\tilde{c})\big)_{n\in\mathbb{N}_{+}}$ be a sequence of truncated-ML estimators for $\theta_{0}$ based on $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$. Suppose that $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies Assumptions 3.1, 5.1 and 5.2. Suppose further that $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ satisfies Assumption 3.2. Then, we have that

n^{1/2}\big(\hat{\theta}_{n}(\tilde{c})-\theta_{0}\big)\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}), (12)

where $\Lambda\in\mathrm{S}_{p\times p}(\mathbb{R})$, $\Lambda\succ 0$, is deterministic and such that

\frac{1}{2}J_{G_{n}}(\theta_{0})\xrightarrow[n\to\infty]{\mathbb{P}}\Lambda\xleftarrow[n\to\infty]{\mathbb{P}}\frac{1}{2}J_{\widetilde{G}_{n,N}}(\theta_{0}),

with $N\in\mathbb{N}_{+}$ as in Proposition 5.1.

Corollary 5.5.

Suppose that $\{c_{\theta}\colon\theta\in\Theta\}$ satisfies Assumptions 3.1, 5.1 and 5.2. Then, we can conclude that a sequence of ML estimators $\big(\hat{\theta}_{n}(c)\big)_{n\in\mathbb{N}_{+}}$ for $\theta_{0}$ is such that

n^{1/2}\big(\hat{\theta}_{n}(c)-\theta_{0}\big)\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),

with $\Lambda$ as in Theorem 5.4.

Remark 5.3.

Under Assumption 3.1, for any $K\in\mathbb{N}_{+}$, $\mathbb{E}\left[\nabla l_{n,K}(\theta)\;|\;S_{(n)}\right]=0$ with $\mathbb{P}$ probability one. However, even if $\{c_{\theta}\colon\theta\in\Theta\}$ is under Assumption 3.1 and $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ is under Assumption 3.2, it is in general not true that $\mathbb{P}$ a.s. $\mathbb{E}\big[\nabla\tilde{l}_{n,N}(\theta)\;|\;S_{(n)}\big]=0$, where $N$ is as in Proposition 5.1. Notice further that under Assumption 3.1, for $\omega\in\Omega$, $-(n/2)J_{G_{n}}(\theta_{0})(\omega)$ represents the second derivative of the log-likelihood $\theta\mapsto-(n/2)l_{n}(\theta)(\omega)$ based on $\{c_{\theta}\colon\theta\in\Theta\}$.

6   Example of application: Generalized Wendland functions

In this section we work in the same setting as in Section 2.2, but we additionally assume that $Z$ is isotropic. Explicitly, for the given family of covariance functions $\{c_{\theta}\colon\theta\in\Theta\}$, we assume that there exists a parametric family $\{\varphi_{\theta}\colon\theta\in\Theta\}$ such that for any $\theta\in\Theta$ and $s\in\mathbb{R}^{d}$, $c_{\theta}(s)=\varphi_{\theta}(\lVert s\rVert)$. The family $\{\varphi_{\theta}\colon\theta\in\Theta\}$ is called the radial version of $\{c_{\theta}\colon\theta\in\Theta\}$. We can recycle the notation of Section 3 and easily translate Assumptions 3.1 and 3.2 by considering families of approximations $\big\{(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big\}$ for $\{\varphi_{\theta}\colon\theta\in\Theta\}$ on $\mathbb{R}_{+}$. This allows us to readily recover the results of Sections 4 and 5 for isotropic random fields. For the details we refer to Assumptions A.1 and A.2, as well as Theorems A.1 and A.2 in Appendix A.

In terms of an explicit family of radial covariance functions, we reconsider the generalized Wendland covariance function introduced in (1) of Section 1.3. Let $\theta\coloneqq(\sigma^{2},\beta)\in\Theta$, where $\Theta\coloneqq[\sigma^{2}_{\min},\sigma^{2}_{\max}]\times[\beta_{\min},\beta_{\max}]$, with $0<\sigma^{2}_{\min}<\sigma^{2}_{\max}<\infty$ and $1-2\tau<\beta_{\min}<\beta_{\max}<\infty$. We assume that the covariance function of the random field $Z$ is given by $\phi_{\theta_{0}}(\lVert s\rVert)$, $s\in\mathbb{R}^{d}$, $\theta_{0}\in\Theta$, where $\phi_{\theta_{0}}$ belongs to the family $\{\phi_{\theta}\colon\theta\in\Theta\}$ defined by

\phi_{\theta}(t)\coloneqq\sigma^{2}\phi_{\nu,\kappa}\bigg(\frac{t}{\beta}\bigg),\quad t\in[0,\infty), (13)

where

\phi_{\nu,\kappa}(r)\coloneqq\begin{cases}\frac{1}{\operatorname{B}(2\kappa,\nu+1)}\int_{r}^{1}u(u^{2}-r^{2})^{\kappa-1}(1-u)^{\nu}\,du,&r\in[0,1),\\ 0,&r\in[1,\infty),\end{cases}

compare to (1) of Section 1.3. We treat $\kappa$ and $\nu$ as given, but such that $\kappa>0$ and $\nu\geq(d+1)/2+\kappa$. Notice that the latter restriction on $\kappa$ and $\nu$ ensures that for any $\theta\in\Theta$, $\phi_{\theta}$ belongs to the class $\Phi_{d}$: the class of real valued and continuous functions, defined on $\mathbb{R}_{+}$, which are strictly positive at the origin and such that, for any finite collection of points in $\mathbb{R}^{d}$, evaluation at the Euclidean norms of pairwise differences between points of the collection results in a non-negative definite matrix (see for example [21]). Actually, in the latter reference it is argued that for $\kappa>0$, $\phi_{\nu,\kappa}\in\Phi_{d}$ if and only if $\nu\geq(d+1)/2+\kappa$. For the respective family defined on $\mathbb{R}^{d}$, we use the notation $w_{\theta}(s)\coloneqq\phi_{\theta}(\lVert s\rVert)$.

Remark 6.1.

The restriction $\beta_{\min}>1-2\tau$ is imposed to prove that the family $\{w_{\theta}\colon\theta\in\Theta\}$ satisfies Assumptions 5.1 and 5.2 (see the proof of Proposition 6.2). This is not surprising, as $1-2\tau$ defines the minimal spacing between pairs of distinct observation points of the randomly perturbed regular grid, defined in (2) of Section 2.2. Further, since $\phi_{\nu,\kappa}\in\Phi_{d}$ if and only if $\nu\geq(d+1)/2+\kappa$, the two smoothness parameters $\nu$ and $\kappa$ cannot be estimated without further constraints.

Proposition 6.1.

Let $\kappa>4$. Then, the family $\{\phi_{\theta}\colon\theta\in\Theta\}$ satisfies Assumption A.1, where for any $\theta\in\Theta$ and any $q=1,2,3$, $i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}$, the functions $t\mapsto\phi_{\theta}(t)$ and $t\mapsto\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(t)$ are continuous on $\mathbb{R}_{+}$.

Proposition 6.2.

Let $\kappa>2$. Then, the family $\{w_{\theta}\colon\theta\in\Theta\}$ satisfies Assumptions 5.1 and 5.2.

Using Propositions 6.1 and 6.2, under application of Theorems A.1 and A.2 (recall also Corollaries 5.3 and 5.5), we obtain the following result:

Proposition 6.3.

Let $\kappa>4$. A sequence $\big(\hat{\theta}_{n}(\phi)\big)_{n\in\mathbb{N}_{+}}$ of ML estimators for $\theta_{0}$ based on $\{\phi_{\theta}\colon\theta\in\Theta\}$ is consistent. Further, there exists a non-random symmetric $p\times p$ matrix $\Lambda\succ 0$ such that

n^{1/2}\big(\hat{\theta}_{n}(\phi)-\theta_{0}\big)\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}).
Remark 6.2.

It is worth noting that the restriction $\kappa>4$ is only needed for the asymptotic distribution of ML estimators, respectively truncated-ML estimators. In particular, in Proposition 6.1, if one only demands conditions involving first order partial derivatives of $\phi_{\theta}$ with respect to $\theta$, $\kappa>2$ is sufficient. With regard to consistency of the estimator $\big(\hat{\theta}_{n}(\phi)\big)_{n\in\mathbb{N}_{+}}$ in Proposition 6.3, $\kappa>2$ is sufficient as well. The same applies to the truncated-ML estimators considered in Examples 6.1, 6.2, 6.3 and 6.4. Keeping in mind the differentiability conditions imposed in Assumption A.1, the given restrictions on $\kappa$ are not surprising (compare also to [9], within the infill-domain asymptotic framework).

We discuss four examples of generalized Wendland approximations.

Example 6.1 (Truncation of $\phi_{\theta}$).

Let $\{\phi_{\theta}\colon\theta\in\Theta\}$ be as in Proposition 6.1. Let $\big\{(\mathfrak{T}_{m,\theta})\colon\theta\in\Theta\big\}$ be defined as follows: For $\theta\in\Theta$ and $m\in\mathbb{N}_{+}$, we set

\mathfrak{T}_{m,\theta}(t)\coloneqq\phi_{\theta}(t)\mathbbm{1}_{[0,C_{m}]}(t),\quad t\in\mathbb{R}_{+},\quad\text{where }C_{m}\to\infty\text{ as }m\to\infty.
Proposition 6.4.

A sequence (θ^n(𝔗))n+\big{(}\hat{\theta}_{n}(\mathfrak{T})\big{)}{}_{n\in\mathbb{N}_{+}} of truncated-ML estimators for θ0\theta_{0} based on {(𝔗m,θ):θΘ}\big{\{}(\mathfrak{T}_{m,\theta})\colon\theta\in\Theta\big{\}} is consistent and we have that

n1/2(θ^n(𝔗)θ0)nd𝒩(0,Λ1),n^{1/2}\big{(}\hat{\theta}_{n}(\mathfrak{T})-\theta_{0}\big{)}\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),

where Λ\Lambda is defined as in Proposition 6.3.

In the following, we let M<M<\infty denote a real constant, independent of β[βmin,βmax]\beta\in[\beta_{\min},\beta_{\max}], such that βmaxM\beta_{\max}\leq M.

Example 6.2 (Trimmed Bernstein polynomials).

Let {ϕθ:θΘ}\{\phi_{\theta}\colon\theta\in\Theta\} be as in Proposition 6.1. We consider a family {(𝔓m,θ):θΘ}\big{\{}(\mathfrak{P}_{m,\theta})\colon\theta\in\Theta\big{\}} defined as follows: For θΘ\theta\in\Theta and m+m\in\mathbb{N}_{+}, we set for t+t\in\mathbb{R}_{+},

𝔓m,θ(t){Bm,θ(t;bm),tM,0,t>M,\displaystyle{\mathfrak{P}_{m,\theta}(t)}\coloneqq\begin{cases}B_{m,\theta}(t;b_{m}),&t\leq M,\\ 0,&t>M,\end{cases}

with

Bm,θ(t;bm)=k=0mϕθ(bmkm)(mk)(tbm)k(1tbm)mk,\displaystyle B_{m,\theta}(t;b_{m})=\sum_{k=0}^{m}\phi_{\theta}\bigg{(}b_{m}\frac{k}{m}\bigg{)}\binom{m}{k}\bigg{(}\frac{t}{b_{m}}\bigg{)}^{k}\bigg{(}1-\frac{t}{b_{m}}\bigg{)}^{m-k},

the Bernstein polynomial of the function ϕθ\phi_{\theta} on [0,bm)[0,b_{m}), where bmmb_{m}\xrightarrow[]{m\to\infty}\infty and we assume that bm=o(m)b_{m}=o(m). Thus, for any 0km0\leq k\leq m,

bmk+1mbmkmm0,\displaystyle b_{m}\frac{k+1}{m}-b_{m}\frac{k}{m}\xrightarrow[]{m\to\infty}0,

that is, the distance between adjacent interpolation nodes converges to zero as mm approaches infinity. See also [12] for an introduction to Bernstein polynomials on unbounded intervals.
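To make the construction concrete, the following minimal Python sketch evaluates the trimmed Bernstein approximation; phi stands in for ϕθ and is assumed vectorized, the parameters are hypothetical, we assume bm ≥ M so that t/bm ∈ [0,1] on [0,M], and numerical stability is only ensured for moderate m:

```python
import numpy as np
from math import comb

def trimmed_bernstein(phi, m, b_m, M):
    """P_{m,theta} of Example 6.2: the Bernstein polynomial of phi on
    [0, b_m), set to zero beyond the cut-off M (assumes b_m >= M)."""
    nodes = phi(b_m * np.arange(m + 1) / m)          # phi(b_m k / m)
    binom = np.array([comb(m, k) for k in range(m + 1)], dtype=float)

    def P(t):
        t = np.asarray(t, dtype=float)
        u = np.clip(t / b_m, 0.0, 1.0)               # t / b_m in [0, 1]
        k = np.arange(m + 1)
        basis = binom * u[..., None] ** k * (1.0 - u[..., None]) ** (m - k)
        return np.where(t <= M, basis @ nodes, 0.0)
    return P
```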

Proposition 6.5.

The family {(𝔓m,θ):θΘ}\big{\{}(\mathfrak{P}_{m,\theta})\colon\theta\in\Theta\big{\}} satisfies Assumption A.2.

Using Propositions 6.1, 6.2 and 6.5, under application of Theorems A.1 and A.2, we have proven the following result:

Proposition 6.6.

A sequence (θ^n(𝔓))n+\big{(}\hat{\theta}_{n}(\mathfrak{P})\big{)}{}_{n\in\mathbb{N}_{+}} of truncated-ML estimators for θ0\theta_{0} based on {(𝔓m,θ):θΘ}\big{\{}(\mathfrak{P}_{m,\theta})\colon\theta\in\Theta\big{\}} is consistent and we have that

n1/2(θ^n(𝔓)θ0)nd𝒩(0,Λ1),n^{1/2}\big{(}\hat{\theta}_{n}(\mathfrak{P})-\theta_{0}\big{)}\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),

where Λ\Lambda is defined as in Proposition 6.3.

Example 6.3 (Linear interpolation).

Let {ϕθ:θΘ}\{\phi_{\theta}\colon\theta\in\Theta\} be as in Proposition 6.1. For a given m+m\in\mathbb{N}_{+}, we consider a partition of the interval [0,M][0,M], 0=t0m<t1m<<tNmm=M0=t_{0}^{m}<t_{1}^{m}<\dotsc<t_{N_{m}}^{m}=M, where NmmN_{m}\xrightarrow[]{m\to\infty}\infty and for 0kNm10\leq k\leq N_{m}-1, tk+1mtkmm0t_{k+1}^{m}-t_{k}^{m}\xrightarrow[]{m\to\infty}0. Then, we define the family {(𝔏m,θ):θΘ}\big{\{}(\mathfrak{L}_{m,\theta})\colon\theta\in\Theta\big{\}} as follows: For θΘ\theta\in\Theta and m+m\in\mathbb{N}_{+}, we set for t+t\in\mathbb{R}_{+},

𝔏m,θ(t){Im,θ(t;Nm),tM,0,t>M,\displaystyle{\mathfrak{L}_{m,\theta}(t)}\coloneqq\begin{cases}I_{m,\theta}(t;N_{m}),&t\leq M,\\ 0,&t>M,\end{cases}

where

Im,θ(t;Nm)=ϕθ(tkm)+ϕθ(tk+1m)ϕθ(tkm)tk+1mtkm(ttkm),t[tkm,tk+1m], 0kNm1.\displaystyle I_{m,\theta}(t;N_{m})=\phi_{\theta}(t_{k}^{m})+\frac{\phi_{\theta}(t_{k+1}^{m})-\phi_{\theta}(t_{k}^{m})}{t_{k+1}^{m}-t_{k}^{m}}(t-t_{k}^{m}),\quad t\in[t_{k}^{m},t_{k+1}^{m}],\;0\leq k\leq N_{m}-1.

Thus, for a given m+m\in\mathbb{N}_{+}, 𝔏m,θ\mathfrak{L}_{m,\theta} represents a linear interpolation of the function ϕθ\phi_{\theta} on the interval [0,M][0,M].
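A minimal Python sketch of this approximation, using a uniform partition of [0,M] as a special case of the above (phi again stands in for ϕθ and is hypothetical), reads:

```python
import numpy as np

def linear_interp_approx(phi, N_m, M):
    """L_{m,theta} of Example 6.3: piecewise-linear interpolation of phi on
    the uniform grid 0 = t_0 < ... < t_{N_m} = M, set to zero beyond M."""
    grid = np.linspace(0.0, M, N_m + 1)
    vals = phi(grid)

    def L(t):
        t = np.asarray(t, dtype=float)
        # Between grid nodes, np.interp is exactly the two-point formula
        # of the example; values beyond M are mapped to zero.
        return np.where(t <= M, np.interp(t, grid, vals), 0.0)
    return L
```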

Proposition 6.7.

The family {(𝔏m,θ):θΘ}\big{\{}(\mathfrak{L}_{m,\theta})\colon\theta\in\Theta\big{\}} satisfies Assumption A.2.

Using Propositions 6.1, 6.2 and 6.7, under application of Theorems A.1 and A.2, we have further proven the following result:

Proposition 6.8.

A sequence (θ^n(𝔏))n+\big{(}\hat{\theta}_{n}(\mathfrak{L})\big{)}{}_{n\in\mathbb{N}_{+}} of truncated-ML estimators for θ0\theta_{0} based on {(𝔏m,θ):θΘ}\big{\{}(\mathfrak{L}_{m,\theta})\colon\theta\in\Theta\big{\}} is consistent and we have that

n1/2(θ^n(𝔏)θ0)nd𝒩(0,Λ1),n^{1/2}\big{(}\hat{\theta}_{n}(\mathfrak{L})-\theta_{0}\big{)}\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),

where Λ\Lambda is defined as in Proposition 6.3.

Example 6.4 (Vanishing nugget effect).

Let {ϕθ:θΘ}\{\phi_{\theta}\colon\theta\in\Theta\} be as in Proposition 6.1 and consider a family {(ϕ~m,θ):θΘ}\big{\{}\big{(}\tilde{\phi}_{m,\theta}\big{)}\colon\theta\in\Theta\big{\}} that satisfies Assumption A.2. Then, define for any θΘ\theta\in\Theta and m+m\in\mathbb{N}_{+}, the function

𝔖m,θ(t){ϕ~m,θ(t)+δ(m),t=0,ϕ~m,θ(t),t0,{\mathfrak{S}_{m,\theta}(t)}\coloneqq\begin{cases}\tilde{\phi}_{m,\theta}(t)+\delta(m),&t=0,\\ \tilde{\phi}_{m,\theta}(t),&t\neq 0,\end{cases} (14)

where (δ(m))m+(\delta(m)){}_{m\in\mathbb{N}_{+}} is independent of θΘ\theta\in\Theta and t+t\in\mathbb{R}_{+}, and satisfies δ(m)0\delta(m)\xrightarrow{}0 as mm\xrightarrow{}\infty. Note that since the family {ϕθ:θΘ}\{\phi_{\theta}\colon\theta\in\Theta\} itself satisfies Assumption A.2, we could also choose ϕ~m,θϕθ\tilde{\phi}_{m,\theta}\equiv\phi_{\theta} in (14).
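A minimal Python sketch of (14) follows; phi_tilde stands in for any approximation φ̃m,θ (possibly ϕθ itself), and the concrete nugget sequence, for example δ(m) = 1/m, is an assumption of the illustration:

```python
import numpy as np

def add_vanishing_nugget(phi_tilde, delta_m):
    """S_{m,theta} of (14): a nugget delta(m) added at t = 0 only, where
    delta(m) -> 0 as m -> infinity (e.g. delta_m = 1.0 / m)."""
    def S(t):
        t = np.asarray(t, dtype=float)
        return phi_tilde(t) + np.where(t == 0.0, delta_m, 0.0)
    return S
```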

Proposition 6.9.

A sequence (θ^n(𝔖))n+\big{(}\hat{\theta}_{n}\big{(}\mathfrak{S}\big{)}\big{)}{}_{n\in\mathbb{N}_{+}} of truncated-ML estimators for θ0\theta_{0} based on {(𝔖m,θ):θΘ}\big{\{}\big{(}\mathfrak{S}_{m,\theta}\big{)}\colon\theta\in\Theta\big{\}} is consistent and we have that

n1/2(θ^n(𝔖)θ0)nd𝒩(0,Λ1),n^{1/2}\big{(}\hat{\theta}_{n}(\mathfrak{S})-\theta_{0}\big{)}\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),

where Λ\Lambda is defined as in Proposition 6.3.

Remark 6.3.

As already mentioned in the introduction, computing (13) is costly. However, if κ\kappa is a positive integer, closed-form solutions of (13) exist. More specifically, if κ=k+\kappa=k\in\mathbb{N}_{+}, then

ϕν,k(r)=Aν+k(r)Pk(r),\displaystyle\phi_{\nu,k}(r)=A_{\nu+k}(r)P_{k}(r),

where PkP_{k} is a polynomial of order kk and Aν+kA_{\nu+k} the Askey function ([3]) of order ν+k\nu+k,

Aν+k(r)={(1r)ν+k,0r<1,0,r1.\displaystyle A_{\nu+k}(r)=\begin{cases}(1-r)^{\nu+k},&0\leq r<1,\\ 0,&r\geq 1.\end{cases}

In addition, if κ(+1/2)\kappa\in(\mathbb{N}_{+}-1/2) is a positive half-integer, it is shown in [28] that further closed-form solutions of (13) exist, involving polynomial, logarithmic and square-root terms. Thus, in the specific example of generalized Wendland covariance functions, covariance approximations facilitate computing (13) when κ+(+1/2)\kappa\notin\mathbb{N}_{+}\cup(\mathbb{N}_{+}-1/2).
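For integer κ the closed form above is cheap to evaluate. The Python sketch below implements the Askey function; the polynomial factor Pk is family-specific and must be taken from the closed-form solutions of (13), so the factor used here is a placeholder for illustration only, not the actual P1:

```python
import numpy as np

def askey(r, mu):
    """Askey function A_mu(r) = (1 - r)_+^mu."""
    r = np.asarray(r, dtype=float)
    return np.clip(1.0 - r, 0.0, None) ** mu

# Hypothetical assembly for kappa = k = 1; the polynomial below is a
# placeholder and NOT the closed-form P_1 of the generalized Wendland family.
nu = 4.5
P_1 = lambda r: 1.0 + (nu + 1.0) * np.asarray(r)     # assumed form
phi_closed = lambda r: askey(r, nu + 1.0) * P_1(r)
print(phi_closed(np.array([0.0, 0.5, 1.2])))         # zero beyond r = 1
```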

7   Covariance taper approximations: Beyond compactly supported covariance functions

Asymptotic properties of (regular) tapered-ML estimators were addressed in both the infill- and increasing-domain asymptotic frameworks (see [23], [14], [30] and [16]). The direct functional approximation approach studied here can be combined with covariance tapering. Given observations of SS, it is known that under weak assumptions on the presumed covariance function, ML estimators based on tapered covariance functions (tapered-ML estimators) preserve consistency (see [16], in particular Corollary 2 in the increasing-domain framework). However, this is the case for covariance tapers whose compact support is not fixed, but rather grows to cover the entire d\mathbb{R}^{d} as the number of observations from SS increases. Within an increasing-domain asymptotic framework, given a fixed compact support of the covariance taper, one can in general not expect tapered-ML estimators to be consistent. Still, under suitable conditions, tapered-ML estimators asymptotically minimize the Kullback-Leibler divergence (see for instance Theorem 3.3 in [5]). Given the theory developed here, we can readily recover the same result for truncated-tapered ML estimators, that is, ML estimators based on tapered covariance functions, where the covariance taper is replaced with a functional approximation of it. To be more formal, let us remain in the setting of Section 2, but assume that ZZ has true and unknown covariance function kθ0k_{\theta_{0}}, θ0Θ\theta_{0}\in\Theta, which belongs to a family {kθ:θΘ}\{k_{\theta}\colon\theta\in\Theta\} that satisfies:

  • For any sds\in\mathbb{R}^{d}, θkθ(s)\theta\mapsto k_{\theta}(s) is continuously differentiable;

  • There exist constants A<A<\infty and α>0\alpha>0 such that for all i=1,,pi=1,\dotsc,p, for all sds\in\mathbb{R}^{d} and for all θΘ\theta\in\Theta, |kθ(s)|A/(1+sd+α)\lvert k_{\theta}(s)\rvert\leq A/(1+\lVert s\rVert^{d+\alpha}) and |kθθi(s)|A/(1+sd+α)\big{\lvert}\frac{\partial k_{\theta}}{\partial\theta_{i}}(s)\big{\rvert}\leq A/(1+\lVert s\rVert^{d+\alpha});

  • {kθ:θΘ}\{k_{\theta}\colon\theta\in\Theta\} satisfies (3) of Assumption 3.1.

The given assumptions are very weak and are satisfied, for instance, by the Matérn family (see also Condition 2.1 in [4] or Remark 3.1). Then, we consider a fixed covariance taper stθ0(s)s\mapsto t_{\theta^{\prime}_{0}}(s), θ0Θ\theta^{\prime}_{0}\in\Theta^{\prime}, Θl\Theta^{\prime}\subset\mathbb{R}^{l}, compact and convex. We assume that tθ0t_{\theta^{\prime}_{0}} belongs to a family of tapers {tθ:θΘ}\{t_{\theta^{\prime}}\colon\theta^{\prime}\in\Theta^{\prime}\} that satisfies Assumption 3.1 (regarding (2), q=1q=1 and the continuity of first order partial derivatives is sufficient). As we have seen in Section 6 (Proposition 6.1), we may choose, with θ0=(β0,1)\theta^{\prime}_{0}=(\beta_{0},1), κ>2\kappa>2, ν(d+1)/2+κ\nu\geq(d+1)/2+\kappa, a generalized Wendland taper (see also Remark 6.2). In the given context it is more convenient to write tβ0tθ0t_{\beta_{0}}\coloneqq t_{\theta^{\prime}_{0}}, where β0\beta_{0} is the taper range, that is, tβ0(s)=0t_{\beta_{0}}(s)=0 for sβ0\lVert s\rVert\geq\beta_{0}. Based on a finite collection S(n)S_{(n)} of SS, on (Ω,,)(\Omega,\mathcal{F},\mathbb{P}), we then define the tapered n×nn\times n covariance matrix [Rn,θ]i,jkθ(SiSj)tβ0(SiSj)[R_{n,\theta}]_{i,j}\coloneqq k_{\theta}(S_{i}-S_{j})t_{\beta_{0}}(S_{i}-S_{j}), 1i,jn1\leq i,j\leq n. Additionally, we consider a covariance matrix approximation

[R~n,θ]i,j=kθ(SiSj)t~r(n),θ0(SiSj),1i,jn,r(n) as n,\displaystyle[\widetilde{R}_{n,\theta}]_{i,j}=k_{\theta}(S_{i}-S_{j})\tilde{t}_{r(n),\theta^{\prime}_{0}}(S_{i}-S_{j}),\quad 1\leq i,j\leq n,\;\text{$r(n)\xrightarrow[]{}\infty$ as $n\xrightarrow[]{}\infty$,}

of Rn,θR_{n,\theta}, where (t~m,θ0)(\tilde{t}_{m,\theta^{\prime}_{0}}) is a sequence of functions that belongs to a family of taper approximations {(t~m,θ):θΘ}\{(\tilde{t}_{m,\theta^{\prime}})\colon\theta^{\prime}\in\Theta^{\prime}\}, for which Assumption 3.2 applies. Again, we write t~m,θ0t~m,β0\tilde{t}_{m,\theta^{\prime}_{0}}\coloneqq\tilde{t}_{m,\beta_{0}}, m+m\in\mathbb{N}_{+}, to highlight the fixed range parameter. We note that the results of Lemma 4.1 and Proposition 5.1 remain true with Σn,θ\Sigma_{n,\theta} and Σ~n,θ\widetilde{\Sigma}_{n,\theta} replaced with Rn,θR_{n,\theta} and R~n,θ\widetilde{R}_{n,\theta}, respectively. We know (see Remark 2.1) that the conditional distribution of Z(n)Z_{(n)} given S(n)S_{(n)} is given by the random variable ω𝒩(0,Kn,θ0(ω))\omega\mapsto\mathcal{N}(0,K_{n,\theta_{0}}(\omega)). On the other hand, we can assume a misspecified distribution ω𝒩(0,Rn,θ(ω))\omega\mapsto\mathcal{N}(0,R_{n,\theta}(\omega)), where the true covariance matrix is replaced with the tapered covariance matrix Rn,θ(ω)R_{n,\theta}(\omega), θΘ\theta\in\Theta. Then, we define the scaled (see [5]) conditional Kullback-Leibler divergence of 𝒩(0,Rn,θ)\mathcal{N}(0,R_{n,\theta}) from 𝒩(0,Kn,θ0)\mathcal{N}(0,K_{n,\theta_{0}}),

dn,θ1nlog(det(Rn,θKn,θ01))+1ntr(Kn,θ0Rn,θ1)1.\displaystyle d_{n,\theta}\coloneqq\frac{1}{n}\operatorname{log}\big{(}\operatorname{det}(R_{n,\theta}K_{n,\theta_{0}}^{-1})\big{)}+\frac{1}{n}\operatorname{tr}(K_{n,\theta_{0}}R_{n,\theta}^{-1})-1.

The distribution 𝒩(0,Rn,θ)\mathcal{N}(0,R_{n,\theta}) shall be called a regular taper misspecified distribution. If we choose nNn\geq N (NN as in Proposition 5.1), we can even further misspecify the distribution of Z(n)Z_{(n)} given S(n)S_{(n)} by replacing Rn,θR_{n,\theta} with R~n,θ\widetilde{R}_{n,\theta} in 𝒩(0,Rn,θ)\mathcal{N}(0,R_{n,\theta}). This gives rise to the scaled conditional Kullback-Leibler divergence of 𝒩(0,R~n,θ)\mathcal{N}(0,\widetilde{R}_{n,\theta}) from 𝒩(0,Kn,θ0)\mathcal{N}(0,K_{n,\theta_{0}}),

d~n,θ1nlog(det(R~n,θKn,θ01))+1ntr(Kn,θ0R~n,θ1)1.\displaystyle\tilde{d}_{n,\theta}\coloneqq\frac{1}{n}\operatorname{log}\big{(}\operatorname{det}(\widetilde{R}_{n,\theta}K_{n,\theta_{0}}^{-1})\big{)}+\frac{1}{n}\operatorname{tr}(K_{n,\theta_{0}}\widetilde{R}_{n,\theta}^{-1})-1.
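Both divergences share the same algebraic form and can be evaluated directly from the matrices involved. The following minimal Python sketch computes the scaled conditional Kullback-Leibler divergence for given symmetric positive-definite matrices; passing Rn,θ yields dn,θ and passing R̃n,θ yields d̃n,θ:

```python
import numpy as np

def scaled_kl(K_true, R_misspec):
    """Scaled conditional KL divergence of N(0, R) from N(0, K):
    (1/n) log det(R K^{-1}) + (1/n) tr(K R^{-1}) - 1.
    Both matrices are assumed symmetric positive-definite."""
    n = K_true.shape[0]
    _, logdet_R = np.linalg.slogdet(R_misspec)
    _, logdet_K = np.linalg.slogdet(K_true)
    log_term = (logdet_R - logdet_K) / n          # (1/n) log det(R K^{-1})
    trace_term = np.trace(np.linalg.solve(R_misspec, K_true)) / n
    return log_term + trace_term - 1.0
```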

We use the notation (θ^n(kt))n+\big{(}\hat{\theta}_{n}(kt)\big{)}{}_{n\in\mathbb{N}_{+}} and (θ^n(kt~))n+\big{(}\hat{\theta}_{n}(k\tilde{t})\big{)}{}_{n\in\mathbb{N}_{+}} for ML and truncated-ML estimators for θ0\theta_{0} with respect to {kθtβ0:θΘ}\{k_{\theta}t_{\beta_{0}}\colon\theta\in\Theta\} and {(kθt~m,β0):θΘ}\{(k_{\theta}\tilde{t}_{m,\beta_{0}})\colon\theta\in\Theta\}, respectively. In accordance with the literature about tapered-ML estimators, the estimators (θ^n(kt))n+\big{(}\hat{\theta}_{n}(kt)\big{)}{}_{n\in\mathbb{N}_{+}} and (θ^n(kt~))n+\big{(}\hat{\theta}_{n}(k\tilde{t})\big{)}{}_{n\in\mathbb{N}_{+}} are then further referred to as tapered-ML estimators and truncated-tapered ML estimators, respectively. We can now state the following theorem:

Theorem 7.1.

We have that  a.s.\mathbb{P}\text{ a.s.}

supθΘ|dn,θd~n,θ|n0,\sup_{\theta\in\Theta}\big{\lvert}d_{n,\theta}-\tilde{d}_{n,\theta}\big{\rvert}\xrightarrow[]{n\to\infty}0, (15)

and as nn\xrightarrow[]{}\infty,

dn,θ^n(kt~)=infθΘdn,θ+δn,d_{n,\hat{\theta}_{n}(k\tilde{t})}=\inf_{\theta\in\Theta}d_{n,\theta}+\delta_{n}, (16)

where δnn0\delta_{n}\xrightarrow[n\to\infty]{\mathbb{P}}0.

Therefore, in the given scenario, truncated-tapered ML estimators asymptotically minimize the conditional Kullback-Leibler divergence of taper misspecified distributions from the true distribution (compare also to Theorem 3.3 in [5]). Thus, in terms of Kullback-Leibler divergence, truncated-tapered ML estimators and tapered-ML estimators perform asymptotically equally well.

8   Discussion and outlook

With the introduction of truncated-likelihood functions, we allow for more far-reaching forms of covariance approximations, such as linear interpolations or polynomial approximations. Our approximation approach relates directly to the presumed covariance function. Thus, combinations with existing approximation methods such as low-rank or covariance tapering approaches are readily possible. We studied the quality of truncated-ML estimators from an asymptotic point of view. For compactly supported covariance functions, the conditions imposed in Sections 3 and 5 permit us to obtain truncated-ML estimators that are asymptotically well-behaved. That is, we obtain estimators that are consistent and asymptotically normal. Our proof strategies were strongly influenced by [4]. We have provided a comprehensive analysis for the family of generalized Wendland covariance functions. That is, we give precise conditions on smoothness, variance and range parameters, under which ML estimators for variance and range parameters are consistent and asymptotically normal. To our knowledge, such a result does not yet exist in the literature (compare also to [9], within the infill-domain asymptotic context). Further, we gave four examples of generalized Wendland approximations, for which truncated-ML estimators preserve consistency and asymptotic normality.

We now discuss some open questions. Our results on consistency and asymptotic normality depend on the condition that correlations vanish beyond a certain distance. It would be of interest to recover the consistency and asymptotic normality results for truncated-ML estimators when the assumption of a compact support is dropped. To this end, we recall that the imposed conditions on covariance functions and approximations resulted in the uniform asymptotic equivalence of covariance matrices and approximations. Using this, we established the existence of a positive integer NN, after which covariance matrix approximations remain positive-definite. Expanding to non-compactly supported covariance functions, this result remains unchanged, as long as covariance matrices and approximations are uniformly asymptotically equivalent (uniformly on the parameter and sample space). Thus, in this case, consistency and asymptotic normality can be recovered, even when presumed covariance functions are no longer compactly supported. However, stated as a mere condition, the asymptotic equivalence of covariance matrices and approximations is of little practical use. Thus, the case of non-compactly supported covariance functions deserves further attention.

From a more applied point of view, our results provide a strong theoretical basis for further research. It remains to test and extend the given examples of covariance approximations. The four examples of generalized Wendland approximations and their effect on parameter estimations were discussed from a theoretical point of view. An important next step is to provide numerical implementations and practical comparisons.

In conclusion, for large datasets built upon correlated data, the present work provides an essential missing piece in the area of covariance approximations.

Appendix A Covariance approximations for isotropic random fields

We consider families of approximations {(φ~m,θ):θΘ}\big{\{}(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big{\}} for {φθ:θΘ}\{\varphi_{\theta}\colon\theta\in\Theta\} on +\mathbb{R}_{+} and translate (recycling the notation of Section 3) Assumptions 3.1 and 3.2 as follows:

Assumption A.1 (Regularity conditions on φθ\varphi_{\theta}).
  1. (1)

    There exist real constants CC, L<L<\infty, which are independent of θΘ\theta\in\Theta, such that φθC(+;Sθ)\varphi_{\theta}\in\mathcal{B}_{\text{C}}(\mathbb{R}_{+};S_{\theta}), with Sθ[0,C]S_{\theta}\subset[0,C] and φθL{\left\lVert\varphi_{\theta}\right\rVert}_{\infty}\leq L.

  2. (2)

    For any t+t\in\mathbb{R}_{+}, the first, second and third order partial derivatives of θφθ(t)\theta\mapsto\varphi_{\theta}(t) exist. In addition, for any q=1,2,3q=1,2,3, i1,,iq{1,,p}i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, qφθθi1θiqC(+;Sθ(i1,,iq))\frac{\partial^{q}\varphi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\in\mathcal{B}_{\text{C}}(\mathbb{R}_{+};S_{\theta}(i_{1},\dotsc,i_{q})), where there exist constants CL<C^{\prime}\text{, }L^{\prime}<\infty, which are independent of θΘ\theta\in\Theta, such that Sθ(i1,,iq)[0,C]S_{\theta}(i_{1},\dotsc,i_{q})\subset[0,C^{\prime}] and qφθθi1θiqL\big{\lVert}\frac{\partial^{q}\varphi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\big{\rVert}_{\infty}\leq~{}L^{\prime}.

  3. (3)

    Fourier inversion holds, that is for any θΘ\theta\in\Theta,

    φθ(s)=dc^θ(f)eif,sdf,\displaystyle\varphi_{\theta}(\left\lVert s\right\rVert)=\int_{\mathbb{R}^{d}}\!\hat{c}_{\theta}(f)\operatorname{e}^{\mathrm{i}\langle f,s\rangle}\mathrm{d}\!f,

    where Θ×d(θ,f)c^θ(f)\Theta\times\mathbb{R}^{d}\ni(\theta,f)\mapsto\hat{c}_{\theta}(f) is continuous and strictly positive.

Assumption A.2 (Regularity conditions on φ~m,θ\tilde{\varphi}_{m,\theta}).
  1. (1)

    For any θΘ\theta\in\Theta, for any m+m\in\mathbb{N}_{+}, the function φ~m,θ:(+,𝔅(+))(,𝔅())\tilde{\varphi}_{m,\theta}\colon(\mathbb{R}_{+},\mathfrak{B}(\mathbb{R}_{+}))\to(\mathbb{R},\mathfrak{B}(\mathbb{R})) is measurable.

  2. (2)

    For any m+m\in\mathbb{N}_{+}, φ~m,θ\tilde{\varphi}_{m,\theta} satisfies (1) of Assumption A.1, where respective constants C~\widetilde{C} and L~\widetilde{L} can be further chosen independently of m+m\in\mathbb{N}_{+}.

  3. (3)

    supθΘφ~m,θφθm0\sup_{\theta\in\Theta}{\left\lVert\tilde{\varphi}_{m,\theta}-\varphi_{\theta}\right\rVert}_{\infty}\xrightarrow[]{m\to\infty}0.

  4. (4)

    For any m+m\in\mathbb{N}_{+}, φ~m,θ\tilde{\varphi}_{m,\theta} satisfies (2) of Assumption A.1, where respective constants C~\widetilde{C}^{\prime} and L~\widetilde{L}^{\prime} can be further chosen independently of m+m\in\mathbb{N}_{+}.

  5. (5)

    For any q=1,2,3q=1,2,3, i1,,iq{1,,p}i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, we have that

    supθΘqφ~m,θθi1θiqqφθθi1θiqm0.\displaystyle\sup_{\theta\in\Theta}{\left\lVert\frac{\partial^{q}\tilde{\varphi}_{m,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}-\frac{\partial^{q}\varphi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\right\rVert}_{\infty}\xrightarrow[]{m\to\infty}0.

Note that the family {φθ:θΘ}\{\varphi_{\theta}\colon\theta\in\Theta\} satisfies Assumption A.1 if and only if {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} satisfies Assumption 3.1. Further, for any n+n\in\mathbb{N}_{+} and θΘ\theta\in\Theta, we have that

Σn,θ=[φθ(SiSj)]1i,jn,\displaystyle\Sigma_{n,\theta}=\big{[}\varphi_{\theta}(\left\lVert S_{i}-S_{j}\right\rVert)\big{]}_{1\leq i,j\leq n},

on (Ω,,)(\Omega,\mathcal{F},\mathbb{P}). Thus, a sequence of truncated-ML estimators for θ0\theta_{0} based on {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} is a sequence of truncated-ML estimators for θ0\theta_{0} based on {φθ:θΘ}\{\varphi_{\theta}\colon\theta\in\Theta\}. If we define a sequence of truncated-ML estimators (θ^n(φ~))n+\big{(}\hat{\theta}_{n}(\tilde{\varphi})\big{)}{}_{n\in\mathbb{N}_{+}} for θ0\theta_{0} based on a given {(φ~m,θ):θΘ}\big{\{}(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big{\}} upon replacing Σ~n,θ\widetilde{\Sigma}_{n,\theta} in (7) with the random n×nn\times n matrix [φ~r(n),θ(SiSj)]1i,jn\big{[}\tilde{\varphi}_{r(n),\theta}(\left\lVert S_{i}-S_{j}\right\rVert)\big{]}_{1\leq i,j\leq n}, we can recover the results of Sections 4 and 5:

Theorem A.1.

Let (θ^n(φ~))n+\big{(}\hat{\theta}_{n}(\tilde{\varphi})\big{)}{}_{n\in\mathbb{N}_{+}} be a sequence of truncated-ML estimators for θ0\theta_{0} based on {(φ~m,θ):θΘ}\big{\{}(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big{\}}. Assume that {φθ:θΘ}\{\varphi_{\theta}\colon\theta\in\Theta\} satisfies Assumption A.1 (regarding (2), q=1q=1 and the continuity of first order partial derivatives is sufficient) and {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} satisfies Assumption 5.1. Suppose further that {(φ~m,θ):θΘ}\big{\{}(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big{\}} satisfies Assumption A.2 (regarding (4) and (5), q=1q=1 and the continuity of first order partial derivatives is sufficient). Then,

θ^n(φ~)nθ0.\displaystyle\hat{\theta}_{n}(\tilde{\varphi})\xrightarrow[n\to\infty]{\mathbb{P}}\theta_{0}.
Theorem A.2.

Let (θ^n(φ~))n+\big{(}\hat{\theta}_{n}(\tilde{\varphi})\big{)}{}_{n\in\mathbb{N}_{+}} be a sequence of truncated-ML estimators for θ0\theta_{0} based on {(φ~m,θ):θΘ}\big{\{}(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big{\}}. Suppose that {φθ:θΘ}\{\varphi_{\theta}\colon\theta\in\Theta\} satisfies Assumption A.1 and {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} satisfies Assumptions 5.1 and 5.2. Assume further that {(φ~m,θ):θΘ}\big{\{}(\tilde{\varphi}_{m,\theta})\colon\theta\in\Theta\big{\}} satisfies Assumption A.2. Then, we have that

n1/2(θ^n(φ~)θ0)nd𝒩(0,Λ1),n^{1/2}\big{(}\hat{\theta}_{n}(\tilde{\varphi})-\theta_{0}\big{)}\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),

with Λ\Lambda as in Theorem 5.4.

Appendix B Supporting results

Let r:++r\colon\mathbb{N}_{+}\to\mathbb{N}_{+} be such that r(n)r(n)\xrightarrow{}\infty as nn\xrightarrow{}\infty. For the families {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} and {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}}, we introduce, for any n+n\in\mathbb{N}_{+} and θΘ\theta\in\Theta, for an arbitrary s(n)𝒢ns_{(n)}\in\mathcal{G}_{n}, for any q=1,2,3q=1,2,3, i1,,iq{1,,p}i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, the non-random n×nn\times n matrices

qΣθ(s(n))θi1θiq[qcθθi1θiq(sisj)]1i,jn,\displaystyle\frac{\partial^{q}\Sigma_{\theta}(s_{(n)})}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\coloneqq\bigg{[}\frac{\partial^{q}c_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(s_{i}-s_{j})\bigg{]}_{1\leq i,j\leq n},

and

qΣ~θ(s(n))θi1θiq[qc~r(n),θθi1θiq(sisj)]1i,jn,\displaystyle\frac{\partial^{q}\widetilde{\Sigma}_{\theta}(s_{(n)})}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\coloneqq\bigg{[}\frac{\partial^{q}\tilde{c}_{r(n),\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(s_{i}-s_{j})\bigg{]}_{1\leq i,j\leq n},

whenever the above partial derivatives with respect to θ\theta exist. Further, for Borel measurable sequences of functions {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}}, we introduce, on (Ω,,)(\Omega,\mathcal{F},\mathbb{P}), the n×nn\times n random matrices

ωqΣn,θθi1θiq(ω)qΣθ(S(n)(ω))θi1θiq,\displaystyle\omega\mapsto\frac{\partial^{q}\Sigma_{n,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(\omega)\coloneqq\frac{\partial^{q}\Sigma_{\theta}(S_{(n)}(\omega))}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}},

and

ωqΣ~n,θθi1θiq(ω)qΣ~θ(S(n)(ω))θi1θiq,\displaystyle\omega\mapsto\frac{\partial^{q}\widetilde{\Sigma}_{n,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(\omega)\coloneqq\frac{\partial^{q}\widetilde{\Sigma}_{\theta}(S_{(n)}(\omega))}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}},

whenever the above partial derivatives with respect to θ\theta exist.

Lemma B.1.

Let CC, L<L<\infty be some real constants. Consider g:d+g\colon\mathbb{R}^{d}\to\mathbb{R}_{+} such that gC(d;S)g\in\mathcal{B}_{\text{C}}(\mathbb{R}^{d};S), with SB[0;C]S\subset\mathrm{B}[0;C] and gL{\left\lVert g\right\rVert}_{\infty}\leq L. Then, for any i+i\in\mathbb{N}_{+}, for any sequence (sj)j+𝒢(s_{j})_{j\in\mathbb{N}_{+}}\in\mathcal{G},

j+g(sisj)LR(d,C,τ),\sum_{j\in\mathbb{N}_{+}}g(s_{i}-s_{j})\leq LR(d,C,\tau), (17)

where R(d,C,τ)(22ddCd1)/ΔτdR(d,C,\tau)\coloneqq(2^{2d}dC^{d-1})/\Delta_{\tau}^{d}, with Δτ=12τ\Delta_{\tau}=1-2\tau. Further, we also have that

j+|vivj|C+1g(sisj)=0.\sum_{\begin{subarray}{c}j\in\mathbb{N}_{+}\\ \left\lvert v_{i}-v_{j}\right\rvert_{\infty}\geq C+1\end{subarray}}g(s_{i}-s_{j})=0. (18)
Remark B.1.

We would like to point out that Lemma B.1 resembles Lemmas D.1 and D.3 of [4], where f:d+f\colon\mathbb{R}^{d}\to\mathbb{R}_{+}, which is such that f(s)1/(1+|s|d+1)f(s)\leq 1/(1+\left\lvert s\right\rvert_{\infty}^{d+1}), is replaced with a compactly supported function gg, defined as in Lemma B.1.
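As a quick numerical sanity check (not part of the formal development), the bound (17) can be verified on a simulated perturbed grid; the function g and all parameter values below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d, C, L, tau = 2, 3.0, 1.0, 0.2
delta_tau = 1.0 - 2.0 * tau

# A finite piece of a randomly perturbed regular grid in dimension d = 2.
v = np.array([(i, j) for i in range(30) for j in range(30)], dtype=float)
s = v + rng.uniform(-tau, tau, size=v.shape)

def g(h):
    """Non-negative, bounded by L, compactly supported in B[0; C]."""
    r = np.linalg.norm(h, axis=-1)
    return L * np.exp(-r) * (r <= C)

i = 450                                              # an interior point
row_sum = g(s[i] - s).sum()
bound = L * (2 ** (2 * d)) * d * C ** (d - 1) / delta_tau ** d
print(row_sum <= bound)                              # True: (17) holds
```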

Lemma B.2.

Let C~\widetilde{C}, L~<\widetilde{L}<\infty be some real constants. Consider a sequence of functions (gm)m+(g_{m}){}_{m\in\mathbb{N}_{+}}, with values in +\mathbb{R}_{+}, where for any m+m\in\mathbb{N}_{+}, gmC(d;S~m)g_{m}\in\mathcal{B}_{\text{C}}(\mathbb{R}^{d};\widetilde{S}_{m}), with S~mB[0;C~]\widetilde{S}_{m}\subset\mathrm{B}\big{[}0;\widetilde{C}\big{]} and gmL~{\left\lVert g_{m}\right\rVert}_{\infty}\leq\widetilde{L}. Then, for any i+i\in\mathbb{N}_{+}, for any sequence (sj)j+𝒢(s_{j})_{j\in\mathbb{N}_{+}}\in\mathcal{G},

supm+j+gm(sisj)L~R(d,C~,τ),\sup_{m\in\mathbb{N}_{+}}\sum_{j\in\mathbb{N}_{+}}g_{m}(s_{i}-s_{j})\leq\widetilde{L}R(d,\widetilde{C},\tau), (19)

where R(d,C~,τ)(22ddC~d1)/ΔτdR(d,\widetilde{C},\tau)\coloneqq(2^{2d}d\widetilde{C}^{d-1})/\Delta_{\tau}^{d}, with Δτ=12τ\Delta_{\tau}=1-2\tau. Further we also have that

supm+j+|vivj|C~+1gm(sisj)=0.\sup_{m\in\mathbb{N}_{+}}\sum_{\begin{subarray}{c}j\in\mathbb{N}_{+}\\ \left\lvert v_{i}-v_{j}\right\rvert_{\infty}\geq\widetilde{C}+1\end{subarray}}g_{m}(s_{i}-s_{j})=0. (20)
Lemma B.3.

Assume that {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} satisfies (1) and (3) of Assumption 3.1. Consider {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}} that satisfies (1), (2) and (3) of Assumption 3.2. Then, we have that

supn+sups(n)𝒢nsupθΘΣθ(s(n))2<supn+sups(n)𝒢nsupθΘΣ~θ(s(n))2<,\sup_{n\in\mathbb{N}_{+}}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}\big{\lVert}\Sigma_{\theta}(s_{(n)})\big{\rVert}_{2}<\infty\text{, }\sup_{n\in\mathbb{N}_{+}}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}\big{\lVert}\widetilde{\Sigma}_{\theta}(s_{(n)})\big{\rVert}_{2}<\infty, (21)

and in particular

sups(n)𝒢nsupθΘΣθ(s(n))Σ~θ(s(n))2n0.\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}\big{\lVert}\Sigma_{\theta}(s_{(n)})-\widetilde{\Sigma}_{\theta}(s_{(n)})\big{\rVert}_{2}\xrightarrow[]{n\to\infty}0. (22)

Further, we have that

infn+infs(n)𝒢ninfθΘλn(Σθ(s(n)))>0,\inf_{n\in\mathbb{N}_{+}}\inf_{s_{(n)}\in\mathcal{G}_{n}}\inf_{\theta\in\Theta}\lambda_{n}\big{(}\Sigma_{\theta}(s_{(n)})\big{)}>0, (23)

and there exists N+N\in\mathbb{N}_{+} such that

infnNinfs(n)𝒢ninfθΘλn(Σ~θ(s(n)))>0.\inf_{n\geq N}\inf_{s_{(n)}\in\mathcal{G}_{n}}\inf_{\theta\in\Theta}\lambda_{n}\big{(}\widetilde{\Sigma}_{\theta}(s_{(n)})\big{)}>0. (24)
Corollary B.4.

Let {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\}, {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}} and NN be as in Lemma B.3. Then, we have that

supn+sups(n)𝒢nsupθΘΣθ(s(n))12<supnNsups(n)𝒢nsupθΘΣ~θ(s(n))12<.\displaystyle\sup_{n\in\mathbb{N}_{+}}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}\big{\lVert}\Sigma_{\theta}(s_{(n)})^{-1}\big{\rVert}_{2}<\infty\text{, }\sup_{n\geq N}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}\big{\lVert}\widetilde{\Sigma}_{\theta}(s_{(n)})^{-1}\big{\rVert}_{2}<\infty.

In addition we can conclude that

sups(n)𝒢nsupθΘΣθ(s(n))+Σ~θ(s(n))+2n0.\displaystyle\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}\big{\lVert}\Sigma_{\theta}(s_{(n)})^{+}-\widetilde{\Sigma}_{\theta}(s_{(n)})^{+}\big{\rVert}_{2}\xrightarrow[]{n\to\infty}0.

In particular we have that  a.s.\mathbb{P}\text{ a.s.}

supn+supθΘΣn,θ12,supnNsupθΘΣ~n,θ12< and supθΘΣn,θ+Σ~n,θ+2n0.\displaystyle\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\big{\lVert}\Sigma_{n,\theta}^{-1}\big{\rVert}_{2},\;\sup_{n\geq N}\sup_{\theta\in\Theta}\big{\lVert}\widetilde{\Sigma}_{n,\theta}^{-1}\big{\rVert}_{2}<\infty\text{ and }\sup_{\theta\in\Theta}\big{\lVert}\Sigma_{n,\theta}^{+}-\widetilde{\Sigma}_{n,\theta}^{+}\big{\rVert}_{2}\xrightarrow[]{n\to\infty}0.
Lemma B.5.

Suppose that {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} satisfies (2) of Assumption 3.1. Consider {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}} that satisfies (1), (4) and (5) of Assumption 3.2. Then, for any q=1,2,3q=1,2,3, i1,,iq{1,,p}i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, we have that (21) and (22) of Lemma B.3 are satisfied with Σθ(s(n))\Sigma_{\theta}(s_{(n)}) and Σ~θ(s(n))\widetilde{\Sigma}_{\theta}(s_{(n)}) replaced with the respective partial derivatives qΣθ(s(n))θi1θiq\frac{\partial^{q}\Sigma_{\theta}(s_{(n)})}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}} and qΣ~θ(s(n))θi1θiq\frac{\partial^{q}\widetilde{\Sigma}_{\theta}(s_{(n)})}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}. In particular, for any q=1,2,3q=1,2,3, i1,,iq{1,,p}i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, we have that  a.s.\mathbb{P}\text{ a.s.}

supn+supθΘqΣn,θθi1θiq2<supn+supθΘqΣ~n,θθi1θiq2<,\displaystyle\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\bigg{\lVert}\frac{\partial^{q}\Sigma_{n,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\bigg{\rVert}_{2}<\infty\text{, }\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\bigg{\lVert}\frac{\partial^{q}\widetilde{\Sigma}_{n,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\bigg{\rVert}_{2}<\infty,

and in addition it is true that for any q=1,2,3q=1,2,3, i1,,iq{1,,p}i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\},  a.s.\mathbb{P}\text{ a.s.}

supθΘqΣn,θθi1θiqqΣ~n,θθi1θiq2n0.\displaystyle\sup_{\theta\in\Theta}\bigg{\lVert}\frac{\partial^{q}\Sigma_{n,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}-\frac{\partial^{q}\widetilde{\Sigma}_{n,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\bigg{\rVert}_{2}\xrightarrow[]{n\to\infty}0.
Lemma B.6.

Let I+I\in\mathbb{N}_{+} be fixed. On (Ω,,)(\Omega,\mathcal{F},\mathbb{P}), for k=1,,Ik=1,\dotsc,I, we consider a sequence of n×nn\times n random symmetric matrices (A~k,n,θ)n+\big{(}\widetilde{A}_{k,n,\theta}\big{)}{}_{n\in\mathbb{N}_{+}}, θΘ\theta\in\Theta, such that  a.s.\mathbb{P}\text{ a.s.}, for any k=1,,Ik=1,\dotsc,I, supn+supθΘA~k,n,θ2<\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\big{\lVert}\widetilde{A}_{k,n,\theta}\big{\rVert}_{2}<\infty. Further, we assume that there exists N+N\in\mathbb{N}_{+} such that  a.s.\mathbb{P}\text{ a.s.}, for k=1,,Ik=1,\dotsc,I, infnNinfθΘλn(A~k,n,θ)>0\inf_{n\geq N}\inf_{\theta\in\Theta}\lambda_{n}\big{(}\widetilde{A}_{k,n,\theta}\big{)}>0. Let (Ak,n,θ)n+\big{(}A_{k,n,\theta}\big{)}{}_{n\in\mathbb{N}_{+}}, θΘ\theta\in\Theta, k=1,,Ik=1,\dotsc,I, be another sequence of n×nn\times n random symmetric matrices, defined on the same probability space, which is such that  a.s.\mathbb{P}\text{ a.s.}, for k=1,,Ik=1,\dotsc,I,

supn+supθΘAk,n,θ2< and infnNinfθΘλn(Ak,n,θ)>0.\displaystyle\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}{\left\lVert A_{k,n,\theta}\right\rVert}_{2}<\infty\text{ and }\inf_{n\geq N}\inf_{\theta\in\Theta}\lambda_{n}(A_{k,n,\theta})>0.

Finally we also assume that  a.s.\mathbb{P}\text{ a.s.}, for any k=1,,Ik=1,\dotsc,I,

supθΘA~k,n,θAk,n,θ2n0.\displaystyle\sup_{\theta\in\Theta}\big{\lVert}\widetilde{A}_{k,n,\theta}-A_{k,n,\theta}\big{\rVert}_{2}\xrightarrow[]{n\to\infty}0.

Then, we have that  a.s.\mathbb{P}\text{ a.s.}

supθΘ|1nlog(det+(k=1IAk,n,θ))1nlog(det+(k=1IA~k,n,θ))|n0.\sup_{\theta\in\Theta}\left\lvert\frac{1}{n}\operatorname{log}\bigg{(}\operatorname{det}_{+}\bigg{(}\prod_{k=1}^{I}A_{k,n,\theta}\bigg{)}\bigg{)}-\frac{1}{n}\operatorname{log}\bigg{(}\operatorname{det}_{+}\bigg{(}\prod_{k=1}^{I}\widetilde{A}_{k,n,\theta}\bigg{)}\bigg{)}\right\rvert\xrightarrow[]{n\to\infty}0.
Lemma B.7.

On (Ω,,)(\Omega,\mathcal{F},\mathbb{P}), consider two sequences of n×nn\times n random matrices (An,θ)n+\big{(}A_{n,\theta}\big{)}{}_{n\in\mathbb{N}_{+}} and (A~n,θ)n+\big{(}\widetilde{A}_{n,\theta}\big{)}{}_{n\in\mathbb{N}_{+}}, θΘ\theta\in\Theta, such that  a.s.\mathbb{P}\text{ a.s.}

supθΘAn,θA~n,θ2n0.\displaystyle\sup_{\theta\in\Theta}\big{\lVert}A_{n,\theta}-\widetilde{A}_{n,\theta}\big{\rVert}_{2}\xrightarrow[]{n\to\infty}0.

Then, we have that

supθΘ1n|Z(n),An,θZ(n)Z(n),A~n,θZ(n)|n0.\sup_{\theta\in\Theta}\frac{1}{n}\big{\lvert}\langle Z_{(n)},A_{n,\theta}Z_{(n)}\rangle-\langle Z_{(n)},\widetilde{A}_{n,\theta}Z_{(n)}\rangle\big{\rvert}\xrightarrow[n\to\infty]{\mathbb{P}}0. (25)
Lemma B.8.

Suppose that {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} satisfies Assumptions 3.1 and 5.2 (regularity conditions for partial derivatives up to order q=2q=2 are sufficient). Suppose further that {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}} satisfies Assumption 3.2 (regularity conditions for partial derivatives up to order q=2q=2 are sufficient). Let NN be as in Proposition 5.1 and define {(Gn,N(θ)):n+θΘ}\big{\{}(G_{n,N}(\theta)){}_{n\in\mathbb{N}_{+}}\colon\theta\in\Theta\big{\}} and {(G~n,N(θ)):n+θΘ}\big{\{}\big{(}\widetilde{G}_{n,N}(\theta)\big{)}{}_{n\in\mathbb{N}_{+}}\colon\theta\in\Theta\big{\}} as in (10) and (11), respectively. We then have that

JG~n,N(θ0)JGn,N(θ0)2n0.\big{\lVert}J_{\widetilde{G}_{n,N}}(\theta_{0})-J_{G_{n,N}}(\theta_{0})\big{\rVert}_{2}\xrightarrow[n\to\infty]{\mathbb{P}}0. (26)

Further, we conclude that the random p×pp\times p matrix JG~n,N(θ0)J_{\widetilde{G}_{n,N}}(\theta_{0}) converges in probability \mathbb{P} to a non-random matrix 2Λ2\Lambda, where Sp×pΛ0.\mathrm{S}_{p\times p}\ni\Lambda\succ 0.

Appendix C Proofs

C.1 Proof of results in Appendix B

Proof of Lemma B.1.

Let (sj)j+𝒢\left(s_{j}\right)_{j\in\mathbb{N}_{+}}\in\mathcal{G}. For j+j\in\mathbb{N}_{+} such that |vivj|C+1\left\lvert v_{i}-v_{j}\right\rvert_{\infty}\geq C+1 we have that |sisj|C+12τC\left\lvert s_{i}-s_{j}\right\rvert_{\infty}\geq C+1-2\tau\geq C, since the random perturbations of the grid points are bounded by τ<1/2\tau<1/2, and thus sisjC\left\lVert s_{i}-s_{j}\right\rVert\geq C as well (since |w|w\left\lvert w\right\rvert_{\infty}\leq\left\lVert w\right\rVert for any wdw\in\mathbb{R}^{d}). Therefore, (18) follows since we have assumed that gg has compact support SB[0;C]S\subset\mathrm{B}\left[0;C\right]. The proof of (17) relies on the fact that there exists a minimal spacing Δτ>0\Delta_{\tau}>0 between any two distinct observation points (see (4)). This allows us to show that for some arbitrary i+i\in\mathbb{N}_{+}, if Nsi,CN_{s_{i},C} denotes the cardinality of the set {j+:sjsiC}{j+:|sjsi|C}\{j\in\mathbb{N}_{+}\colon\left\lVert s_{j}-s_{i}\right\rVert\leq C\}\subset\{j\in\mathbb{N}_{+}\colon\left\lvert s_{j}-s_{i}\right\rvert_{\infty}\leq C\}, we have that Nsi,CR(d,C,τ)N_{s_{i},C}\leq R\left(d,C,\tau\right). For a complete argument, see for example the proof of Lemma 4 in [16]. Using this, we can estimate

j+g(sisj)\displaystyle\sum_{j\in\mathbb{N}_{+}}g\left(s_{i}-s_{j}\right) =j+g(sisj)𝟙[0,C](sisj)\displaystyle=\sum_{j\in\mathbb{N}_{+}}g\left(s_{i}-s_{j}\right)\mathbbm{1}_{[0,C]}\left(\left\lVert s_{i}-s_{j}\right\rVert\right)
Lj+𝟙[0,C+1](sisj)\displaystyle\leq L\sum_{j\in\mathbb{N}_{+}}\mathbbm{1}_{[0,C+1]}\left(\left\lVert s_{i}-s_{j}\right\rVert\right)
LR(d,C,τ),\displaystyle\leq LR\left(d,C,\tau\right),

and thus also (17) is proven. ∎

Proof of Lemma B.2.

The proof is analogous to that of Lemma B.1 and is therefore omitted. ∎

Proof of Lemma B.3.

Let CC, LL and C~\widetilde{C}, L~\widetilde{L} be defined as in (1) of Assumption 3.1 and (2) of Assumption 3.2, respectively. We use Lemma B.1 to show that there exists a real constant M>0M>0, which does not depend on n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta, such that for any n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta,

max{Σθ(s(n))2,Σ~θ(s(n))2}M.\max\left\{\big{\lVert}\Sigma_{\theta}(s_{(n)})\big{\rVert}_{2},\big{\lVert}\widetilde{\Sigma}_{\theta}(s_{(n)})\big{\rVert}_{2}\right\}\leq M. (27)

To see this, let Cmax{C,C~}C_{*}\coloneqq\max\{C,\widetilde{C}\} and Lmax{L,L~}L_{*}\coloneqq\max\{L,\widetilde{L}\}. Using (1) of Assumption 3.1, we have that for any θΘ\theta\in\Theta, cθC(d;Sθ)c_{\theta}\in\mathcal{B}_{\text{C}}(\mathbb{R}^{d};S_{\theta}), where now SθB[0;C]S_{\theta}\subset\mathrm{B}\left[0;C_{*}\right] and cθL{\left\lVert c_{\theta}\right\rVert}_{\infty}\leq L_{*}, with CC_{*} and LL_{*} finite constants that are independent of n+n\in\mathbb{N}_{+} and θΘ\theta\in\Theta. Thus we can write, for any n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta, by the Gershgorin circle theorem,

Σθ(s(n))2\displaystyle{\left\lVert\Sigma_{\theta}(s_{(n)})\right\rVert}_{2} maxi=1,,nj=1n|cθ(sisj)|\displaystyle\leq\max_{i=1,\dotsc,n}\sum_{j=1}^{n}\left\lvert c_{\theta}\left(s_{i}-s_{j}\right)\right\rvert
supi+j+|cθ(sisj)|LR(d,C,τ)M,\displaystyle\leq\sup_{i\in\mathbb{N}_{+}}\sum_{j\in\mathbb{N}_{+}}\left\lvert c_{\theta}\left(s_{i}-s_{j}\right)\right\rvert\leq L_{*}R\left(d,C_{*},\tau\right)\eqqcolon M,

under application of Lemma B.1, with R(d,C,τ)=(22ddCd1)/ΔτdR\left(d,C_{*},\tau\right)=(2^{2d}dC_{*}^{d-1})/\Delta_{\tau}^{d}, where Δτ=12τ\Delta_{\tau}=1-2\tau. Note that MM is independent of n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta. Similarly, by (2) of Assumption 3.2 we then use Lemma B.2, together with the Gershgorin circle theorem, to show that for any n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta, Σ~θ(s(n))2M\lVert\widetilde{\Sigma}_{\theta}(s_{(n)})\rVert_{2}\leq M as well. This shows (27). Thus, we have established that

supn+sups(n)𝒢nsupθΘΣ~θ(s(n))2M,\displaystyle\sup_{n\in\mathbb{N}_{+}}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}{\left\lVert\widetilde{\Sigma}_{\theta}(s_{(n)})\right\rVert}_{2}\leq M,

and

supn+sups(n)𝒢nsupθΘΣθ(s(n))2M,\displaystyle\sup_{n\in\mathbb{N}_{+}}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}{\left\lVert\Sigma_{\theta}(s_{(n)})\right\rVert}_{2}\leq M,

and therefore (21) of Lemma B.3 is verified. It is shown in [4] (Proposition D.4) that because of the increasing-domain setting, where there exists a minimal distance between any two observation points (see (4)), and since (3) of Assumption 3.1 is satisfied,

infn+infs(n)𝒢ninfθΘλn(Σθ(s(n)))>0.\displaystyle\inf_{n\in\mathbb{N}_{+}}\inf_{s_{(n)}\in\mathcal{G}_{n}}\inf_{\theta\in\Theta}\lambda_{n}\left(\Sigma_{\theta}(s_{(n)})\right)>0.

This shows (23) of Lemma B.3. Using this result, we can fix some δ>0\delta>0 (small enough, independent of n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta), such that for any n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta,

0<εδR(d,C,τ)<mina=1a,Σθ(s(n))a.\displaystyle 0<\varepsilon\coloneqq\frac{\delta}{R\left(d,C_{*},\tau\right)}<\min_{\left\lVert a\right\rVert=1}\langle a,\Sigma_{\theta}(s_{(n)})a\rangle.

For the above δ>0\delta>0, we can then find N+N\in\mathbb{N}_{+} such that,

supnNsups(n)𝒢nsupθΘΣθ(s(n))Σ~θ(s(n))2δ.\sup_{n\geq N}\sup_{s_{(n)}\in\mathcal{G}_{n}}\sup_{\theta\in\Theta}{\left\lVert\Sigma_{\theta}(s_{(n)})-\widetilde{\Sigma}_{\theta}(s_{(n)})\right\rVert}_{2}\leq\delta. (28)

This is valid since for the given ε>0\varepsilon>0, by the uniform convergence of (c~r(n),θ)(\tilde{c}_{r(n),\theta}) to cθc_{\theta} (see (3) of Assumption 3.2), we find N+N\in\mathbb{N}_{+} such that for any nNn\geq N, for any s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and 1i,jn1\leq i,j\leq n,

supθΘ|[Σθ(s(n))]i,j[Σ~θ(s(n))]i,j|<ε.\displaystyle\sup_{\theta\in\Theta}\left\lvert\left[\Sigma_{\theta}(s_{(n)})\right]_{i,j}-\left[\widetilde{\Sigma}_{\theta}(s_{(n)})\right]_{i,j}\right\rvert<\varepsilon.

Then, if we define

dsgr(n)(s)supθΘ|(cθc~r(n),θ)(s)|,nN,\displaystyle\mathbb{R}^{d}\ni s\mapsto g_{r(n)}(s)\coloneqq\sup_{\theta\in\Theta}\left\lvert\left(c_{\theta}-\tilde{c}_{r(n),\theta}\right)(s)\right\rvert,\quad n\geq N,

since we have assumed that the families {cθ:θΘ}\{c_{\theta}\colon\theta\in\Theta\} and {(c~m,θ):θΘ}\big{\{}(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big{\}} have compact supports, which belong to B[0;C]\mathrm{B}\left[0;C_{*}\right], we have that gr(n)(s)=0g_{r(n)}(s)=0 for sC\lVert s\rVert\geq C_{*}. Thus, by the Gershgorin circle theorem, under application of Lemma B.2, for nNn\geq N and s(n)𝒢ns_{(n)}\in\mathcal{G}_{n},

Σ~θ(s(n))Σθ(s(n))2\displaystyle{\left\lVert\widetilde{\Sigma}_{\theta}(s_{(n)})-\Sigma_{\theta}(s_{(n)})\right\rVert}_{2} maxi=1,,nj=1ngr(n)(sisj)\displaystyle\leq\max_{i=1,\dotsc,n}\sum_{j=1}^{n}g_{r(n)}\left(s_{i}-s_{j}\right)
supi+j+gr(n)(sisj)εR(d,C,τ).\displaystyle\leq\sup_{i\in\mathbb{N}_{+}}\sum_{j\in\mathbb{N}_{+}}g_{r(n)}\left(s_{i}-s_{j}\right)\leq\varepsilon R\left(d,C_{*},\tau\right).

Since εR(d,C,τ)\varepsilon R\left(d,C_{*},\tau\right) is independent of n+n\in\mathbb{N}_{+}, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta, we can conclude that (28) must be satisfied. Using (28), we have, for nNn\geq N, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta, and for vectors aa such that a=1\left\lVert a\right\rVert=1, that

|a,Σθ(s(n))aa,Σ~θ(s(n))a|\displaystyle\left\lvert\langle a,\Sigma_{\theta}(s_{(n)})a\rangle-\langle a,\widetilde{\Sigma}_{\theta}(s_{(n)})a\rangle\right\rvert =|a,(Σθ(s(n))Σ~θ(s(n)))a|\displaystyle=\left\lvert\langle a,\big{(}\Sigma_{\theta}(s_{(n)})-\widetilde{\Sigma}_{\theta}(s_{(n)})\big{)}a\rangle\right\rvert
Σθ(s(n))Σ~θ(s(n))2δ,\displaystyle\leq{\left\lVert\Sigma_{\theta}(s_{(n)})-\widetilde{\Sigma}_{\theta}(s_{(n)})\right\rVert}_{2}\leq\delta,

under application of the Cauchy–Schwarz inequality. In conclusion we have for vectors aa such that a=1\left\lVert a\right\rVert=1, for nNn\geq N, s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and θΘ\theta\in\Theta,

mina=1a,Σθ(s(n))aδ\displaystyle\min_{\left\lVert a\right\rVert=1}\langle a,\Sigma_{\theta}(s_{(n)})a\rangle-\delta mina=1a,Σ~θ(s(n))a.\displaystyle\leq\min_{\left\lVert a\right\rVert=1}\langle a,\widetilde{\Sigma}_{\theta}(s_{(n)})a\rangle.

But we know that infnNinfs(n)𝒢ninfθΘmina=1a,Σθ(s(n))a>0\inf_{n\geq N}\inf_{s_{(n)}\in\mathcal{G}_{n}}\inf_{\theta\in\Theta}\min_{\left\lVert a\right\rVert=1}\langle a,\Sigma_{\theta}(s_{(n)})a\rangle>0 and δ>0\delta>0 was chosen small enough (but otherwise arbitrary). Thus, we have also proven (24) of Lemma B.3. Notice that (22) is proven with (28), hence the proof of Lemma B.3 is complete. ∎

Proof of Corollary B.4.

This follows from Lemma B.3. ∎

Proof of Lemma B.5.

We omit a formal argument and note that one can prove Lemma B.5 by the same reasoning as in the proof of Lemma B.3. ∎

Proof of Lemma B.6.

For nNn\geq N (NN as in the statement) and θΘ\theta\in\Theta, we can write  a.s.\mathbb{P}\text{ a.s.}

det+(k=1IA~k,n,θ)=det(A~I,n,θ1/2A~2,n,θ1/2A~1,n,θA~2,n,θ1/2A~I,n,θ1/2B~n,θ),\displaystyle\operatorname{det}_{+}\bigg{(}\prod_{k=1}^{I}\widetilde{A}_{k,n,\theta}\bigg{)}=\operatorname{det}\big{(}\underbrace{\widetilde{A}_{I,n,\theta}^{1/2}\cdots\widetilde{A}_{2,n,\theta}^{1/2}\widetilde{A}_{1,n,\theta}\widetilde{A}_{2,n,\theta}^{1/2}\cdots\widetilde{A}_{I,n,\theta}^{1/2}}_{\eqqcolon\widetilde{B}_{n,\theta}}\big{)},

and

det+(k=1IAk,n,θ)=det(AI,n,θ1/2A2,n,θ1/2A1,n,θA2,n,θ1/2AI,n,θ1/2Bn,θ).\displaystyle\operatorname{det}_{+}\bigg{(}\prod_{k=1}^{I}A_{k,n,\theta}\bigg{)}=\operatorname{det}\big{(}\underbrace{A_{I,n,\theta}^{1/2}\cdots A_{2,n,\theta}^{1/2}A_{1,n,\theta}A_{2,n,\theta}^{1/2}\cdots A_{I,n,\theta}^{1/2}}_{\eqqcolon B_{n,\theta}}\big{)}.

Note that B~n,θ\widetilde{B}_{n,\theta} and Bn,θB_{n,\theta} are random symmetric matrices. Further, for each of the random symmetric matrices

A~I,n,θ1/2,,A~2,n,θ1/2,A~1,n,θ,AI,n,θ1/2,,A2,n,θ1/2,A1,n,θ,\displaystyle\widetilde{A}_{I,n,\theta}^{1/2},\dotsc,\widetilde{A}_{2,n,\theta}^{1/2},\;\widetilde{A}_{1,n,\theta},\;A_{I,n,\theta}^{1/2},\dotsc,A_{2,n,\theta}^{1/2},\;A_{1,n,\theta},

the smallest eigenvalue is strictly greater than zero,  a.s.\mathbb{P}\text{ a.s.}, uniformly in nNn\geq N and θΘ\theta\in\Theta and hence we have that

infnNinfθΘλn(Bn,θ)>0 and infnNinfθΘλn(B~n,θ)>0, a.s.\inf_{n\geq N}\inf_{\theta\in\Theta}\lambda_{n}\left(B_{n,\theta}\right)>0\text{ and }\inf_{n\geq N}\inf_{\theta\in\Theta}\lambda_{n}\left(\widetilde{B}_{n,\theta}\right)>0,\;\;\mathbb{P}\text{ a.s.} (29)

In addition, since  a.s.\mathbb{P}\text{ a.s.} for k=1,,Ik=1,\dotsc,I, by assumption

supnNsupθΘA~k,n,θ2,supnNsupθΘAk,n,θ2<,\displaystyle\sup_{n\geq N}\sup_{\theta\in\Theta}{\left\lVert\widetilde{A}_{k,n,\theta}\right\rVert}_{2},\;\sup_{n\geq N}\sup_{\theta\in\Theta}{\left\lVert A_{k,n,\theta}\right\rVert}_{2}<\infty,

and

supθΘA~k,n,θAk,n,θ2n0,\displaystyle\sup_{\theta\in\Theta}{\left\lVert\widetilde{A}_{k,n,\theta}-A_{k,n,\theta}\right\rVert}_{2}\xrightarrow[]{n\to\infty}0,

we also have that

supnNsupθΘB~n,θ2,supnNsupθΘBn,θ2<,\sup_{n\geq N}\sup_{\theta\in\Theta}{\left\lVert\widetilde{B}_{n,\theta}\right\rVert}_{2},\;\sup_{n\geq N}\sup_{\theta\in\Theta}{\left\lVert B_{n,\theta}\right\rVert}_{2}<\infty, (30)

and

supθΘBn,θB~n,θ2n0.\sup_{\theta\in\Theta}{\left\lVert B_{n,\theta}-\widetilde{B}_{n,\theta}\right\rVert}_{2}\xrightarrow[]{n\to\infty}0. (31)

Using (29), (30) and (31), we pick an arbitrary δ>0\delta>0 and define

0<εδsupnNsupθΘBn,θ2\displaystyle 0<\varepsilon\coloneqq\frac{\delta}{\sup_{n\geq N}\sup_{\theta\in\Theta}{\left\lVert B_{n,\theta}\right\rVert}_{2}}

such that for some given integer NNN^{*}\geq N,  a.s.\mathbb{P}\text{ a.s.},

supnNsupθΘB~n,θ1Bn,θ12<ε.\displaystyle\sup_{n\geq N^{*}}\sup_{\theta\in\Theta}{\left\lVert\widetilde{B}_{n,\theta}^{-1}-B_{n,\theta}^{-1}\right\rVert}_{2}<\varepsilon.

Now write

1nlog(det(k=1IAk,n,θ)det(k=1IA~k,n,θ))\displaystyle\frac{1}{n}\operatorname{log}\left(\frac{\operatorname{det}(\prod_{k=1}^{I}A_{k,n,\theta})}{\operatorname{det}(\prod_{k=1}^{I}\widetilde{A}_{k,n,\theta})}\right) =1nlog(det(Bn,θB~n,θ1))\displaystyle=\frac{1}{n}\operatorname{log}\left(\operatorname{det}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right)\right)
=1ntr(log(Bn,θB~n,θ1))\displaystyle=\frac{1}{n}\operatorname{tr}\left(\operatorname{log}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right)\right)
=1ni=1nlog(λi(Bn,θB~n,θ1)).\displaystyle=\frac{1}{n}\sum_{i=1}^{n}\operatorname{log}\left(\lambda_{i}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right)\right). (32)

We can then estimate (32) from above and below as

log(λn(Bn,θB~n,θ1))1ni=1nlog(λi(Bn,θB~n,θ1))log(λ1(Bn,θB~n,θ1)).\displaystyle\operatorname{log}\left(\lambda_{n}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right)\right)\leq\frac{1}{n}\sum_{i=1}^{n}\operatorname{log}\left(\lambda_{i}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right)\right)\leq\operatorname{log}\left(\lambda_{1}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right)\right).

But for the given ε>0\varepsilon>0, for nNn\geq N^{*}, we have that  a.s.\mathbb{P}\text{ a.s.}

λ1(Bn,θB~n,θ1)\displaystyle\lambda_{1}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right) λ1(Bn,θ)λ1(B~n,θ1)\displaystyle\leq\lambda_{1}\left(B_{n,\theta}\right)\lambda_{1}\left(\widetilde{B}_{n,\theta}^{-1}\right)
=Bn,θ2B~n,θ12\displaystyle=\big{\lVert}B_{n,\theta}\big{\rVert}_{2}\big{\lVert}\widetilde{B}_{n,\theta}^{-1}\big{\rVert}_{2}
Bn,θ2B~n,θ1Bn,θ12+Bn,θ2Bn,θ12\displaystyle\leq\big{\lVert}B_{n,\theta}\big{\rVert}_{2}\big{\lVert}\widetilde{B}_{n,\theta}^{-1}-B_{n,\theta}^{-1}\big{\rVert}_{2}+\big{\lVert}B_{n,\theta}\big{\rVert}_{2}\big{\lVert}B_{n,\theta}^{-1}\big{\rVert}_{2}
1+δ.\displaystyle\leq 1+\delta.

On the other hand, by (31), we also have that for nNn\geq N^{*},  a.s.\mathbb{P}\text{ a.s.}

λn(Bn,θB~n,θ1)\displaystyle\lambda_{n}\left(B_{n,\theta}\widetilde{B}_{n,\theta}^{-1}\right) λn(Bn,θ)λn(B~n,θ1)\displaystyle\geq\lambda_{n}\left(B_{n,\theta}\right)\lambda_{n}\left(\widetilde{B}_{n,\theta}^{-1}\right)
=(min{a:a=1}a,Bn,θa)(min{a:a=1}a,B~n,θ1a)\displaystyle=\left(\min_{\{a\colon\left\lVert a\right\rVert=1\}}\left\langle a,B_{n,\theta}a\right\rangle\right)\left(\min_{\{a\colon\left\lVert a\right\rVert=1\}}\left\langle a,\widetilde{B}_{n,\theta}^{-1}a\right\rangle\right)
(min{a:a=1}a,Bn,θa)(min{a:a=1}a,Bn,θ1aε)\displaystyle\geq\left(\min_{\{a\colon\left\lVert a\right\rVert=1\}}\left\langle a,B_{n,\theta}a\right\rangle\right)\left(\min_{\{a\colon\left\lVert a\right\rVert=1\}}\left\langle a,B_{n,\theta}^{-1}a\right\rangle-\varepsilon\right)
1δ.\displaystyle\geq 1-\delta.

Since δ>0\delta>0 was arbitrary and independent of θΘ\theta\in\Theta, the lemma is proven. ∎

Proof of Lemma B.7.

First, using the Cauchy–Schwarz inequality and the compatibility of the spectral norm with the Euclidean norm, we can estimate  a.s.\mathbb{P}\text{ a.s.}

1n|Z(n),An,θZ(n)Z(n),A~n,θZ(n)|supθΘAn,θA~n,θ2Z(n)2n\displaystyle\frac{1}{n}\left\lvert\langle Z_{(n)},A_{n,\theta}Z_{(n)}\rangle-\langle Z_{(n)},\widetilde{A}_{n,\theta}Z_{(n)}\rangle\right\rvert\leq\sup_{\theta\in\Theta}{\left\lVert A_{n,\theta}-\widetilde{A}_{n,\theta}\right\rVert}_{2}\frac{\left\lVert Z_{(n)}\right\rVert^{2}}{n}

Let us fix some arbitrary ε>0\varepsilon>0 such that for nn large enough we have that  a.s.\mathbb{P}\text{ a.s.},

supθΘAn,θA~n,θ2<ε.\displaystyle\sup_{\theta\in\Theta}{\left\lVert A_{n,\theta}-\widetilde{A}_{n,\theta}\right\rVert}_{2}<\varepsilon.

Then, let δ>0\delta>0 be arbitrary and notice that

(εn1Z(n)2>δ|S(n)=s(n))=(εn1Σθ0(s(n))1/2Vn2>δ),\displaystyle\mathbb{P}\left(\varepsilon n^{-1}\left\lVert Z_{(n)}\right\rVert^{2}>\delta\;\middle|\;S_{(n)}=s_{(n)}\right)=\mathbb{P}\left(\varepsilon n^{-1}\left\lVert\Sigma_{\theta_{0}}(s_{(n)})^{1/2}V_{n}\right\rVert^{2}>\delta\right),

where VnV_{n} is a Gaussian vector, defined on (Ω,,)\left(\Omega,\mathcal{F},\mathbb{P}\right), with zero mean and identity covariance matrix. Then, we use Markov’s inequality to estimate

(εn1Σθ0(s(n))1/2Vn2>δ)\displaystyle\mathbb{P}\left(\varepsilon n^{-1}\left\lVert\Sigma_{\theta_{0}}(s_{(n)})^{1/2}V_{n}\right\rVert^{2}>\delta\right) εn1δ1𝔼[Σθ0(s(n))1/2Vn2]\displaystyle\leq\varepsilon n^{-1}\delta^{-1}\mathbb{E}\left[\left\lVert\Sigma_{\theta_{0}}(s_{(n)})^{1/2}V_{n}\right\rVert^{2}\right]
εδ1Σθ0(s(n))1/222,\displaystyle\leq\varepsilon\delta^{-1}{\left\lVert\Sigma_{\theta_{0}}(s_{(n)})^{1/2}\right\rVert}_{2}^{2},

where the latter term is bounded uniformly in s(n)𝒢ns_{(n)}\in\mathcal{G}_{n} and n+n\in\mathbb{N}_{+} (see Lemma B.3). Thus we conclude that

sups(n)𝒢n(supθΘAn,θA~n,θ2Z(n)2n>δ|S(n)=s(n))n0,\displaystyle\sup_{s_{(n)}\in\mathcal{G}_{n}}\mathbb{P}\left(\sup_{\theta\in\Theta}{\left\lVert A_{n,\theta}-\widetilde{A}_{n,\theta}\right\rVert}_{2}\frac{\left\lVert Z_{(n)}\right\rVert^{2}}{n}>\delta\;\middle|\;S_{(n)}=s_{(n)}\right)\xrightarrow[]{n\to\infty}0,

which shows that

supθΘAn,θA~n,θ2Z(n)2nn0,\displaystyle\sup_{\theta\in\Theta}{\left\lVert A_{n,\theta}-\widetilde{A}_{n,\theta}\right\rVert}_{2}\frac{\left\lVert Z_{(n)}\right\rVert^{2}}{n}\xrightarrow[n\to\infty]{\mathbb{P}}0,

and thus the proof is complete. ∎

Proof of Lemma B.8.

For n+n\in\mathbb{N}_{+}, let h(n)=n+N1h(n)=n+N-1. Then, for k=1,,pk=1,\dotsc,p, we have that  a.s.\mathbb{P}\text{ a.s.}

l~n,Nθk(θ0)=1h(n)(tr(Σ~h(n),θ01Σ~h(n),θ0θk)Z(h(n)),Σ~h(n),θ01Σ~h(n),θ0θkΣ~h(n),θ01Z(h(n))),\displaystyle\begin{split}\frac{\partial\tilde{l}_{n,N}}{\partial\theta_{k}}(\theta_{0})&=\frac{1}{h(n)}\bigg{(}\operatorname{tr}\bigg{(}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\bigg{)}\\ &\quad-\langle Z_{(h(n))},\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}Z_{(h(n))}\rangle\bigg{)},\end{split}

and

𝔼[l~n,Nθk(θ0)|S(h(n))]=1h(n)(tr(Σ~h(n),θ01Σ~h(n),θ0θk)tr(Σ~h(n),θ01Σ~h(n),θ0θkΣ~h(n),θ01Σh(n),θ0)).\begin{split}\mathbb{E}\left[\frac{\partial\tilde{l}_{n,N}}{\partial\theta_{k}}(\theta_{0})\;\middle|\;S_{(h(n))}\right]&=\frac{1}{h(n)}\bigg{(}\operatorname{tr}\bigg{(}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\bigg{)}\\ &\quad-\operatorname{tr}\bigg{(}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}}\bigg{)}\bigg{)}.\end{split}

Similar expressions can then be calculated for ln,Nl_{n,N} based on Σh(n),θ0\Sigma_{h(n),\theta_{0}}. We can further calculate, for n+n\in\mathbb{N}_{+}, for 1k,lp1\leq k,l\leq p,  a.s.\mathbb{P}\text{ a.s.},

2l~n,Nθkθl(θ0)=1h(n)tr(A~1,h(n),θ0kl)+1h(n)Z(h(n)),A~2,h(n),θ0klZ(h(n)),\displaystyle\frac{\partial^{2}\tilde{l}_{n,N}}{\partial\theta_{k}\partial\theta_{l}}(\theta_{0})=\frac{1}{h(n)}\operatorname{tr}\left(\widetilde{A}_{1,h(n),\theta_{0}}^{kl}\right)+\frac{1}{h(n)}\langle Z_{(h(n))},\widetilde{A}_{2,h(n),\theta_{0}}^{kl}Z_{(h(n))}\rangle,

where

A~1,h(n),θ0kl\displaystyle\widetilde{A}_{1,h(n),\theta_{0}}^{kl} Σ~h(n),θ01Σ~h(n),θ0θkΣ~h(n),θ01Σ~h(n),θ0θl+Σ~h(n),θ012Σ~h(n),θ0θkθl,\displaystyle\coloneqq-\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{l}}+\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial^{2}\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}\partial\theta_{l}}, (33)

and

A~2,h(n),θ0kl2Σ~h(n),θ01Σ~h(n),θ0θkΣ~h(n),θ01Σ~h(n),θ0θlΣ~h(n),θ01Σ~h(n),θ012Σ~h(n),θ0θkθlΣ~h(n),θ01.\displaystyle\begin{split}\widetilde{A}_{2,h(n),\theta_{0}}^{kl}&\coloneqq 2\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{l}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\\ &\quad-\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial^{2}\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}\partial\theta_{l}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}.\end{split} (34)

In addition, for n+n\in\mathbb{N}_{+}, we also have that  a.s.\mathbb{P}\text{ a.s.},

(𝔼[l~n,Nθl(θ0)|S(h(n))])θk\displaystyle\frac{\partial\left(\mathbb{E}\left[\frac{\partial\tilde{l}_{n,N}}{\partial\theta_{l}}(\theta_{0})\;\middle|\;S_{(h(n))}\right]\right)}{\partial\theta_{k}} =1h(n)(tr(A~1,h(n),θ0kl)+tr(A~2,h(n),θ0klΣh(n),θ0)).\displaystyle=\frac{1}{h(n)}\bigg{(}\operatorname{tr}\left(\widetilde{A}_{1,h(n),\theta_{0}}^{kl}\right)+\operatorname{tr}\left(\widetilde{A}_{2,h(n),\theta_{0}}^{kl}\Sigma_{h(n),\theta_{0}}\right)\bigg{)}.

Again, similar expressions can be obtained for ln,Nl_{n,N} based on Σh(n),θ0\Sigma_{h(n),\theta_{0}}, where for n+n\in\mathbb{N}_{+}, 1k,lp1\leq k,l\leq p, the respective terms A1,h(n),θ0klA_{1,h(n),\theta_{0}}^{kl} and A2,h(n),θ0klA_{2,h(n),\theta_{0}}^{kl} are defined as in (33) and (34), respectively, but Σ~h(n),θ0\widetilde{\Sigma}_{h(n),\theta_{0}} is replaced with Σh(n),θ0\Sigma_{h(n),\theta_{0}}. Then, we have for n+n\in\mathbb{N}_{+}, for k,l=1,,pk,l=1,\dotsc,p,  a.s.\mathbb{P}\text{ a.s.},

\[
\begin{split}
\left\lvert\frac{\partial^{2}\tilde{l}_{n,N}}{\partial\theta_{k}\partial\theta_{l}}(\theta_{0})-\frac{\partial^{2}l_{n,N}}{\partial\theta_{k}\partial\theta_{l}}(\theta_{0})\right\rvert&\leq\frac{1}{h(n)}\left\lvert\operatorname{tr}\left(\widetilde{A}_{1,h(n),\theta_{0}}^{kl}-A_{1,h(n),\theta_{0}}^{kl}\right)\right\rvert\\
&\quad+\frac{1}{h(n)}\left\lvert\big\langle Z_{(h(n))},\big(\widetilde{A}_{2,h(n),\theta_{0}}^{kl}-A_{2,h(n),\theta_{0}}^{kl}\big)Z_{(h(n))}\big\rangle\right\rvert\\
&\leq\frac{1}{h(n)}\left\lvert\operatorname{tr}\left(\widetilde{A}_{1,h(n),\theta_{0}}^{kl}-A_{1,h(n),\theta_{0}}^{kl}\right)\right\rvert\\
&\quad+\left\lVert\widetilde{A}_{2,h(n),\theta_{0}}^{kl}-A_{2,h(n),\theta_{0}}^{kl}\right\rVert_{2}\frac{\left\lVert Z_{(h(n))}\right\rVert^{2}}{h(n)}.
\end{split}
\]

We can apply Lemma B.7 to the sequences of random matrices $\big(\widetilde{A}_{2,h(n),\theta_{0}}^{kl}\big)_{n\in\mathbb{N}_{+}}$ and $\big(A_{2,h(n),\theta_{0}}^{kl}\big)_{n\in\mathbb{N}_{+}}$ to conclude, under application of Lemma 4.1 (see also Corollary B.4 and Lemma B.5), that

\[
\left\lVert\widetilde{A}_{2,h(n),\theta_{0}}^{kl}-A_{2,h(n),\theta_{0}}^{kl}\right\rVert_{2}\frac{\left\lVert Z_{(h(n))}\right\rVert^{2}}{h(n)}\xrightarrow[n\to\infty]{\mathbb{P}}0.
\]

We also have, $\mathbb{P}$ a.s.,

\[
\frac{1}{h(n)}\left\lvert\operatorname{tr}\left(\widetilde{A}_{1,h(n),\theta_{0}}^{kl}-A_{1,h(n),\theta_{0}}^{kl}\right)\right\rvert\xrightarrow[]{n\to\infty}0,
\]

using the triangle inequality, von Neumann's trace inequality and Lemma 4.1 (see also Corollary B.4 and Lemma B.5). Hence, we have shown that for any $k,l=1,\dotsc,p$,

\[
\left\lvert\frac{\partial^{2}\tilde{l}_{n,N}}{\partial\theta_{k}\partial\theta_{l}}(\theta_{0})-\frac{\partial^{2}l_{n,N}}{\partial\theta_{k}\partial\theta_{l}}(\theta_{0})\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0.
\]

In addition, we have that for any $k,l=1,\dotsc,p$, $\mathbb{P}$ a.s., the expression

\[
\left\lvert\frac{\partial}{\partial\theta_{k}}\left(\mathbb{E}\left[\frac{\partial\tilde{l}_{n,N}}{\partial\theta_{l}}(\theta_{0})\;\middle|\;S_{(h(n))}\right]\right)-\frac{\partial}{\partial\theta_{k}}\left(\mathbb{E}\left[\frac{\partial l_{n,N}}{\partial\theta_{l}}(\theta_{0})\;\middle|\;S_{(h(n))}\right]\right)\right\rvert
\]

is bounded from above by

\[
\frac{\left\lvert\operatorname{tr}\big(\widetilde{A}_{1,h(n),\theta_{0}}^{kl}-A_{1,h(n),\theta_{0}}^{kl}\big)\right\rvert+\left\lvert\operatorname{tr}\big(\big(\widetilde{A}_{2,h(n),\theta_{0}}^{kl}-A_{2,h(n),\theta_{0}}^{kl}\big)\Sigma_{h(n),\theta_{0}}\big)\right\rvert}{h(n)},
\]

which again, under application of the triangle inequality, von Neumann's trace inequality and Lemma 4.1 (see also Corollary B.4 and Lemma B.5), converges to zero $\mathbb{P}$ a.s. Hence, we have shown that for any $k,l=1,\dotsc,p$,

\[
\left\lvert\big(J_{\widetilde{G}_{n,N}}(\theta_{0})\big)_{kl}-\big(J_{G_{n,N}}(\theta_{0})\big)_{kl}\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0,
\]

which concludes the proof of (26). Now it is shown in [4] (see Propositions D.7 and D.8, and also consider the proofs of Propositions 3.2 and 3.3), under application of Lemmas B.1, B.2, 4.1, B.5 and Corollary B.4, that

\[
J_{G_{n,N}}(\theta_{0})\xrightarrow[n\to\infty]{\mathbb{P}}2\Lambda,
\]

where $\Lambda$ is the $\mathbb{P}$ a.s. limit of the sequence of $p\times p$ matrices $\big(H_{h(n)}(\theta_{0})\big)_{n\in\mathbb{N}_{+}}$ defined as

\[
\left\{\left[\frac{1}{2h(n)}\operatorname{tr}\left(\Sigma_{h(n),\theta_{0}}^{-1}\frac{\partial\Sigma_{h(n),\theta_{0}}}{\partial\theta_{k}}\Sigma_{h(n),\theta_{0}}^{-1}\frac{\partial\Sigma_{h(n),\theta_{0}}}{\partial\theta_{l}}\right)\right]_{1\leq k,l\leq p}\colon n\in\mathbb{N}_{+}\right\}.
\]
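As a side illustration (not from the paper), the entries of $H_{h(n)}(\theta_{0})$ are cheap to evaluate numerically; the sketch below, for a toy one-dimensional exponential model with illustrative names, assembles this matrix and checks that it is positive definite, mirroring the role of $\Lambda\succ 0$ below:

```python
import numpy as np

# Sketch (toy model, illustrative names): assembles the matrix with entries
#   [H]_{kl} = (1/(2h)) tr(Sigma^{-1} dSigma_k Sigma^{-1} dSigma_l)
# for a two-parameter (variance, range) exponential covariance and checks
# positive definiteness.

rng = np.random.default_rng(1)
h = 80
s = np.sort(rng.uniform(0.0, 40.0, h))
D = np.abs(s[:, None] - s[None, :])

var0, range0 = 1.5, 2.0
S = var0 * np.exp(-D / range0)
dS = [np.exp(-D / range0),                            # derivative in var
      var0 * np.exp(-D / range0) * D / range0**2]     # derivative in range

Sinv_dS = [np.linalg.solve(S, M) for M in dS]
H = np.array([[np.trace(A @ B) for B in Sinv_dS] for A in Sinv_dS]) / (2 * h)
print(np.linalg.eigvalsh(H))        # all eigenvalues positive in this example
```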

Further, by Assumption 5.2, it is concluded that the limit satisfies $\Lambda\succ 0$. Then, using (26), we show that

\[
J_{\widetilde{G}_{n,N}}(\theta_{0})\xrightarrow[n\to\infty]{\mathbb{P}}2\Lambda,
\]

as well, which concludes the proof of Lemma B.8. ∎

C.2 Proof of results in Section 4

Proof of Lemma 4.1.

The statement follows directly from Lemma B.3. ∎

C.3 Proof of results in Section 5

To simplify the notation, we write $\big(\tilde{\theta}_{n}\big)_{n\in\mathbb{N}_{+}}\coloneqq\big(\hat{\theta}_{n}(\tilde{c})\big)_{n\in\mathbb{N}_{+}}$.

Proof of Proposition 5.1.

The statement is verified as a consequence of Lemmas 4.1, B.6 and B.7. ∎

Proof of Theorem 5.2.

Let $N\in\mathbb{N}_{+}$ be as in Lemma 4.1 (or Proposition 5.1) and define, for any $\omega\in\Omega$, the sequence $\big(\tilde{l}_{n,N}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}}$ as in (9) of Section 5.1. We note that, under the given assumptions of Theorem 5.2, the first order partial derivatives with respect to $\theta$ exist for the sequence $\big(\tilde{l}_{n,N}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}}$. Then, we define the sequence of estimators $\big(\tilde{\theta}_{n,N}\big)_{n\in\mathbb{N}_{+}}\coloneqq\big(\tilde{\theta}_{n+N-1}\big)_{n\in\mathbb{N}_{+}}$. Therefore $\tilde{\theta}_{n,N}$ minimizes $\tilde{l}_{n,N}(\theta)$ $\mathbb{P}$ a.s. for any $n\in\mathbb{N}_{+}$. To prove that

\[
\tilde{\theta}_{n}\xrightarrow[n\to\infty]{\mathbb{P}}\theta_{0},
\]

it is sufficient to show that

\[
\tilde{\theta}_{n,N}\xrightarrow[n\to\infty]{\mathbb{P}}\theta_{0}.\tag{35}
\]

We consider a similar approach as in [4]. As $N$ is fixed, we write $h(n)=n+N-1$ for $n\in\mathbb{N}_{+}$. Under the assumptions of the theorem we have that $\mathbb{P}$ a.s.

\[
\mathrm{Var}\left(\tilde{l}_{n,N}(\theta)\;\middle|\;S_{(h(n))}\right)\xrightarrow[]{n\to\infty}0,\tag{36}
\]

and

\[
\max_{k=1,\dotsc,p}\sup_{\theta\in\Theta}\left\lvert\frac{\partial}{\partial\theta_{k}}\tilde{l}_{n,N}(\theta)\right\rvert=O_{\mathbb{P}}(1)\text{ as }n\to\infty.\tag{37}
\]

To see this, we remark that $\mathbb{P}$ a.s. (using Proposition 5.1),

\[
\mathrm{Var}\left(\tilde{l}_{n,N}(\theta)\;\middle|\;S_{(h(n))}\right)=\frac{2}{h(n)^{2}}\operatorname{tr}\bigg(\underbrace{\widetilde{\Sigma}_{h(n),\theta}^{-1}\Sigma_{h(n),\theta_{0}}\widetilde{\Sigma}_{h(n),\theta}^{-1}\Sigma_{h(n),\theta_{0}}}_{\eqqcolon\widetilde{A}_{h(n),\theta}}\bigg).
\]

From here, we can use von Neumann's trace inequality to show that $\mathbb{P}$ a.s.

\[
\left\lvert\operatorname{tr}\left(\widetilde{A}_{h(n),\theta}\right)\right\rvert\leq h(n)\left\lVert\widetilde{\Sigma}_{h(n),\theta}^{-1}\right\rVert_{2}^{2}\left\lVert\Sigma_{h(n),\theta_{0}}\right\rVert_{2}^{2}.
\]

Now, by Lemma 4.1 (and Corollary B.4) we can conclude that there exists a real constant $M_{0}>0$ such that for any $n\in\mathbb{N}_{+}$, $\mathbb{P}$ a.s., $\mathrm{Var}\big(\tilde{l}_{n,N}(\theta)\;|\;S_{(h(n))}\big)\leq M_{0}/h(n)$, which proves (36). For (37), we first notice that by Lemma B.5 there exist constants $M_{1},M_{2}>0$ (independent of $n\in\mathbb{N}_{+}$, $s_{(n)}\in\mathcal{G}_{n}$ and $\theta\in\Theta$) such that $\mathbb{P}$ a.s.

\[
\sup_{\theta\in\Theta}\left\lVert\widetilde{\Sigma}_{n,\theta}^{-1}\right\rVert_{2}<M_{1}\quad\text{and}\quad\max_{k=1,\dotsc,p}\sup_{\theta\in\Theta}\left\lVert\frac{\partial}{\partial\theta_{k}}\widetilde{\Sigma}_{n,\theta}\right\rVert_{2}\leq M_{2}.
\]

Using this result, we have that $\mathbb{P}$ a.s.

\[
\begin{split}
\max_{k=1,\dotsc,p}\sup_{\theta\in\Theta}\left\lvert\frac{\partial}{\partial\theta_{k}}\tilde{l}_{n,N}(\theta)\right\rvert&=\max_{k=1,\dotsc,p}\sup_{\theta\in\Theta}\bigg|\frac{1}{h(n)}\operatorname{tr}\left(\widetilde{\Sigma}_{h(n),\theta}^{-1}\frac{\partial}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta}\right)\\
&\quad-\frac{1}{h(n)}\Big\langle Z_{(h(n))},\widetilde{\Sigma}_{h(n),\theta}^{-1}\frac{\partial}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta}\widetilde{\Sigma}_{h(n),\theta}^{-1}Z_{(h(n))}\Big\rangle\bigg|\\
&\leq M_{1}M_{2}+M_{1}^{2}M_{2}\frac{\left\lVert Z_{(h(n))}\right\rVert^{2}}{h(n)}.
\end{split}\tag{38}
\]

Let $V_{h(n)}$ be a Gaussian vector on $(\Omega,\mathcal{F},\mathbb{P})$ with zero mean and $h(n)\times h(n)$ identity covariance matrix. Then (see also Remark 2.1), for any finite $M>0$, the probability

\[
\mathbb{P}\big(M_{1}M_{2}+M_{1}^{2}M_{2}h(n)^{-1}\left\lVert Z_{(h(n))}\right\rVert^{2}>M\mid S_{(h(n))}=s_{(h(n))}\big)
\]

is bounded from above by

\[
\mathbb{P}\big(M_{1}M_{2}\big(1+h(n)^{-1}\lVert V_{h(n)}\rVert^{2}\big)>M\big).
\]

Therefore, $\mathbb{P}$ a.s.,

\[
\mathbb{P}\big(M_{1}M_{2}+M_{1}^{2}M_{2}h(n)^{-1}\left\lVert Z_{(h(n))}\right\rVert^{2}>M\mid S_{(h(n))}\big)
\]

is bounded from above by $\mathbb{P}\big(M_{1}M_{2}\big(1+h(n)^{-1}\lVert V_{h(n)}\rVert^{2}\big)>M\big)$ as well. Since $M_{1}M_{2}\big(1+h(n)^{-1}\lVert V_{h(n)}\rVert^{2}\big)=O_{\mathbb{P}}(1)$ as $n\to\infty$, (37) is shown.

Notice further that $\Theta$ is convex, $\theta\mapsto\tilde{l}_{n,N}(\theta)$ is continuously differentiable and, by (38),

\[
\sup_{n\in\mathbb{N}_{+}}\mathbb{E}\bigg[\max_{k=1,\dotsc,p}\sup_{\theta\in\Theta}\left\lvert\frac{\partial}{\partial\theta_{k}}\tilde{l}_{n,N}(\theta)\right\rvert\bigg]<\infty.
\]

Thus, under application of Corollary 2.2 of [27], together with (36) and (37), we can conclude that

\[
\sup_{\theta\in\Theta}\left\lvert\tilde{l}_{n,N}(\theta)-\mathbb{E}\left[\tilde{l}_{n,N}(\theta)\;\middle|\;S_{(h(n))}\right]\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0.\tag{39}
\]

To continue, we define the sequences of random variables

\[
\begin{split}
\big(D_{h(n),\theta,\theta_{0}}\big)_{n\in\mathbb{N}_{+}}&\coloneqq\big(\mathbb{E}\big[l_{n,N}(\theta)\;|\;S_{(h(n))}\big]-\mathbb{E}\big[l_{n,N}(\theta_{0})\;|\;S_{(h(n))}\big]\big)_{n\in\mathbb{N}_{+}},\\
\big(\widetilde{D}_{h(n),\theta,\theta_{0}}\big)_{n\in\mathbb{N}_{+}}&\coloneqq\big(\mathbb{E}\big[\tilde{l}_{n,N}(\theta)\;|\;S_{(h(n))}\big]-\mathbb{E}\big[\tilde{l}_{n,N}(\theta_{0})\;|\;S_{(h(n))}\big]\big)_{n\in\mathbb{N}_{+}}.
\end{split}
\]

For any $n\in\mathbb{N}_{+}$, we have that $\mathbb{P}$ a.s.,

\[
\begin{split}
D_{h(n),\theta,\theta_{0}}&=\frac{1}{h(n)}\log\left(\det\left(\Sigma_{h(n),\theta}\right)\right)+\frac{1}{h(n)}\operatorname{tr}\left(\Sigma_{h(n),\theta}^{-1}\Sigma_{h(n),\theta_{0}}\right)\\
&\quad-\frac{1}{h(n)}\log\left(\det\left(\Sigma_{h(n),\theta_{0}}\right)\right)-\frac{1}{h(n)}\operatorname{tr}\left(\Sigma_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}}\right).
\end{split}
\]

Similarly, for any $n\in\mathbb{N}_{+}$, we have that $\mathbb{P}$ a.s.

\[
\begin{split}
\widetilde{D}_{h(n),\theta,\theta_{0}}&=\frac{1}{h(n)}\log\left(\det\left(\widetilde{\Sigma}_{h(n),\theta}\right)\right)+\frac{1}{h(n)}\operatorname{tr}\left(\widetilde{\Sigma}_{h(n),\theta}^{-1}\Sigma_{h(n),\theta_{0}}\right)\\
&\quad-\frac{1}{h(n)}\log\left(\det\left(\widetilde{\Sigma}_{h(n),\theta_{0}}\right)\right)-\frac{1}{h(n)}\operatorname{tr}\left(\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}}\right).
\end{split}
\]

Notice that because of (39) we have that

\[
\sup_{\theta\in\Theta}\left\lvert\left(\tilde{l}_{n,N}(\theta)-\tilde{l}_{n,N}(\theta_{0})\right)-\widetilde{D}_{h(n),\theta,\theta_{0}}\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0.
\]

Further, it is shown in [4] (see the proof of Proposition 3.1) that under application of Lemma B.3 there exists some constant $B>0$ (which does not depend on $n\in\mathbb{N}_{+}$) such that $\mathbb{P}$ a.s.

\[
D_{h(n),\theta,\theta_{0}}\geq B\underbrace{\frac{1}{h(n)}\sum_{i,j=1}^{h(n)}\big(c_{\theta}(S_{i}-S_{j})-c_{\theta_{0}}(S_{i}-S_{j})\big)^{2}}_{\eqqcolon D_{2,h(n),\theta,\theta_{0}}}.
\]

Under application of Lemmas B.1, B.2, B.3 and Corollary B.4, it is then shown in the proof of Proposition 3.1 of [4] that either $\tau=0$, in which case $D_{2,h(n),\theta,\theta_{0}}$ is deterministic and

\[
\sup_{\theta\in\Theta}\left\lvert D_{2,h(n),\theta,\theta_{0}}-D_{\infty,\theta,\theta_{0}}\right\rvert\xrightarrow[]{n\to\infty}0,
\]

where the limit is given by $D_{\infty,\theta,\theta_{0}}=\sum_{z\in\mathbb{Z}^{d}}\big(c_{\theta}(z)-c_{\theta_{0}}(z)\big)^{2}$; or $\tau>0$, and it is concluded that

\[
\sup_{\theta\in\Theta}\left\lvert D_{2,h(n),\theta,\theta_{0}}-D_{\infty,\theta,\theta_{0}}\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0,
\]

where in this case

\[
D_{\infty,\theta,\theta_{0}}=\int_{D_{\tau}}\big(c_{\theta}(s)-c_{\theta_{0}}(s)\big)^{2}f(s)\,ds+\big(c_{\theta}(0)-c_{\theta_{0}}(0)\big)^{2}.
\]

Notice that because the $(X_{i})_{i\in\mathbb{N}_{+}}$ are assumed independent with a common law that has a strictly positive probability density function, the function $f$ is strictly positive almost everywhere with respect to the Lebesgue measure on $D_{\tau}$ (see the end of the proof of Proposition 3.1 in [4]). In either case, we can thus conclude that

\[
\sup_{\theta\in\Theta}\left\lvert D_{2,h(n),\theta,\theta_{0}}-D_{\infty,\theta,\theta_{0}}\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0,
\]

where for any $\alpha>0$, because of Assumption 5.1, $\inf_{\theta\colon\lvert\theta-\theta_{0}\rvert\geq\alpha}D_{\infty,\theta,\theta_{0}}>0$, and the limit $D_{\infty,\theta,\theta_{0}}$ is deterministic. We now want to show that there exists some $N_{2}\geq N$ such that for any $n\geq N_{2}$, for any $\theta\in\Theta$, $\mathbb{P}$ a.s.,

\[
\widetilde{D}_{h(n),\theta,\theta_{0}}\geq BD_{2,h(n),\theta,\theta_{0}},\tag{40}
\]

as well. In this case, with $D_{2,h(n),\theta,\theta_{0}}$ a random function on $\Omega$ and $D_{\infty,\theta,\theta_{0}}$ a deterministic function of $\theta\in\Theta$, we would have for any fixed $\tau\geq 0$ and for any given $\alpha>0$,

\[
\begin{split}
&\sup_{\theta\in\Theta}\left\lvert D_{2,h(n),\theta,\theta_{0}}-D_{\infty,\theta,\theta_{0}}\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0,\\
&\inf_{\theta\colon\lvert\theta-\theta_{0}\rvert\geq\alpha}D_{\infty,\theta,\theta_{0}}>D_{\infty,\theta_{0},\theta_{0}}=0,
\end{split}
\]

where the sequence of estimators $\big(\tilde{\theta}_{n,N}\big)_{n\in\mathbb{N}_{+}}$ is such that for $n\geq N_{2}$, $\mathbb{P}$ a.s.,

\[
\begin{split}
D_{2,h(n),\tilde{\theta}_{n,N},\theta_{0}}&=D_{2,h(n),\tilde{\theta}_{n,N},\theta_{0}}-\frac{1}{B}\big(\tilde{l}_{n,N}\big(\tilde{\theta}_{n,N}\big)-\tilde{l}_{n,N}(\theta_{0})\big)+\frac{1}{B}\big(\tilde{l}_{n,N}\big(\tilde{\theta}_{n,N}\big)-\tilde{l}_{n,N}(\theta_{0})\big)\\
&\leq\frac{\widetilde{D}_{h(n),\tilde{\theta}_{n,N},\theta_{0}}}{B}-\frac{1}{B}\big(\tilde{l}_{n,N}\big(\tilde{\theta}_{n,N}\big)-\tilde{l}_{n,N}(\theta_{0})\big)+\frac{1}{B}\big(\tilde{l}_{n,N}\big(\tilde{\theta}_{n,N}\big)-\tilde{l}_{n,N}(\theta_{0})\big)\\
&\leq D_{2,h(n),\theta_{0},\theta_{0}}+\underbrace{\frac{1}{B}\sup_{\theta\in\Theta}\left\lvert\big(\tilde{l}_{n,N}(\theta)-\tilde{l}_{n,N}(\theta_{0})\big)-\widetilde{D}_{h(n),\theta,\theta_{0}}\right\rvert}_{\xrightarrow[n\to\infty]{\mathbb{P}}\;0},
\end{split}
\]

and we can conclude the proof of Theorem 5.2 using Theorem 5.7 of [31]. Hence, it remains to show (40). We write, $\mathbb{P}$ a.s.,

\[
\begin{split}
\left\lvert\widetilde{D}_{h(n),\theta,\theta_{0}}-D_{h(n),\theta,\theta_{0}}\right\rvert&\leq\left\lvert\widetilde{A}_{1,h(n),\theta,\theta_{0}}-A_{1,h(n),\theta,\theta_{0}}\right\rvert\\
&\quad+\left\lvert\widetilde{A}_{2,h(n),\theta,\theta_{0}}-A_{2,h(n),\theta,\theta_{0}}\right\rvert\\
&\quad+\left\lvert\widetilde{A}_{3,h(n),\theta,\theta_{0}}-A_{3,h(n),\theta,\theta_{0}}\right\rvert,
\end{split}
\]

where

\[
\begin{split}
\widetilde{A}_{1,h(n),\theta,\theta_{0}}-A_{1,h(n),\theta,\theta_{0}}&=\frac{1}{h(n)}\log\left(\det\left(\Sigma_{h(n),\theta_{0}}\Sigma_{h(n),\theta}^{-1}\right)\right)\\
&\quad-\frac{1}{h(n)}\log\left(\det\left(\widetilde{\Sigma}_{h(n),\theta_{0}}\widetilde{\Sigma}_{h(n),\theta}^{-1}\right)\right),\\
\widetilde{A}_{2,h(n),\theta,\theta_{0}}-A_{2,h(n),\theta,\theta_{0}}&=\frac{1}{h(n)}\operatorname{tr}\left(\left[\widetilde{\Sigma}_{h(n),\theta}^{-1}-\Sigma_{h(n),\theta}^{-1}\right]\Sigma_{h(n),\theta_{0}}\right),
\end{split}
\]

and

\[
\widetilde{A}_{3,h(n),\theta,\theta_{0}}-A_{3,h(n),\theta,\theta_{0}}=\frac{1}{h(n)}\operatorname{tr}\left(\left[\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}-\Sigma_{h(n),\theta_{0}}^{-1}\right]\Sigma_{h(n),\theta_{0}}\right).
\]

By Lemma 4.1, Corollary B.4 and Lemma B.6, we already conclude that $\mathbb{P}$ a.s.

\[
\left\lvert\widetilde{A}_{1,h(n),\theta,\theta_{0}}-A_{1,h(n),\theta,\theta_{0}}\right\rvert
\]

converges to zero uniformly in $\theta\in\Theta$ as $n\to\infty$. Further, we can conclude that $\mathbb{P}$ a.s.

\[
\left\lvert\widetilde{A}_{2,h(n),\theta,\theta_{0}}-A_{2,h(n),\theta,\theta_{0}}\right\rvert\leq\left\lVert\widetilde{\Sigma}_{h(n),\theta}^{-1}-\Sigma_{h(n),\theta}^{-1}\right\rVert_{2}\left\lVert\Sigma_{h(n),\theta_{0}}\right\rVert_{2},
\]

and thus, since by Lemma 4.1, $\mathbb{P}$ a.s., $\lVert\Sigma_{h(n),\theta_{0}}\rVert_{2}$, $\lVert\widetilde{\Sigma}_{h(n),\theta}\rVert_{2}$ and $\lVert\Sigma_{h(n),\theta}\rVert_{2}$ are finite, uniformly in $n\in\mathbb{N}_{+}$ and $\theta\in\Theta$, and $\mathbb{P}$ a.s.

\[
\left\lVert\widetilde{\Sigma}_{h(n),\theta}^{-1}-\Sigma_{h(n),\theta}^{-1}\right\rVert_{2}\xrightarrow[]{n\to\infty}0,
\]

uniformly in $\theta\in\Theta$, by application of Corollary B.4, we can also see that $\mathbb{P}$ a.s.

\[
\left\lvert\widetilde{A}_{2,h(n),\theta,\theta_{0}}-A_{2,h(n),\theta,\theta_{0}}\right\rvert
\]

converges to zero as $n\to\infty$, uniformly in $\theta\in\Theta$. Using a similar argument we can also show that, $\mathbb{P}$ a.s., the term $\lvert\widetilde{A}_{3,h(n),\theta,\theta_{0}}-A_{3,h(n),\theta,\theta_{0}}\rvert$ converges to zero as $n\to\infty$, uniformly in $\theta\in\Theta$. Hence, we have shown that $\mathbb{P}$ a.s.

\[
\left\lvert\widetilde{D}_{h(n),\theta,\theta_{0}}-D_{h(n),\theta,\theta_{0}}\right\rvert\xrightarrow[]{n\to\infty}0,
\]

uniformly in $\theta\in\Theta$, and we can argue that $\mathbb{P}$ a.s. there exists some $N_{2}\geq N$ such that for all $n\geq N_{2}$, $\widetilde{D}_{h(n),\theta,\theta_{0}}\geq BD_{2,h(n),\theta,\theta_{0}}$ on $\Omega$, which shows (40). Therefore, we have that

\[
\tilde{\theta}_{n,N}\xrightarrow[n\to\infty]{\mathbb{P}}\theta_{0},
\]

which concludes the proof. ∎

Proof of Corollary 5.3.

This follows from Theorem 5.2 when we define, for any $\theta\in\Theta$ and $m\in\mathbb{N}_{+}$, $\tilde{c}_{m,\theta}(s)\coloneqq c_{\theta}(s)$ for all $s\in\mathbb{R}^{d}$. ∎

Proof of Theorem 5.4.

Let $N\in\mathbb{N}_{+}$ be as in Proposition 5.1 and define, for any $\omega\in\Omega$, the sequences of functions $\big(l_{n,N}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}}$ and $\big(\tilde{l}_{n,N}(\theta)(\omega)\big)_{n\in\mathbb{N}_{+}}$ as in (8) and (9) of Section 5.1, respectively. From the proof of Theorem 5.2 we know that the sequence of estimators $\big(\tilde{\theta}_{n,N}\big)_{n\in\mathbb{N}_{+}}\coloneqq\big(\tilde{\theta}_{n+N-1}\big)_{n\in\mathbb{N}_{+}}$ is such that

\[
\tilde{\theta}_{n,N}\xrightarrow[n\to\infty]{\mathbb{P}}\theta_{0}.
\]

Define $\big\{(G_{n,N}(\theta))_{n\in\mathbb{N}_{+}}\colon\theta\in\Theta\big\}$ and $\big\{(\widetilde{G}_{n,N}(\theta))_{n\in\mathbb{N}_{+}}\colon\theta\in\Theta\big\}$ as in (10) and (11), respectively. For $n\in\mathbb{N}_{+}$ we set $h(n)=n+N-1$. We have for $k=1,\dotsc,p$, $\mathbb{P}$ a.s., for $n\in\mathbb{N}_{+}$,

\[
\begin{split}
\tilde{c}_{k,n,N}(\theta_{0})&=\frac{\partial\tilde{l}_{n,N}}{\partial\theta_{k}}(\theta_{0})-\mathbb{E}\left[\frac{\partial\tilde{l}_{n,N}}{\partial\theta_{k}}(\theta_{0})\;\middle|\;S_{(h(n))}\right]\\
&=\frac{1}{h(n)}\operatorname{tr}\big(\underbrace{\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}}}_{\eqqcolon\widetilde{M}_{k,h(n)}}\big)\\
&\quad+\frac{1}{h(n)}\Big\langle Z_{(h(n))},\underbrace{-\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}}_{\eqqcolon\widetilde{N}_{k,h(n)}}Z_{(h(n))}\Big\rangle,
\end{split}
\]

where, by Lemma 4.1 (see also Corollary B.4 and Lemma B.5), $\mathbb{P}$ a.s., for $n\in\mathbb{N}_{+}$, $\lVert\widetilde{M}_{k,h(n)}\rVert_{2}$ and $\lVert\widetilde{N}_{k,h(n)}\rVert_{2}$ are finite, uniformly in $\theta\in\Theta$. Further, notice that $\mathbb{P}$ a.s.

\[
\operatorname{tr}\left(\widetilde{M}_{k,h(n)}+\widetilde{N}_{k,h(n)}\Sigma_{h(n),\theta_{0}}\right)=0\quad\text{for all }k\in\{1,\dotsc,p\}.
\]

From the proof of Lemma B.8, we already know that $\mathbb{P}$ a.s., $H_{h(n)}(\theta_{0})\xrightarrow[]{n\to\infty}\Lambda$, where for any $k,l=1,\dotsc,p$ and any $n\in\mathbb{N}_{+}$,

\[
\left[H_{h(n)}(\theta_{0})\right]_{kl}=\frac{1}{2h(n)}\operatorname{tr}\left(\Sigma_{h(n),\theta_{0}}^{-1}\frac{\partial\Sigma_{h(n),\theta_{0}}}{\partial\theta_{k}}\Sigma_{h(n),\theta_{0}}^{-1}\frac{\partial\Sigma_{h(n),\theta_{0}}}{\partial\theta_{l}}\right).
\]

Now, if we define, on $(\Omega,\mathcal{F},\mathbb{P})$, the sequence of random $p\times p$ matrices

\[
\bigg\{\underbrace{\frac{1}{2h(n)}\left[\operatorname{tr}\left(\widetilde{N}_{k,h(n)}\Sigma_{h(n),\theta_{0}}\widetilde{N}_{l,h(n)}\Sigma_{h(n),\theta_{0}}\right)\right]_{1\leq k,l\leq p}}_{\eqqcolon\widetilde{H}_{h(n)}(\theta_{0})}\colon n\in\mathbb{N}_{+}\bigg\},
\]

we also have that, for any $k,l=1,\dotsc,p$, $\mathbb{P}$ a.s., $\big\lvert\big[\widetilde{H}_{h(n)}(\theta_{0})\big]_{kl}-\Lambda_{kl}\big\rvert\xrightarrow[]{n\to\infty}0$. This follows from the fact that for any $k,l=1,\dotsc,p$, we have $\mathbb{P}$ a.s.

\[
\left\lvert\left[\widetilde{H}_{h(n)}(\theta_{0})\right]_{kl}-\left[H_{h(n)}(\theta_{0})\right]_{kl}\right\rvert\leq\frac{1}{2h(n)}\left\lvert\operatorname{tr}\left(\widetilde{B}_{h(n)}^{kl}-B_{h(n)}^{kl}\right)\right\rvert,
\]

where

\[
\widetilde{B}_{h(n)}^{kl}=\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{k}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\frac{\partial\widetilde{\Sigma}_{h(n),\theta_{0}}}{\partial\theta_{l}}\widetilde{\Sigma}_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}},
\]

and

\[
B_{h(n)}^{kl}=\Sigma_{h(n),\theta_{0}}^{-1}\frac{\partial\Sigma_{h(n),\theta_{0}}}{\partial\theta_{k}}\Sigma_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}}\Sigma_{h(n),\theta_{0}}^{-1}\frac{\partial\Sigma_{h(n),\theta_{0}}}{\partial\theta_{l}}\Sigma_{h(n),\theta_{0}}^{-1}\Sigma_{h(n),\theta_{0}},
\]

and again, under application of the triangle inequality, von Neumann's trace inequality and Lemma 4.1 (see also Corollary B.4 and Lemma B.5), we thus have that $\mathbb{P}$ a.s. $\lVert\widetilde{H}_{h(n)}(\theta_{0})-H_{h(n)}(\theta_{0})\rVert_{2}\xrightarrow[]{n\to\infty}0$. But $\Lambda$ is the $\mathbb{P}$ a.s. limit of $\big\{H_{h(n)}(\theta_{0})\colon n\in\mathbb{N}_{+}\big\}$ and hence we conclude that $\Lambda$ is also the $\mathbb{P}$ a.s. limit of $\big\{\widetilde{H}_{h(n)}(\theta_{0})\colon n\in\mathbb{N}_{+}\big\}$. Then, we can apply Proposition D.9 of [4] to conclude that

\[
h(n)^{1/2}\widetilde{G}_{n,N}(\theta_{0})\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}\left(0,4\Lambda\right).
\]

Notice that because the family $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ satisfies Assumption 3.2, we have that for fixed $\omega\in\Omega$, $\theta\mapsto\widetilde{G}_{n,N}(\omega,\theta)$ is twice differentiable in $\theta$, and we can argue exactly as in the proof of Theorem 5.2 to conclude that the sequence

\[
\left(\sup_{\theta\in\Theta}\max_{1\leq k,l,m\leq p}\left\lvert\frac{\partial}{\partial\theta_{k}}\left(\frac{\partial\widetilde{G}_{m,n,N}}{\partial\theta_{l}}\right)(\theta)\right\rvert\right)_{n\in\mathbb{N}_{+}}
\]

is bounded in $\mathbb{P}$-probability. In addition, by Lemma B.8, we also have that

\[
\left\lVert J_{\widetilde{G}_{n,N}}(\theta_{0})-J_{G_{n,N}}(\theta_{0})\right\rVert_{2}\xrightarrow[n\to\infty]{\mathbb{P}}0.
\]

Finally, the sequence of estimators $\big(\tilde{\theta}_{n,N}\big)_{n\in\mathbb{N}_{+}}$ is consistent and such that

\[
\mathbb{P}\left(\widetilde{G}_{n,N}\big(\tilde{\theta}_{n,N}\big)=0\right)\xrightarrow[]{n\to\infty}1.
\]

Thus we conclude, using, for example, Proposition D.10 in [4], that

\[
h(n)^{1/2}\left(\tilde{\theta}_{n,N}-\theta_{0}\right)\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}).
\]

Since $N$ was fixed, we can conclude that

\[
n^{1/2}\left(\tilde{\theta}_{n}-\theta_{0}\right)\xrightarrow[n\to\infty]{\mathrm{d}}\mathcal{N}(0,\Lambda^{-1}),
\]

as well. ∎

Proof of Corollary 5.5.

The result follows from Theorem 5.4 when we define the family $\big\{(\tilde{c}_{m,\theta})\colon\theta\in\Theta\big\}$ as in the proof of Corollary 5.3. ∎

C.4 Proof of results in Appendix A

Proof of Theorem A.1.

The proof is similar to the proof of Theorem 5.2. ∎

Proof of Theorem A.2.

The proof is similar to the proof of Theorem 5.4. ∎

C.5 Proof of results in Section 6

Since $\nu$ and $\kappa$ are assumed to be known, we put $c_{\nu,\kappa}\coloneqq\operatorname{B}(2\kappa,\nu+1)$. We define the function $f_{\nu,\kappa}(r,u)\coloneqq u(u^{2}-r^{2})^{\kappa-1}(1-u)^{\nu}$, $(r,u)\in[0,1]\times[0,1]$. We recall that for $r=0$,

\[
c_{\nu,\kappa}=\int_{0}^{1}f_{\nu,\kappa}(0,u)\,du.
\]
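As a numerical aside (not part of the proofs below), the generalized Wendland function $\phi_{\nu,\kappa}$ can be evaluated directly from this integral representation; the following Python sketch does so by quadrature, with illustrative parameter values:

```python
from scipy.integrate import quad
from scipy.special import beta

# Sketch (not part of the proof): evaluates the generalized Wendland function
#   phi_{nu,kappa}(r) = c_{nu,kappa}^{-1} int_r^1 u (u^2 - r^2)^(kappa-1) (1-u)^nu du
# by quadrature, with c_{nu,kappa} = B(2*kappa, nu+1); parameter values are
# illustrative only (kappa > 4 and nu >= (d+1)/2 + kappa for d = 1).

def gen_wendland(r, nu, kappa):
    if r >= 1.0:
        return 0.0
    val, _ = quad(lambda u: u * (u**2 - r**2)**(kappa - 1) * (1 - u)**nu, r, 1.0)
    return val / beta(2 * kappa, nu + 1)

nu, kappa = 6.0, 4.5
print(gen_wendland(0.0, nu, kappa))   # 1.0, by the normalization c_{nu,kappa}
print(gen_wendland(0.5, nu, kappa))   # strictly between 0 and 1
```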
Proof of Proposition 6.1.

We have already seen that for any $\theta\in\Theta$, given known $\kappa>0$ and $\nu\geq(d+1)/2+\kappa$, $\phi_{\theta}$ is continuous on $\mathbb{R}_{+}$. Further, for any $\theta\in\Theta$, $\phi_{\theta}$ has compact support $S_{\theta}=[0,\beta]\subset[0,\beta_{\max}]$, and since $\kappa>4$, we also see that for any $\delta>0$ and any $t\in\mathbb{R}_{+}$, $\phi_{\theta}(t+\delta)\leq\phi_{\theta}(t)$, and thus $\phi_{\theta}(t)\leq\phi_{\theta}(0)=\sigma^{2}$, which implies $\lVert\phi_{\theta}\rVert_{\infty}\leq\sigma^{2}_{\max}$. With $C\coloneqq\beta_{\max}$ and $L\coloneqq\sigma^{2}_{\max}$, both constants are independent of $\theta\in\Theta$, and we can conclude that (1) of Assumption A.1 is satisfied with $\mathcal{B}_{\text{C}}(\mathbb{R}_{+};S_{\theta})$ replaced by $\mathcal{C}_{\text{C}}(\mathbb{R}_{+};S_{\theta})$. It is now sufficient to show that for any $\theta\in\Theta$, any $q=1,2,3$ and $i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}$, there exist constants $C_{\theta}(i_{1},\dotsc,i_{q})$, $L_{\theta}(i_{1},\dotsc,i_{q})<\infty$, such that

\[
\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\in\mathcal{C}_{\text{C}}(\mathbb{R}_{+};S_{\theta}(i_{1},\dotsc,i_{q})),\tag{41}
\]

where

\[
\begin{split}
&S_{\theta}(i_{1},\dotsc,i_{q})\subset\left[0,C_{\theta}(i_{1},\dotsc,i_{q})\right]\subset\left[0,C(i_{1},\dotsc,i_{q})\right],\\
&\left\lVert\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\right\rVert_{\infty}\leq L_{\theta}(i_{1},\dotsc,i_{q})\leq L(i_{1},\dotsc,i_{q}),
\end{split}\tag{42}
\]

with $C(i_{1},\dotsc,i_{q})$, $L(i_{1},\dotsc,i_{q})<\infty$ independent of $\theta\in\Theta$. This means that in general we need to check the above condition for $2+2^{2}+2^{3}=14$ partial derivatives. Let us first focus on the partial derivatives with respect to the range parameter $\beta\in[\beta_{\min},\beta_{\max}]$. For $r\in[0,1]$ we write

\[
\phi_{\nu,\kappa}(r)=c_{\nu,\kappa}^{-1}\int_{a(r)}^{b(r)}f_{\nu,\kappa}(r,u)\,du,
\]

where $[0,1]\ni r\mapsto a(r)=r$ and $[0,1]\ni r\mapsto b(r)\equiv 1$ are continuously differentiable on $[0,1]$. To simplify the notation we put $f\coloneqq f_{\nu,\kappa}$. Then, since $f\colon[0,1]\times[0,1]\to\mathbb{R}$ is continuous and for any $u\in[0,1]$, since $\kappa>2$,

\[
\frac{\partial f}{\partial r}(r,u)=-2r(\kappa-1)u\left(u^{2}-r^{2}\right)^{\kappa-2}(1-u)^{\nu}
\]

exists and is continuous on the rectangle $[0,1]\times[0,1]$, we can conclude, using the general Leibniz integral rule, that $[0,1]\ni r\mapsto\frac{d\phi_{\nu,\kappa}}{dr}(r)$ is continuous and given by

\[
\begin{split}
\frac{d\phi_{\nu,\kappa}}{dr}(r)&=\underbrace{f(r,b(r))\frac{db}{dr}(r)}_{=0}-\underbrace{f(r,a(r))\frac{da}{dr}(r)}_{=0}\\
&\quad-2r(\kappa-1)c_{\nu,\kappa}^{-1}\underbrace{\int_{a(r)}^{b(r)}u\left(u^{2}-r^{2}\right)^{\kappa-2}(1-u)^{\nu}\,du}_{=c_{\nu,\kappa-1}\phi_{\nu,\kappa-1}(r)}.
\end{split}
\]

Hence, for $t\in[0,\beta]$,

\[
\frac{\partial\phi_{\theta}}{\partial\beta}(t)=-\sigma^{2}\frac{t}{\beta^{2}}\frac{d\phi_{\nu,\kappa}}{dr}\left(\frac{t}{\beta}\right)=\frac{2t^{2}(\kappa-1)}{\beta^{3}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right)\tag{43}
\]

exists and is continuous as a function of $t$. But clearly, since $\phi_{\nu,\kappa}$ vanishes on $[1,\infty)$, $\frac{\partial\phi_{\theta}}{\partial\beta}(t)$ also exists for $t\in[\beta,\infty)$ and is continuous as a function of $t$. Hence, for any $t\in\mathbb{R}_{+}$, $\frac{\partial\phi_{\theta}}{\partial\beta}(t)$ exists, is given by (43) and is continuous as a function of $t$. Thus, by monotonicity of $\phi_{\nu,\kappa-1}$ we define

\[
L_{\theta}(2)\coloneqq\frac{2\beta^{2}(\kappa-1)}{\beta^{3}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}=\frac{2(\kappa-1)}{\beta}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}\quad\text{and}\quad C_{\theta}(2)\coloneqq\beta,
\]

and have that

\[
\frac{\partial\phi_{\theta}}{\partial\beta}\in\mathcal{C}_{\text{C}}(\mathbb{R}_{+};[0,C_{\theta}(2)]),\quad\left\lVert\frac{\partial\phi_{\theta}}{\partial\beta}\right\rVert_{\infty}\leq L_{\theta}(2).
\]
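Continuing the numerical sketch above (reusing `gen_wendland`, `nu`, `kappa` and `scipy.special.beta` defined there), one can sanity-check the closed form (43) against a central finite difference in $\beta$; `sigma2` and `beta_r` are illustrative values:

```python
# Sketch continuing the one above: checks the closed form (43) for the
# beta-derivative of phi_theta(t) = sigma2 * phi_{nu,kappa}(t / beta_r)
# against a central finite difference; sigma2 and beta_r are illustrative.

sigma2, beta_r = 1.3, 2.0
t = 1.1                                   # a point in (0, beta_r)

closed = (2 * t**2 * (kappa - 1) / beta_r**3
          * beta(2 * (kappa - 1), nu + 1) / beta(2 * kappa, nu + 1)
          * sigma2 * gen_wendland(t / beta_r, nu, kappa - 1))

eps = 1e-6
fd = (sigma2 * gen_wendland(t / (beta_r + eps), nu, kappa)
      - sigma2 * gen_wendland(t / (beta_r - eps), nu, kappa)) / (2 * eps)
print(abs(closed - fd))                   # small; the two expressions agree
```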

Further, we find

\[
\sup_{\theta\in\Theta}L_{\theta}(2)\leq\underbrace{\frac{2(\kappa-1)}{\beta_{\min}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}_{\max}}_{\eqqcolon L(2)}\quad\text{and}\quad\sup_{\theta\in\Theta}C_{\theta}(2)\leq\underbrace{\beta_{\max}}_{\eqqcolon C(2)},
\]

where $L(2)$ and $C(2)$ do not depend on $\theta\in\Theta$. Since we have assumed that $\kappa>4$, we can now repeat the arguments that led to (43) twice more and conclude that for any $t\in\mathbb{R}_{+}$, $\frac{\partial^{2}\phi_{\theta}}{\partial\beta^{2}}(t)$ and $\frac{\partial^{3}\phi_{\theta}}{\partial\beta^{3}}(t)$ exist as well and are given by

\[
\frac{\partial^{2}\phi_{\theta}}{\partial\beta^{2}}(t)=q_{1}(t),\tag{44}
\]

with

\[
\begin{split}
q_{1}(t)&=\frac{4t^{4}(\kappa-1)(\kappa-2)}{\beta^{6}}\frac{c_{\nu,\kappa-2}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-2}\left(\frac{t}{\beta}\right)\\
&\quad-\frac{6t^{2}(\kappa-1)}{\beta^{4}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right),
\end{split}
\]

and $\frac{\partial^{3}\phi_{\theta}}{\partial\beta^{3}}(t)=q_{2}(t)$, with

\[
\begin{split}
q_{2}(t)&=\frac{8t^{6}(\kappa-1)(\kappa-2)(\kappa-3)}{\beta^{9}}\frac{c_{\nu,\kappa-3}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-3}\left(\frac{t}{\beta}\right)\\
&\quad+\frac{24t^{2}(\kappa-1)}{\beta^{5}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right)\\
&\quad-\frac{36t^{4}(\kappa-1)(\kappa-2)}{\beta^{7}}\frac{c_{\nu,\kappa-2}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-2}\left(\frac{t}{\beta}\right),
\end{split}
\]

and are both continuous as functions of $t\in\mathbb{R}_{+}$. Therefore, since $\phi_{\nu,\kappa-1}$, $\phi_{\nu,\kappa-2}$ and $\phi_{\nu,\kappa-3}$ are non-negative and monotonically decreasing, we can define

\[
L_{\theta}(2,2)\coloneqq\frac{4\beta^{4}(\kappa-1)(\kappa-2)}{\beta^{6}}\frac{c_{\nu,\kappa-2}}{c_{\nu,\kappa}}\sigma^{2},
\]

and

\[
L_{\theta}(2,2,2)\coloneqq\frac{8\beta^{6}(\kappa-1)(\kappa-2)(\kappa-3)}{\beta^{9}}\frac{c_{\nu,\kappa-3}}{c_{\nu,\kappa}}\sigma^{2}+\frac{24\beta^{2}(\kappa-1)}{\beta^{5}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2},
\]

as well as $C_{\theta}(2,2)=C_{\theta}(2,2,2)\coloneqq\beta$, and have that

\[
\begin{split}
&\frac{\partial^{2}\phi_{\theta}}{\partial\beta^{2}}\in\mathcal{C}_{\text{C}}(\mathbb{R}_{+};[0,C_{\theta}(2,2)]),\quad\left\lVert\frac{\partial^{2}\phi_{\theta}}{\partial\beta^{2}}\right\rVert_{\infty}\leq L_{\theta}(2,2),\\
&\frac{\partial^{3}\phi_{\theta}}{\partial\beta^{3}}\in\mathcal{C}_{\text{C}}(\mathbb{R}_{+};[0,C_{\theta}(2,2,2)]),\quad\left\lVert\frac{\partial^{3}\phi_{\theta}}{\partial\beta^{3}}\right\rVert_{\infty}\leq L_{\theta}(2,2,2).
\end{split}
\]

Then, we also find

\[
\sup_{\theta\in\Theta}L_{\theta}(2,2)\leq\frac{4(\kappa-1)(\kappa-2)}{\beta_{\min}^{2}}\frac{c_{\nu,\kappa-2}}{c_{\nu,\kappa}}\sigma^{2}_{\max}\eqqcolon L(2,2),
\]

as well as

\[
\sup_{\theta\in\Theta}L_{\theta}(2,2,2)\leq\frac{8(\kappa-1)}{\beta_{\min}^{3}}\frac{\sigma^{2}_{\max}}{c_{\nu,\kappa}}\left((\kappa-2)(\kappa-3)c_{\nu,\kappa-3}+3c_{\nu,\kappa-1}\right)\eqqcolon L(2,2,2),
\]

and $C(2,2)=C(2,2,2)\coloneqq\beta_{\max}=\sup_{\theta\in\Theta}C_{\theta}(2,2)=\sup_{\theta\in\Theta}C_{\theta}(2,2,2)$, where $L(2,2)$, $L(2,2,2)$, $C(2,2)$ and $C(2,2,2)$ do not depend on $\theta\in\Theta$. This shows that the partial derivatives of $\phi_{\theta}$ with respect to the range parameter $\beta$ exist up to order three and are continuous on $\mathbb{R}_{+}$, with uniform bounds that do not depend on $\theta\in\Theta$ and compact supports that are subsets of $[0,\beta_{\max}]$. Let us now focus on the partial derivatives with respect to $\sigma^{2}$. We can readily see that for $t\in\mathbb{R}_{+}$,

\[
\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t)=\phi_{\nu,\kappa}\left(\frac{t}{\beta}\right),\tag{45}
\]

and thus with $S_{\theta}(1)=[0,\beta]$ and $L_{\theta}(1)=1$ we can choose $L(1)=1$ and $C(1)=\beta_{\max}$ such that (41) and (42) are satisfied. Notice that for any $t\in\mathbb{R}_{+}$, both $\frac{\partial^{2}\phi_{\theta}}{\partial(\sigma^{2})^{2}}(t)$ and $\frac{\partial^{3}\phi_{\theta}}{\partial(\sigma^{2})^{3}}(t)$ are zero. Thus, the existence of the desired constants $L_{\theta}(1,1)$, $C_{\theta}(1,1)$ and $L(1,1)$, $C(1,1)$ for $\frac{\partial^{2}\phi_{\theta}}{\partial(\sigma^{2})^{2}}$, and $L_{\theta}(1,1,1)$, $C_{\theta}(1,1,1)$ and $L(1,1,1)$, $C(1,1,1)$ for $\frac{\partial^{3}\phi_{\theta}}{\partial(\sigma^{2})^{3}}$, such that (41) and (42) are satisfied, is clear. Let us now consider the mixed partial derivatives. Using (43) and (45), we have

\[
\frac{\partial^{2}\phi_{\theta}}{\partial\sigma^{2}\partial\beta}(t)=\frac{\partial^{2}\phi_{\theta}}{\partial\beta\partial\sigma^{2}}(t)=\frac{2t^{2}(\kappa-1)}{\beta^{3}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right),
\]

and thus the existence of constants $L_{\theta}(1,2)=L_{\theta}(2,1)$, $C_{\theta}(1,2)=C_{\theta}(2,1)$ and $L(1,2)=L(2,1)$, $C(1,2)=C(2,1)$ for $\frac{\partial^{2}\phi_{\theta}}{\partial\sigma^{2}\partial\beta}$ and $\frac{\partial^{2}\phi_{\theta}}{\partial\beta\partial\sigma^{2}}$, such that (41) and (42) are satisfied, follows with

\[
L_{\theta}(1,2)=\frac{2(\kappa-1)}{\beta}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}},\quad C_{\theta}(1,2)=\beta,
\]

and

\[
L(1,2)=\frac{2(\kappa-1)}{\beta_{\min}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}},\quad C(1,2)=\beta_{\max}.
\]

Using (43), (44) and (45) we further have that

\[
\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\beta^{2}}(t)=\frac{\partial^{3}\phi_{\theta}}{\partial\beta\partial\sigma^{2}\partial\beta}(t)=\frac{\partial^{3}\phi_{\theta}}{\partial\beta^{2}\partial\sigma^{2}}(t)=\frac{1}{\sigma^{2}}q_{1}(t),
\]

and thus

\[
\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\beta^{2}}\in\mathcal{C}_{\text{C}}(\mathbb{R}_{+};S_{\theta}(1,2,2)),\quad\left\lVert\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\beta^{2}}\right\rVert_{\infty}\leq L_{\theta}(1,2,2),
\]

with $S_{\theta}(1,2,2)=S_{\theta}(2,2,1)=S_{\theta}(2,1,2)=[0,\beta]$, and

\[
L_{\theta}(1,2,2)=L_{\theta}(2,2,1)=L_{\theta}(2,1,2)=\frac{4(\kappa-1)(\kappa-2)}{\beta^{2}}\frac{c_{\nu,\kappa-2}}{c_{\nu,\kappa}}.
\]

Further, (42) is satisfied with $C(1,2,2)=C(2,2,1)=C(2,1,2)=\beta_{\max}$ and

\[
L(1,2,2)=L(2,2,1)=L(2,1,2)=\frac{4(\kappa-1)(\kappa-2)}{\beta_{\min}^{2}}\frac{c_{\nu,\kappa-2}}{c_{\nu,\kappa}}.
\]

Finally we can notice that

\[
\frac{\partial^{3}\phi_{\theta}}{\partial\beta\partial(\sigma^{2})^{2}}(t)=\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\beta\partial\sigma^{2}}(t)=\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\sigma^{2}\partial\beta}(t)=0,
\]

and hence we can verify the existence of constants

\[
L_{\theta}(1,1,2)=L_{\theta}(1,2,1)=L_{\theta}(2,1,1),\quad C_{\theta}(1,1,2)=C_{\theta}(1,2,1)=C_{\theta}(2,1,1),
\]

and

\[
L(1,1,2)=L(1,2,1)=L(2,1,1),\quad C(1,1,2)=C(1,2,1)=C(2,1,1),
\]

for $\frac{\partial^{3}\phi_{\theta}}{\partial\beta\partial(\sigma^{2})^{2}}$, $\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\beta\partial\sigma^{2}}$ and $\frac{\partial^{3}\phi_{\theta}}{\partial\sigma^{2}\partial\sigma^{2}\partial\beta}$, such that (41) and (42) are satisfied. Thus, we have shown that for $\kappa>4$, $\{\phi_{\theta}\colon\theta\in\Theta\}$ satisfies (2) of Assumption A.1, where for any $q=1,2,3$ and $i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}$, $\mathcal{B}_{\text{C}}(\mathbb{R}_{+};S_{\theta}(i_{1},\dotsc,i_{q}))$ can be replaced with $\mathcal{C}_{\text{C}}(\mathbb{R}_{+};S_{\theta}(i_{1},\dotsc,i_{q}))$. It now remains to show that (3) of Assumption A.1 is satisfied. We already know, since $\phi_{\theta}\in\Phi_{d}$, that $w_{\theta}$ is continuous and non-negative definite on $\mathbb{R}^{d}$. We write $L_{1}(\mathbb{R}_{+})$ and $L_{1}(\mathbb{R}^{d})$ for the spaces of Lebesgue integrable functions on $\mathbb{R}_{+}$ and $\mathbb{R}^{d}$, respectively. Since $t\mapsto t^{d-1}\phi_{\theta}(t)\in L_{1}(\mathbb{R}_{+})$, we have that $w_{\theta}\in L_{1}(\mathbb{R}^{d})$. Thus we can conclude, using for example Theorems 5.26 and 6.18 in [33], that for any $s\in\mathbb{R}^{d}$, $\widehat{w}_{\theta}(s)=\mathcal{F}_{d}\phi_{\theta}(\lVert s\rVert)>0$, where

\[
\mathcal{F}_{d}\phi_{\theta}(t)=t^{1-(d/2)}\int_{0}^{\infty}\phi_{\theta}(u)u^{d/2}J_{(d/2)-1}(tu)\,du,\quad t\in\mathbb{R}_{+},
\]

with $J_{(d/2)-1}$ the Bessel function of order $(d/2)-1$. This also shows that $s\mapsto\widehat{w}_{\theta}(s)$ is uniformly continuous on $\mathbb{R}^{d}$, a member of $L_{1}(\mathbb{R}^{d})$, and that Fourier inversion holds (see for example Theorem 1.1 and Corollary 1.26 in [29]). It remains to check that $\Theta\times\mathbb{R}^{d}\ni(\theta,s)\mapsto\widehat{w}_{\theta}(s)$ is continuous. In the present case, where $\kappa>0$ and $\nu\geq(d+1)/2+\kappa$, a closed form representation of $\widehat{w}_{\theta}(s)$ has already been established. We can refer to Theorem 2.1 in [11] (see also Theorem 1 in [9] for a nice summary and further results) and write for $s\in\mathbb{R}^{d}\setminus\{0\}$,

\[
\frac{\widehat{w}_{\theta}(s)}{(2\pi)^{d}\sigma^{2}L_{\zeta}\beta^{d}}={{}_{1}}F_{2}\bigg(\frac{d+1}{2}+\kappa;\frac{d+1}{2}+\kappa+\frac{\nu}{2},\frac{d+1}{2}+\kappa+\frac{\nu}{2}+\frac{1}{2};-\frac{(\lVert s\rVert\beta)^{2}}{4}\bigg),
\]

where, with $\zeta\coloneqq(\nu,\kappa,d)$, $L_{\zeta}=K^{\zeta}\Gamma(\kappa)/2^{1-\kappa}\operatorname{B}(2\kappa,\nu+1)$, with

\[
K^{\zeta}=\frac{2^{-\kappa-d+1}\pi^{-\frac{d}{2}}\Gamma(\nu+1)\Gamma(2\kappa+d)}{\Gamma\left(\kappa+\frac{d}{2}\right)\Gamma\left(\nu+2\left(\frac{d+1}{2}+\kappa\right)\right)},
\]

and for any $z\in\mathbb{R}$,

\[
{{}_{1}}F_{2}(a;b,c;z)=\sum_{k=0}^{\infty}\frac{(a)_{k}z^{k}}{(b)_{k}(c)_{k}k!},
\]

a special case of the generalized hypergeometric function ${{}_{1}}F_{2}$ (see also [1]), where for $k\in\mathbb{N}_{+}$, $(q)_{k}=\Gamma(q+k)/\Gamma(q)$ denotes the Pochhammer symbol. Note that for $z\in\mathbb{R}$ with $\lvert z\rvert\geq 1$ ($z\neq 1$), ${{}_{1}}F_{2}(a;b,c;z)$ is defined via its analytic continuation. Since we know that $s\mapsto\widehat{w}_{\theta}(s)$ is continuous on the entire $\mathbb{R}^{d}$ and

\[
z\mapsto{{}_{1}}F_{2}\left(\frac{d+1}{2}+\kappa;\frac{d+1}{2}+\kappa+\frac{\nu}{2},\frac{d+1}{2}+\kappa+\frac{\nu}{2}+\frac{1}{2};z\right)
\]

is continuous at $0$, we can further note that

\[
\begin{split}
\widehat{w}_{\theta}(0)&=(2\pi)^{d}\sigma^{2}L_{\zeta}\beta^{d}\,{{}_{1}}F_{2}\left(\frac{d+1}{2}+\kappa;\frac{d+1}{2}+\kappa+\frac{\nu}{2},\frac{d+1}{2}+\kappa+\frac{\nu}{2}+\frac{1}{2};0\right)\\
&=(2\pi)^{d}\sigma^{2}L_{\zeta}\beta^{d}.
\end{split}
\]

This then shows that $\Theta\times\mathbb{R}^{d}\ni(\theta,s)\mapsto\widehat{w}_{\theta}(s)$ is continuous as a composition of continuous functions, and hence the proposition is proven. ∎
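As a quick numerical illustration of the closed form above (independent of the proof), mpmath's generalized hypergeometric routine can evaluate the ${}_{1}F_{2}$ factor; all parameter values below are illustrative, and the prefactor $(2\pi)^{d}\sigma^{2}L_{\zeta}\beta^{d}$ is omitted:

```python
import mpmath as mp

# Sketch (independent of the proof): evaluates the 1F2 factor of the closed
# form above with mpmath's generalized hypergeometric routine mp.hyper;
# d, nu, kappa, beta_r are illustrative.

d, nu, kappa, beta_r = 1, 6.0, 4.5, 2.0
a = (d + 1) / 2 + kappa
b1 = a + nu / 2
b2 = b1 + 0.5

def f12_factor(s_norm):
    return mp.hyper([a], [b1, b2], -(s_norm * beta_r)**2 / 4)

print(f12_factor(0.0))   # equals 1, consistent with w_hat(0) above
print(f12_factor(3.0))   # strictly positive, as the positivity of w_hat requires
```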

Proof of Proposition 6.2.

We first show that Assumption 5.1 is satisfied. We write $\theta_{1}=(\sigma_{1}^{2},\beta_{1})$ and $\theta_{2}=(\sigma_{2}^{2},\beta_{2})$ and show that $\theta_{1}\neq\theta_{2}$ implies $\phi_{\theta_{1}}(\lVert h\rVert)\neq\phi_{\theta_{2}}(\lVert h\rVert)$ for all $h\in\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\setminus\{0\}$. Suppose first that $\beta_{1}=\beta_{2}$ but $\sigma_{1}^{2}\neq\sigma_{2}^{2}$. Then $\phi_{\theta_{1}}(\lVert h\rVert)\neq\phi_{\theta_{2}}(\lVert h\rVert)$ for all $h\in\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})$, since for any such $h$, $\lVert h\rVert/\beta_{1}=\lVert h\rVert/\beta_{2}<1$ and thus $\phi_{\nu,\kappa}(\lVert h\rVert/\beta_{1})=\phi_{\nu,\kappa}(\lVert h\rVert/\beta_{2})>0$. Suppose now that either $\sigma_{1}^{2}\neq\sigma_{2}^{2}$, with $\sigma_{2}^{2}<\sigma_{1}^{2}$ and $\beta_{1}\neq\beta_{2}$, or $\sigma_{1}^{2}=\sigma_{2}^{2}=\sigma^{2}$ but $\beta_{1}\neq\beta_{2}$. Assume further that $\min\{\beta_{1},\beta_{2}\}=\beta_{2}$. Then, with $\lVert h\rVert/\beta_{1}<\lVert h\rVert/\beta_{2}$, by monotonicity of $r\mapsto\phi_{\nu,\kappa}(r)$, either

\[
\phi_{\theta_{1}}(\lVert h\rVert)-\phi_{\theta_{2}}(\lVert h\rVert)=\sigma_{1}^{2}\left(\phi_{\nu,\kappa}\left(\frac{\lVert h\rVert}{\beta_{1}}\right)-\frac{\sigma_{2}^{2}}{\sigma_{1}^{2}}\phi_{\nu,\kappa}\left(\frac{\lVert h\rVert}{\beta_{2}}\right)\right)>0
\]

for all $h\in\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\setminus\{0\}$, or

\[
\phi_{\theta_{1}}(\lVert h\rVert)-\phi_{\theta_{2}}(\lVert h\rVert)=\sigma^{2}\left(\phi_{\nu,\kappa}\left(\frac{\lVert h\rVert}{\beta_{1}}\right)-\phi_{\nu,\kappa}\left(\frac{\lVert h\rVert}{\beta_{2}}\right)\right)>0
\]

for all $h\in\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\setminus\{0\}$. When $\min\{\beta_{1},\beta_{2}\}=\beta_{1}$, in either of the above cases we have $\phi_{\theta_{1}}(\lVert h\rVert)-\phi_{\theta_{2}}(\lVert h\rVert)<0$ for all $h\in\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\setminus\{0\}$. A similar argument applies to the case where either $\sigma_{1}^{2}\neq\sigma_{2}^{2}$, with $\sigma_{2}^{2}>\sigma_{1}^{2}$ and $\beta_{1}>\beta_{2}$ or $\beta_{1}<\beta_{2}$, or $\sigma_{1}^{2}=\sigma_{2}^{2}=\sigma^{2}$ but $\beta_{1}>\beta_{2}$ or $\beta_{1}<\beta_{2}$. Thus we have shown that $\theta_{1}\neq\theta_{2}$ implies $\phi_{\theta_{1}}(\lVert h\rVert)\neq\phi_{\theta_{2}}(\lVert h\rVert)$ for all $h\in\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\setminus\{0\}$. Then, for $\tau=0$, since $\min\{\beta_{1},\beta_{2}\}>1$, $\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\setminus\{0\}$ at least contains the integer points $z\in\{p\in\mathbb{Z}^{d}\colon\lVert p\rVert=1\}$. Therefore $\theta_{1}\neq\theta_{2}$ implies $\phi_{\theta_{1}}(z)\neq\phi_{\theta_{2}}(z)$ on $\{p\in\mathbb{Z}^{d}\colon\lVert p\rVert=1\}$. If $\tau\in(0,1/2)$, since $\min\{\beta_{1},\beta_{2}\}>0$, $\mathrm{B}(0;\min\{\beta_{1},\beta_{2}\})\cap D_{\tau}$ has non-zero Lebesgue measure. We have thus shown that Assumption 5.1 is satisfied. Let us now show that $\{w_{\theta}\colon\theta\in\Theta\}$ also satisfies Assumption 5.2. To do so, fix some interval $I=(0,b]\subset\mathbb{R}_{+}$, where $1-2\tau<b<\beta$. We will show that for any $\theta\in\Theta$, there exists $t_{0}\in I$ such that

\[
W\left(\frac{\partial\phi_{\theta}}{\partial\sigma^{2}},\frac{\partial\phi_{\theta}}{\partial\beta}\right)(t_{0})=\det\begin{pmatrix}\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t_{0})&\frac{\partial\phi_{\theta}}{\partial\beta}(t_{0})\\[2pt]\frac{d}{dt}\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t_{0})&\frac{d}{dt}\frac{\partial\phi_{\theta}}{\partial\beta}(t_{0})\end{pmatrix}\neq 0,\tag{46}
\]

where $W\big(\frac{\partial\phi_{\theta}}{\partial\sigma^{2}},\frac{\partial\phi_{\theta}}{\partial\beta}\big)(t_{0})$ is the Wronskian of $t\mapsto\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t)$ and $t\mapsto\frac{\partial\phi_{\theta}}{\partial\beta}(t)$ at $t_{0}\in I$. This then shows that the functions $t\mapsto\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t)$ and $t\mapsto\frac{\partial\phi_{\theta}}{\partial\beta}(t)$ are linearly independent on the entire interval $I$; more explicitly, for any $t\in I$,

\[
\alpha_{1}\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t)+\alpha_{2}\frac{\partial\phi_{\theta}}{\partial\beta}(t)=0
\]

will imply that $\alpha_{1}=\alpha_{2}=0$. This then shows that there does not exist $(\alpha_{1},\alpha_{2})\in\mathbb{R}^{2}\setminus\{0\}$ such that, for any $\theta\in\Theta$,

\[
h\mapsto\alpha_{1}\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(\lVert h\rVert)+\alpha_{2}\frac{\partial\phi_{\theta}}{\partial\beta}(\lVert h\rVert)=0
\]

a.e. with respect to the Lebesgue measure on $\mathrm{B}[0;b]\setminus\{0\}$. This justifies, for both cases $\tau=0$ and $\tau>0$, that Assumption 5.2 must also be satisfied. Hence, let us show (46). We can calculate, using arguments from the proof of Proposition 6.1, that for $t\in I$,

\[
\begin{split}
\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t)&=\phi_{\nu,\kappa}\left(\frac{t}{\beta}\right),\\
\frac{\partial\phi_{\theta}}{\partial\beta}(t)&=\frac{2t^{2}(\kappa-1)}{\beta^{3}}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\sigma^{2}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right),\\
\frac{d}{dt}\frac{\partial\phi_{\theta}}{\partial\sigma^{2}}(t)&=-\frac{2t(\kappa-1)}{\beta}\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right),
\end{split}
\]

and

\[
\frac{d}{dt}\frac{\partial\phi_{\theta}}{\partial\beta}(t)=\frac{4t(\kappa-1)}{\beta^{3}}\frac{\sigma^{2}}{c_{\nu,\kappa}}\left(c_{\nu,\kappa-1}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right)-\frac{t^{2}(\kappa-2)}{\beta}c_{\nu,\kappa-2}\phi_{\nu,\kappa-2}\left(\frac{t}{\beta}\right)\right).
\]

Therefore we have that for $t\in I$, $W\big(\frac{\partial\phi_{\theta}}{\partial\sigma^{2}},\frac{\partial\phi_{\theta}}{\partial\beta}\big)(t)$ is given by

\[
\begin{split}
&\phi_{\nu,\kappa}\left(\frac{t}{\beta}\right)\frac{4t(\kappa-1)}{\beta^{3}}\frac{\sigma^{2}}{c_{\nu,\kappa}}\left(c_{\nu,\kappa-1}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right)-\frac{t^{2}(\kappa-2)}{\beta}c_{\nu,\kappa-2}\phi_{\nu,\kappa-2}\left(\frac{t}{\beta}\right)\right)\\
&\quad+\frac{4t^{3}(\kappa-1)^{2}}{\beta^{4}}\sigma^{2}\left(\frac{c_{\nu,\kappa-1}}{c_{\nu,\kappa}}\phi_{\nu,\kappa-1}\left(\frac{t}{\beta}\right)\right)^{2}.
\end{split}
\]

But the latter expression does not vanish on all of I. To see this, assume by contradiction that W\big(\frac{\partial\phi_{\theta}}{\partial\sigma^{2}},\frac{\partial\phi_{\theta}}{\partial\beta}\big)(t)=0 for all t\in I. Standard algebraic manipulations show that this is equivalent to assuming that the function

g(t)\coloneqq\frac{f_{1}(t)}{f_{2}(t)},

with

f_{1}(t)\coloneqq(\kappa-2)c_{\nu,\kappa}\phi_{\nu,\kappa}\bigg(\frac{t}{\beta}\bigg)c_{\nu,\kappa-2}\phi_{\nu,\kappa-2}\bigg(\frac{t}{\beta}\bigg)-(\kappa-1)\bigg(c_{\nu,\kappa-1}\phi_{\nu,\kappa-1}\bigg(\frac{t}{\beta}\bigg)\bigg)^{2}

and

f_{2}(t)\coloneqq c_{\nu,\kappa}c_{\nu,\kappa-1}\phi_{\nu,\kappa}\bigg(\frac{t}{\beta}\bigg)\phi_{\nu,\kappa-1}\bigg(\frac{t}{\beta}\bigg),

is constant and equal to \beta on I. But g cannot be constant on I, and we thus arrive at a contradiction. Hence, there exists t_{0}\in I such that (46) is satisfied, which shows that Assumption 5.2 is satisfied and concludes the proof of Proposition 6.2. ∎
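As a side remark, the Wronskian criterion used above is straightforward to check numerically. The following minimal sketch (in Python) computes W(f,g)(t_{0}) via central finite differences; the two profiles f and g are hypothetical stand-ins, not the actual partial derivatives of the generalized Wendland family, which would require the closed forms of \phi_{\nu,\kappa}.

    import numpy as np

    def wronskian(f, g, t0, h=1e-6):
        """W(f, g)(t0) = f(t0) g'(t0) - f'(t0) g(t0), with the derivatives
        approximated by central differences of step h."""
        df = (f(t0 + h) - f(t0 - h)) / (2.0 * h)
        dg = (g(t0 + h) - g(t0 - h)) / (2.0 * h)
        return f(t0) * dg - df * g(t0)

    # Hypothetical smooth stand-in profiles on I = (0, 1).
    f = lambda t: (1.0 - t) ** 4 * (1.0 + 4.0 * t)  # Wendland-type profile
    g = lambda t: t ** 2 * (1.0 - t) ** 6           # another smooth profile

    print(wronskian(f, g, 0.3))  # nonzero at t0 = 0.3, so f and g are
                                 # linearly independent on (0, 1)

A single point t_{0} with a nonzero Wronskian suffices, which is precisely how (46) is used in the proof.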

Proof of Proposition 6.4.

The goal is to check that \big\{(\mathfrak{T}_{m,\theta})\colon\theta\in\Theta\big\} satisfies Assumption A.2; we then conclude using Propositions 6.1 and 6.2, as well as Theorems A.1 and A.2. We first notice that for any \theta\in\Theta, m\in\mathbb{N}_{+} and any q=1,2,3, i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, \mathfrak{T}_{m,\theta} and \frac{\partial^{q}\mathfrak{T}_{m,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}} are Borel measurable functions on \mathbb{R}_{+}. In addition, for any \theta\in\Theta and m\in\mathbb{N}_{+}, \mathfrak{T}_{m,\theta} has support [0,U_{\theta,m}], with U_{\theta,m}=\min\{C_{m},\beta\}, which satisfies \sup_{m\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}U_{\theta,m}=\beta_{\max}. Further, one can verify that the family \big\{(\mathfrak{T}_{m,\theta})\colon\theta\in\Theta\big\} is uniformly bounded by \sigma^{2}_{\max} on \mathbb{R}_{+} and that it converges uniformly to \phi_{\theta} on \mathbb{R}_{+}, uniformly in \theta\in\Theta, that is, \sup_{\theta\in\Theta}\lVert\mathfrak{T}_{m,\theta}-\phi_{\theta}\rVert_{\infty}\xrightarrow{m\to\infty}0. Thus (1), (2) and (3) of Assumption A.2 are satisfied. To verify the remaining assumptions, we view \mathfrak{T}_{m,\theta}(t) as the result of the truncation operator g\mapsto\mathfrak{T}_{m}(g)=g\mathbbm{1}_{[0,C_{m}]}, evaluated at t; that is, \mathfrak{T}_{m,\theta}(t)=\mathfrak{T}_{m}(\phi_{\theta})(t). Then, we remark that for any q=1,2,3, i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\} and any \theta\in\Theta,

\frac{\partial^{q}\mathfrak{T}_{m,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(t)=\mathfrak{T}_{m}\left(\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\right)(t).

Thus, by Proposition 6.1, (4) and (5) of Assumption A.2 are also satisfied. ∎
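For illustration, the truncation operator \mathfrak{T}_{m} of this proof is easy to realize in code. The following minimal sketch (in Python) applies g\mapsto g\mathbbm{1}_{[0,C_{m}]} to a hypothetical Askey-type profile standing in for \phi_{\theta}; the truncation radii C_{m} below are placeholders, chosen only so that C_{m} eventually exceeds \beta.

    import numpy as np

    def phi(t, sigma2=1.0, beta=0.8, nu=3.0):
        """Hypothetical Askey-type covariance sigma2 * (1 - t/beta)_+^nu."""
        return sigma2 * np.maximum(1.0 - t / beta, 0.0) ** nu

    def truncate(g, C_m):
        """Truncation operator: t -> g(t) * 1_{[0, C_m]}(t)."""
        return lambda t: g(t) * (t <= C_m)

    t = np.linspace(0.0, 1.0, 1001)
    for m in (1, 2, 4, 8):
        C_m = 1.0 - 1.0 / (2.0 * m)  # placeholder radii increasing towards 1
        err = np.max(np.abs(truncate(phi, C_m)(t) - phi(t)))
        print(m, err)  # sup-norm error on the grid; zero once C_m >= beta

Once C_{m}\geq\beta, the truncation no longer removes mass and the error vanishes, matching the support [0,U_{\theta,m}] with U_{\theta,m}=\min\{C_{m},\beta\} identified above.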

Proof of Proposition 6.5.

For any m\in\mathbb{N}_{+} and any \theta\in\Theta, B_{m,\theta}(t;b_{m}) is continuous on [0,M], and it is also continuous as a function of \theta\in\Theta (see also the proof of Proposition 6.1). Further, it converges uniformly to \phi_{\theta} on [0,M], uniformly in \theta\in\Theta. That is,

\sup_{\theta\in\Theta}\sup_{t\in[0,M]}\left\lvert\mathfrak{P}_{m,\theta}(t)-\phi_{\theta}(t)\right\rvert\xrightarrow{m\to\infty}0.

To see this we can rely, for example, on the proof of Theorem 2.3.1 in [24]. There, it is shown that for any t\in[0,M] and \theta\in\Theta, for any \varepsilon>0, there exists \delta(t)>0 such that

\left\lvert B_{m,\theta}(t;b_{m})-\phi_{\theta}(t)\right\rvert\leq\varepsilon+2\sigma^{2}_{\max}\frac{b_{m}t}{m\delta(t)^{2}},

for m large enough. Since [0,M] is compact and \phi_{\theta} is continuous, we can choose a single \delta^{*}>0, independent of t\in[0,M] and \theta\in\Theta, such that the above inequality is satisfied for arbitrary \varepsilon>0, with \delta(t) replaced by \delta^{*}. Then, we conclude by taking the supremum over [0,M] and \Theta on both sides. For any m\in\mathbb{N}_{+} and any \theta\in\Theta, we can write

\left\lvert\mathfrak{P}_{m,\theta}(t)-\phi_{\theta}(t)\right\rvert=\left\lvert\mathfrak{P}_{m,\theta}(t)-\phi_{\theta}(t)\right\rvert\mathbbm{1}_{[0,M]}(t)+\left\lvert\mathfrak{P}_{m,\theta}(t)-\phi_{\theta}(t)\right\rvert\mathbbm{1}_{(M,\infty)}(t).

Notice that because M\geq\beta_{\max}, the latter term is zero, independently of \theta\in\Theta, and thus (\mathfrak{P}_{m,\theta})_{m\in\mathbb{N}_{+}} converges uniformly to \phi_{\theta} on the entire \mathbb{R}_{+}, uniformly in \theta\in\Theta; that is, \sup_{\theta\in\Theta}\lVert\mathfrak{P}_{m,\theta}-\phi_{\theta}\rVert_{\infty}\xrightarrow{m\to\infty}0. Note also that this convergence in the uniform norm implies that the sequence (\mathfrak{P}_{m,\theta})_{m\in\mathbb{N}_{+}} is bounded on \mathbb{R}_{+} for any \theta\in\Theta. Therefore we can use that

\sup_{m\in\mathbb{N}_{+}}\sup_{t\in\mathbb{R}_{+}}\sup_{\theta\in\Theta}\mathfrak{P}_{m,\theta}(t)=\sup_{m\in\mathbb{N}_{+}}\sup_{t\in[0,M]}\sup_{\theta\in\Theta}\mathfrak{P}_{m,\theta}(t),

to find \widetilde{C}\coloneqq M and \widetilde{L}\coloneqq\sup_{m\in\mathbb{N}_{+}}\sup_{t\in[0,M]}\sup_{\theta\in\Theta}\mathfrak{P}_{m,\theta}(t), two constants independent of m\in\mathbb{N}_{+} and \theta\in\Theta (recall that \Theta is compact), such that (2) of Assumption A.2 is satisfied. Clearly, for any \theta\in\Theta and any m\in\mathbb{N}_{+}, the function \mathfrak{P}_{m,\theta}\colon(\mathbb{R}_{+},\mathfrak{B}(\mathbb{R}_{+}))\to(\mathbb{R},\mathfrak{B}(\mathbb{R})) is measurable. In conclusion, we have shown that (1), (2) and (3) of Assumption A.2 are satisfied. In the proof of Proposition 6.1 we have shown that for any \theta\in\Theta and any q=1,2,3, i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, there exist constants C_{\theta}(i_{1},\dotsc,i_{q}), L_{\theta}(i_{1},\dotsc,i_{q})<\infty, such that

\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\in\mathcal{C}_{\text{C}}(\mathbb{R}_{+};[0,C_{\theta}(i_{1},\dotsc,i_{q})]),\quad\left\lVert\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\right\rVert_{\infty}\leq L_{\theta}(i_{1},\dotsc,i_{q}),

where for any q=1,2,3, i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\}, \sup_{\theta\in\Theta}C_{\theta}(i_{1},\dotsc,i_{q})\leq\beta_{\max}. In addition, we notice that for any q=1,2,3, i_{1},\dotsc,i_{q}\in\{1,\dotsc,p\} and any \theta\in\Theta,

\frac{\partial^{q}\mathfrak{P}_{m,\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}(t)=\mathfrak{P}_{m}\left(\frac{\partial^{q}\phi_{\theta}}{\partial\theta_{i_{1}}\cdots\partial\theta_{i_{q}}}\right)(t),

where g\mapsto\mathfrak{P}_{m}(g) is the Bernstein polynomial operator for a function g with support included in [0,M]:

\mathfrak{P}_{m}(g)(t)=\sum_{k=0}^{m}g\bigg(b_{m}\frac{k}{m}\bigg)\binom{m}{k}\bigg(\frac{t}{b_{m}}\bigg)^{k}\bigg(1-\frac{t}{b_{m}}\bigg)^{m-k},

for t\leq M, and zero otherwise. Therefore, the same arguments used to show that (2) and (3) of Assumption A.2 are satisfied also show that (4) and (5) of Assumption A.2 hold. This concludes the proof of Proposition 6.5. ∎
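To make the operator \mathfrak{P}_{m} concrete, the following minimal sketch (in Python) evaluates the Bernstein sum directly from its definition, assuming b_{m}=M=1 and a hypothetical Wendland-type profile g standing in for \phi_{\theta}; it merely illustrates the uniform convergence invoked in the proof.

    import numpy as np
    from scipy.special import comb

    def bernstein(g, m, b_m):
        """P_m(g)(t) = sum_k g(b_m k/m) C(m,k) (t/b_m)^k (1 - t/b_m)^(m-k),
        set to zero for t > b_m."""
        k = np.arange(m + 1)
        gk = g(b_m * k / m)  # g evaluated at the Bernstein nodes
        w = comb(m, k)       # binomial coefficients C(m, k)
        def P(t):
            x = np.atleast_1d(t) / b_m
            vals = (gk * w * x[:, None] ** k * (1.0 - x[:, None]) ** (m - k)).sum(axis=1)
            return np.where(np.atleast_1d(t) <= b_m, vals, 0.0)
        return P

    g = lambda t: np.maximum(1.0 - t, 0.0) ** 4 * (1.0 + 4.0 * t)  # stand-in profile
    t = np.linspace(0.0, 1.0, 501)
    for m in (8, 32, 128):
        err = np.abs(bernstein(g, m, b_m=1.0)(t) - g(t)).max()
        print(m, err)  # the uniform error on [0, 1] decreases as m grows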

Proof of Proposition 6.7.

The proof follows the same reasoning as the proof of Proposition 6.5. ∎

Proof of Proposition 6.9.

Since (\delta(m))_{m\in\mathbb{N}_{+}} does not depend on \theta\in\Theta and t\in\mathbb{R}_{+}, and satisfies \delta(m)\to 0 as m\to\infty, we can see that \big\{(\mathfrak{S}_{m,\theta})\colon\theta\in\Theta\big\} satisfies Assumption A.2. Thus, using Propositions 6.1 and 6.2, together with Theorems A.1 and A.2, the proposition is proven. ∎

C.6 Proof of results in Section 7

Proof of Theorem 7.1.

Given a collection S_{(n)} of S, let (K_{n,\theta})_{i,j}=k_{\theta}(S_{i}-S_{j}), 1\leq i,j\leq n, denote the n\times n covariance matrix based on the family \{k_{\theta}\colon\theta\in\Theta\}. We first note that under the given assumptions on the family \{k_{\theta}\colon\theta\in\Theta\}, we have that

\sup_{n\in\mathbb{N}_{+}}\sup_{\theta\in\Theta}\big\lVert K_{n,\theta}\big\rVert_{2}<\infty\quad\text{and}\quad\inf_{n\in\mathbb{N}_{+}}\inf_{\theta\in\Theta}\lambda_{n}(K_{n,\theta})>0,

with \mathbb{P} probability one. This can be seen from Proposition D.4 and Lemma D.5 in [4]. Using this, the proof of (15) is immediate; it follows from Lemmas 4.1 and B.6.

If we prove

d_{n,\hat{\theta}_{n}(kt)}=\inf_{\theta\in\Theta}d_{n,\theta}+\delta^{\prime}_{n},\quad\text{as }n\to\infty, (47)

where \delta^{\prime}_{n}\xrightarrow[n\to\infty]{\mathbb{P}}0, then (16) follows from (15), and we are done. We note that (47) is established if we prove

\sup_{\theta\in\Theta}\left\lvert l_{n,\text{\tiny t-ML}}(\theta)-\mathbb{E}\left[l_{n,\text{\tiny t-ML}}(\theta)\;|\;S_{(n)}\right]\right\rvert\xrightarrow[n\to\infty]{\mathbb{P}}0, (48)

where

l_{n,\text{\tiny t-ML}}(\theta)\coloneqq\frac{1}{n}\operatorname{log}\left(\operatorname{det}\left(R_{n,\theta}\right)\right)+\frac{1}{n}\big\langle Z_{(n)},R_{n,\theta}^{-1}Z_{(n)}\big\rangle

is the random version of the modified log-likelihood function based on the tapered covariance function. This is seen from the proof of Theorem 3.3 in [5]. But under the given assumptions, the family \{k_{\theta}t_{\beta_{0}}\colon\theta\in\Theta\} satisfies Assumption 3.1 (regarding (2), up to q=1 and the continuity of first-order partial derivatives). Thus (48) can be shown as it was shown in the proof of Theorem 5.2 (see (39)). ∎
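To make l_{n,\text{\tiny t-ML}} concrete, the following minimal sketch (in Python) evaluates \frac{1}{n}\log\det(R_{n,\theta})+\frac{1}{n}\langle Z_{(n)},R_{n,\theta}^{-1}Z_{(n)}\rangle through a Cholesky factorization of the tapered covariance matrix. The exponential covariance k_{\theta}, the Askey-type taper t_{\beta_{0}}, and all parameter values are hypothetical placeholders, not the paper's setup.

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def modified_loglik(R, z):
        """(1/n) log det(R) + (1/n) <z, R^{-1} z>, via a Cholesky factor of R."""
        n = len(z)
        c, low = cho_factor(R)
        logdet = 2.0 * np.sum(np.log(np.diag(c)))
        quad = z @ cho_solve((c, low), z)
        return (logdet + quad) / n

    rng = np.random.default_rng(0)
    s = np.sort(rng.uniform(0.0, 50.0, 200))     # observation points on the line
    d = np.abs(s[:, None] - s[None, :])          # distance matrix
    K = np.exp(-d / 2.0)                         # hypothetical k_theta (exponential)
    taper = np.maximum(1.0 - d / 5.0, 0.0) ** 2  # hypothetical Askey taper, range 5
    R = K * taper                                # tapered covariance R_{n,theta}
    z = np.linalg.cholesky(R + 1e-10 * np.eye(len(s))) @ rng.standard_normal(len(s))
    print(modified_loglik(R, z))

The Schur product of the covariance and taper matrices keeps R_{n,\theta} positive definite while introducing sparsity, which is the computational motivation for covariance tapering.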

Acknowledgments

The authors thank Roman Flury for the many stimulating discussions held during the development of this work. This work was supported by the Swiss National Science Foundation, grant SNSF-175529.

References

  • [1] Milton Abramowitz and Irene A. Stegun, editors. Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards Applied Mathematics Series, No. 55. U. S. Government Printing Office, Washington, D. C., 1965.
  • [2] Ethan Anderes. On the consistent separation of scale and variance for Gaussian random fields. The Annals of Statistics, 38(2):870–893, 2010.
  • [3] Richard Askey. Radial characteristic functions. Technical report, Mathematics Research Center, University of Wisconsin-Madison, Madison, WI, 1973.
  • [4] François Bachoc. Asymptotic analysis of the role of spatial sampling for covariance parameter estimation of Gaussian processes. Journal of Multivariate Analysis, 125:1–35, 2014.
  • [5] François Bachoc. Asymptotic analysis of covariance parameter estimation for Gaussian processes in the misspecified case. Bernoulli, 24(2):1531–1575, 2018.
  • [6] François Bachoc. Asymptotic analysis of maximum likelihood estimation of covariance parameters for Gaussian processes: An introduction with proofs. In Abdelaati Daouia and Anne Ruiz-Gazen, editors, Advances in Contemporary Statistics and Econometrics, pages 283–303. Springer, Cham, 2021.
  • [7] François Bachoc, José Betancourt, Reinhard Furrer, and Thierry Klein. Asymptotic properties of the maximum likelihood and cross validation estimators for transformed Gaussian processes. Electronic Journal of Statistics, 14(1):1962–2008, 2020.
  • [8] François Bachoc and Reinhard Furrer. On the smallest eigenvalues of covariance matrices of multivariate spatial processes. Stat, 5:102–107, 2016.
  • [9] Moreno Bevilacqua, Tarik Faouzi, Reinhard Furrer, and Emilio Porcu. Estimation and prediction using generalized Wendland covariance functions under fixed domain asymptotics. The Annals of Statistics, 47(2):828–856, 2019.
  • [10] Federico Blasi, Christian Caamaño Carrillo, Moreno Bevilacqua, and Reinhard Furrer. A selective view of climatological data and likelihood estimation. Spatial Statistics, 50:Paper No. 100596, 2022.
  • [11] Andrew Chernih and Simon Hubbert. Closed form representations and properties of the generalised Wendland functions. Journal of Approximation Theory, 177:17–33, 2014.
  • [12] I. Chlodovsky. Sur le développement des fonctions définies dans un intervalle infini en séries de polynomes de M. S. Bernstein. Compositio Mathematica, 4:380–393, 1937.
  • [13] Noel A. C. Cressie. Statistics for Spatial Data. John Wiley & Sons, Inc., New York, 1993. Reprint, A Wiley-Interscience Publication.
  • [14] Juan Du, Hao Zhang, and V. S. Mandrekar. Fixed-domain asymptotic properties of tapered maximum likelihood estimators. The Annals of Statistics, 37(6A):3330–3361, 2009.
  • [15] Roman Flury and Reinhard Furrer. Discussion on competition for spatial statistics for large datasets. Journal of Agricultural, Biological, and Environmental Statistics, 26:599–603, 2021.
  • [16] Reinhard Furrer, François Bachoc, and Juan Du. Asymptotic properties of multivariate tapering for estimation and prediction. Journal of Multivariate Analysis, 149:177–191, 2016.
  • [17] Reinhard Furrer, Marc G. Genton, and Douglas Nychka. Covariance tapering for interpolation of large spatial datasets. Journal of Computational and Graphical Statistics, 15(3):502–523, 2006.
  • [18] Gregory Gaspari and Stephen E. Cohn. Construction of correlation functions in two and three dimensions. Quarterly Journal of the Royal Meteorological Society, 125(554):723–757, 1999.
  • [19] Florian Gerber, Kaspar Mösinger, and Reinhard Furrer. Extending R packages to support 64-bit compiled code: An illustration with spam64 and GIMMS NDVI3g data. Computers & Geosciences, 104:109–119, 2017.
  • [20] Tilmann Gneiting. Correlation functions for atmospheric data analysis. Quarterly Journal of the Royal Meteorological Society, 125(559):2449–2464, 1999.
  • [21] Tilmann Gneiting. Compactly supported correlation functions. Journal of Multivariate Analysis, 83(2):493–508, 2002.
  • [22] Matthew J. Heaton, Abhirup Datta, Andrew O. Finley, et al. A case study competition among methods for analyzing large spatial data. Journal of Agricultural, Biological, and Environmental Statistics, 24(3):398–425, 2019.
  • [23] Cari G. Kaufman, Mark J. Schervish, and Douglas W. Nychka. Covariance tapering for likelihood-based estimation in large spatial data sets. Journal of the American Statistical Association, 103(484):1545–1555, 2008.
  • [24] G. G. Lorentz. Bernstein Polynomials. Mathematical Expositions, No. 8. University of Toronto Press, Toronto, 1953.
  • [25] K. V. Mardia and R. J. Marshall. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika, 71(1):135–146, 1984.
  • [26] Bertil Matérn. Spatial Variation: Stochastic Models and their Application to some Problems in Forest Surveys and other Sampling Investigations. Meddelanden Fran Statens Skogsforskningsinstitut, Band 49, Nr. 5, Stockholm, 1960.
  • [27] Whitney K. Newey. Uniform convergence in probability and stochastic equicontinuity. Econometrica, 59(4):1161–1167, 1991.
  • [28] Robert Schaback. The missing Wendland functions. Advances in Computational Mathematics, 34(1):67–81, 2011.
  • [29] Elias M. Stein and Guido Weiss. Introduction to Fourier Analysis on Euclidean Spaces. Princeton Mathematical Series, No. 32. Princeton University Press, Princeton, N.J., 1971.
  • [30] Michael L. Stein. Statistical properties of covariance tapers. Journal of Computational and Graphical Statistics, 22(4):866–885, 2013.
  • [31] A. W. Van der Vaart. Asymptotic Statistics, volume 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 1998.
  • [32] Holger Wendland. Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in Computational Mathematics, 4(4):389–396, 1995.
  • [33] Holger Wendland. Scattered Data Approximation, volume 17 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2005.
  • [34] Zhiliang Ying. Asymptotic properties of a maximum likelihood estimator with data from a Gaussian process. Journal of Multivariate Analysis, 36(2):280–296, 1991.
  • [35] V. P. Zastavnyi. On some properties of the Buhmann functions. Ukrainian Mathematical Journal, 58(8):1045–1067, 2006.
  • [36] Hao Zhang. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. Journal of the American Statistical Association, 99(465):250–261, 2004.
  • [37] Hao Zhang and Dale L. Zimmerman. Towards reconciling two asymptotic frameworks in spatial statistics. Biometrika, 92(4):921–936, 2005.