
Bahadur efficiencies of the Epps–Pulley test for normality

Bruno Ebner and Norbert Henze
Abstract

The test for normality suggested by Epps and Pulley [9] is a serious competitor to tests based on the empirical distribution function. In contrast to the latter procedures, it has been generalized to obtain a genuinely affine invariant and universally consistent test for normality in any dimension. We obtain approximate Bahadur efficiencies for the test of Epps and Pulley, thus complementing recent results of Milošević et al. (see [15]). For certain values of a tuning parameter that is inherent in the Epps–Pulley test, this test outperforms each of its competitors considered in [15], over the whole range of six close alternatives to normality.

In memoriam Yakov Yu. Nikitin (1947–2020)

MSC 2010 subject classifications: Primary 62F05; Secondary 62G20.
Key words: Test for normality; empirical characteristic function; Bahadur efficiency; eigenvalues of integral operators.

1 Introduction

The purpose of this article is to derive Bahadur efficiencies for the test of normality proposed by Epps and Pulley [9], thus complementing recent results of Milošević et al. [15], who confined their study to tests of normality based on the empirical distribution function. To be specific, suppose $X_{1},X_{2},\ldots$ is a sequence of independent and identically distributed (i.i.d.) copies of a random variable $X$ that has an absolutely continuous distribution with respect to Lebesgue measure. To test the hypothesis $H_{0}$ that the distribution of $X$ is some unspecified non-degenerate normal distribution, Epps and Pulley [9] proposed to use the test statistic

T_{n,\beta}=n\int_{-\infty}^{\infty}\Big|\psi_{n}(t)-{\rm e}^{-t^{2}/2}\Big|^{2}\,\varphi_{\beta}(t)\,{\rm d}t.

Here, $\psi_{n}(t)=n^{-1}\sum_{j=1}^{n}\exp\big({\rm i}tY_{n,j}\big)$ is the empirical characteristic function of the so-called scaled residuals $Y_{n,1},\ldots,Y_{n,n}$, where $Y_{n,j}=S_{n}^{-1}(X_{j}-\overline{X}_{n})$, $j=1,\ldots,n$, and $\overline{X}_{n}=n^{-1}\sum_{j=1}^{n}X_{j}$, $S_{n}^{2}=n^{-1}\sum_{j=1}^{n}(X_{j}-\overline{X}_{n})^{2}$ are the sample mean and the sample variance of $X_{1},\ldots,X_{n}$, respectively, and $\beta>0$ is a so-called tuning parameter. Moreover,

\varphi_{\beta}(t)=\frac{1}{\beta\sqrt{2\pi}}\exp\Big(-\frac{t^{2}}{2\beta^{2}}\Big),\quad t\in\mathbb{R},

is the density of the centred normal distribution with variance $\beta^{2}$. A closed-form expression of $T_{n,\beta}$ that is amenable to computational purposes is

T_{n,\beta}=\frac{1}{n}\sum_{j,k=1}^{n}\exp\bigg(-\frac{\beta^{2}}{2}\big(Y_{n,j}-Y_{n,k}\big)^{2}\bigg)-\frac{2}{\sqrt{1+\beta^{2}}}\sum_{j=1}^{n}\exp\bigg(-\frac{\beta^{2}Y_{n,j}^{2}}{2(1+\beta^{2})}\bigg)+\frac{n}{\sqrt{1+2\beta^{2}}}. (1.1)
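For illustration, a short R sketch that evaluates (1.1) directly from the data could read as follows; it is not part of the original paper, and the function name epps_pulley_stat as well as the default value beta = 1 are our own choices.

```r
# Epps-Pulley statistic T_{n,beta}, computed via the closed form (1.1)
# x: numeric data vector, beta: tuning parameter (beta > 0)
epps_pulley_stat <- function(x, beta = 1) {
  n  <- length(x)
  s2 <- mean((x - mean(x))^2)        # sample variance with denominator n
  y  <- (x - mean(x)) / sqrt(s2)     # scaled residuals Y_{n,1}, ..., Y_{n,n}
  d  <- outer(y, y, "-")             # matrix of differences Y_{n,j} - Y_{n,k}
  sum(exp(-beta^2 * d^2 / 2)) / n -
    (2 / sqrt(1 + beta^2)) * sum(exp(-beta^2 * y^2 / (2 * (1 + beta^2)))) +
    n / sqrt(1 + 2 * beta^2)
}

# example: T_{n,1} for a sample of size 100 from the standard normal distribution
set.seed(1)
epps_pulley_stat(rnorm(100), beta = 1)
```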

Epps and Pulley obtained neither the limit null distribution of $T_{n,\beta}$ as $n\to\infty$ nor the consistency of a test for normality that rejects $H_{0}$ for large values of $T_{n,\beta}$. Their procedure, however, turned out to be a serious competitor to the classical tests of Shapiro–Wilk, Shapiro–Francia and Anderson–Darling in simulation studies (see [3]). In the special case $\beta=1$, Baringhaus and Henze [5] generalized the approach of Epps and Pulley to obtain a genuine test of multivariate normality, and they derived the limit null distribution of $T_{n,1}$. Moreover, they proved the consistency of the test of Epps and Pulley against each alternative to normality having a finite second moment. The latter restriction was removed by S. Csörgő [6]. By an approach different from that adopted in [5], Henze and Wagner [12] obtained both the limit null distribution and the limit distribution of $T_{n,\beta}$ under contiguous alternatives to normality. Under fixed alternatives to normality, the limit distribution of $T_{n,\beta}$ is normal, as elaborated in [4] in much greater generality for weighted $L^{2}$-statistics. For more information on $T_{n,\beta}$, especially on the role of the tuning parameter $\beta$, see Section 2.2 of [7].

Notice that $T_{n,\beta}$ is invariant with respect to affine transformations $X_{j}\mapsto aX_{j}+b$, where $a,b\in\mathbb{R}$ and $a\neq 0$. Hence, under $H_{0}$, both the finite-sample and the asymptotic distribution of $T_{n,\beta}$ do not depend on the parameters $\mu$ and $\sigma^{2}$ of the underlying normal distribution ${\rm N}(\mu,\sigma^{2})$. Under $H_{0}$, we will thus assume $\mu=0$ and $\sigma^{2}=1$. The rest of the paper unfolds as follows: In Section 2, we revisit the notion of approximate Bahadur efficiency. Sections 3 and 4 deal with stochastic limits and local Bahadur slopes, and Section 5 tackles an eigenvalue problem connected with the limit null distribution of the test statistic. The final Section 6 contains results regarding local approximate Bahadur efficiencies of the Epps–Pulley test for the six close alternatives considered in [15] and a wide spectrum of values of the tuning parameter $\beta$.

2 Approximate Bahadur efficiency

There are several options to compare different tests for the same testing problem as the sample size $n$ tends to infinity; see [16]. One of these options is the asymptotic efficiency due to Bahadur (see [1]). This notion of asymptotic efficiency requires knowledge of the large deviation function of the test statistic. Apart from the notable exception given in [18], such knowledge is hitherto not available for statistics that contain estimated parameters, like $T_{n,\beta}$ given in (1.1). To circumvent this drawback, one usually employs the so-called approximate Bahadur efficiency, which only requires results on the tail behavior of the limit distribution of the test statistic under the null hypothesis. To be more specific with respect to the title of this paper, let $X,X_{1},X_{2},\ldots$ be a sequence of i.i.d. random variables, where the distribution of $X$ depends on a real-valued parameter $\vartheta\in\Theta$, where $\Theta$ denotes the parameter space, and only the case $\vartheta=0$ corresponds to the case that the distribution of $X$ is standard normal. Suppose $(S_{n})_{n\geq 1}$, where $S_{n}=S_{n}(X_{1},\ldots,X_{n})$, is a sequence of statistics for testing the hypothesis $H_{0}:\vartheta=0$ against the alternative $H_{1}:\vartheta\in\Theta\setminus\{0\}$. Furthermore, suppose that rejection of $H_{0}$ is for large values of $S_{n}$. The sequence $(S_{n})$ is called a standard sequence if the following conditions hold (see, e.g., [16], p. 10, or [8], p. 3427):

  • There is a continuous distribution function $G$ such that, for $\vartheta=0$,

    \lim_{n\to\infty}\mathbb{P}_{0}(S_{n}\leq x)=G(x),\quad x\in\mathbb{R}. (2.1)
  • There is a constant $a_{S}$, $0<a_{S}<\infty$, such that

    \log(1-G(x))=-\frac{a_{S}x^{2}}{2}(1+o(1))\ \text{as}\ x\to\infty. (2.2)
  • There is a real-valued function $b_{S}(\vartheta)$ on $\Theta\setminus\{0\}$, with $0<b_{S}(\vartheta)<\infty$, such that, for each $\vartheta\in\Theta\setminus\{0\}$,

    \frac{S_{n}}{\sqrt{n}}\stackrel{\mathbb{P}_{\vartheta}}{\longrightarrow}b_{S}(\vartheta). (2.3)

Then the so-called approximate Bahadur slope

c_{S}^{*}(\vartheta)=a_{S}\cdot b_{S}^{2}(\vartheta),\quad\vartheta\in\Theta\setminus\{0\},

is a measure of approximate Bahadur efficiency. Usually, it is true that $c_{S}^{*}(\vartheta)\sim\ell(S)\cdot\vartheta^{2}$ as $\vartheta\to 0$. In this case, $\ell(S)$ is called the local (approximate) index of the sequence $(S_{n})$. We will see that the sequence $(S_{n})$, where $S_{n}:=\sqrt{T_{n,\beta}}$, is a standard sequence. To this end, we will derive the stochastic limit of $T_{n,\beta}/n$ for a general alternative in Section 3. In Section 4, we will specialize this stochastic limit to local alternatives, and we will derive the local index of the Epps–Pulley test statistic.

3 Stochastic limit of $T_{n,\beta}/n$

To calculate the asymptotic Bahadur efficiency of the test of Epps and Pulley, we need the following result.

Theorem 3.1.

Suppose that $\mathbb{E}(X^{2})<\infty$. Then

\frac{T_{n,\beta}}{n}\stackrel{\mathbb{P}}{\longrightarrow}\mathbb{E}\bigg[\exp\left(-\frac{\beta^{2}(Y_{1}-Y_{2})^{2}}{2}\right)\bigg]-\frac{2}{\sqrt{1+\beta^{2}}}\,\mathbb{E}\bigg[\exp\left(-\frac{\beta^{2}Y_{1}^{2}}{2(1+\beta^{2})}\right)\bigg]+\frac{1}{\sqrt{1+2\beta^{2}}}.

Here, $\stackrel{\mathbb{P}}{\longrightarrow}$ denotes convergence in probability, and $Y_{j}=(X_{j}-\mu)/\sigma$, $j\geq 1$, where $\mu=\mathbb{E}(X)$ and $\sigma^{2}=\mathbb{V}(X)$.

Proof. From (1.1), we have

\frac{T_{n,\beta}}{n} = \frac{1}{n^{2}}\sum_{j,k=1}^{n}\exp\bigg(-\frac{\beta^{2}}{2}\left(\frac{X_{j}-X_{k}}{S_{n}}\right)^{2}\bigg)-\frac{2}{\sqrt{1+\beta^{2}}}\cdot\frac{1}{n}\sum_{j=1}^{n}\exp\bigg(-\frac{\beta^{2}}{2(1+\beta^{2})}\left(\frac{X_{j}-\overline{X}_{n}}{S_{n}}\right)^{2}\bigg)+\frac{1}{\sqrt{1+2\beta^{2}}}
=: A_{n,1}-\frac{2}{\sqrt{1+\beta^{2}}}\cdot A_{n,2}+\frac{1}{\sqrt{1+2\beta^{2}}}

(say). By symmetry, it follows that

\mathbb{E}\big(A_{n,1}\big) = \frac{1}{n}+\frac{n-1}{n}\cdot\mathbb{E}\bigg[\exp\bigg(-\frac{\beta^{2}}{2}\left(\frac{X_{1}-X_{2}}{S_{n}}\right)^{2}\bigg)\bigg],
\mathbb{E}\big(A_{n,2}\big) = \mathbb{E}\bigg[\exp\bigg(-\frac{\beta^{2}}{2(1+\beta^{2})}\left(\frac{X_{1}-\overline{X}_{n}}{S_{n}}\right)^{2}\bigg)\bigg].

Since $\overline{X}_{n}\rightarrow\mu$ and $S_{n}\rightarrow\sigma$ almost surely as $n\to\infty$ by the strong law of large numbers, it follows from Lebesgue’s dominated convergence theorem that

\lim_{n\to\infty}\mathbb{E}\big(A_{n,1}\big)=\mathbb{E}\bigg[\exp\left(-\frac{\beta^{2}(Y_{1}-Y_{2})^{2}}{2}\right)\bigg],\quad\lim_{n\to\infty}\mathbb{E}\big(A_{n,2}\big)=\mathbb{E}\bigg[\exp\left(-\frac{\beta^{2}Y_{1}^{2}}{2(1+\beta^{2})}\right)\bigg],

and thus the expectation of $T_{n,\beta}/n$ converges to the stochastic limit figuring in Theorem 3.1. Likewise, the variance of $T_{n,\beta}/n$ converges to zero, and the assertion follows from Chebyshev’s inequality.
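As a quick numerical plausibility check (our own illustration, not part of the paper, reusing the epps_pulley_stat sketch from Section 1), one may compare $T_{n,\beta}/n$ for a large simulated sample from a fixed alternative with a Monte Carlo approximation of the limit in Theorem 3.1; here a standard exponential alternative is used.

```r
# Illustration of Theorem 3.1 under an Exp(1) alternative (mu = sigma = 1)
set.seed(2)
beta <- 1
x    <- rexp(2000)                              # sample from the alternative
lhs  <- epps_pulley_stat(x, beta) / length(x)   # T_{n,beta}/n

# Monte Carlo approximation of the stochastic limit, with Y = (X - mu)/sigma
y1  <- rexp(1e6) - 1
y2  <- rexp(1e6) - 1
rhs <- mean(exp(-beta^2 * (y1 - y2)^2 / 2)) -
  (2 / sqrt(1 + beta^2)) * mean(exp(-beta^2 * y1^2 / (2 * (1 + beta^2)))) +
  1 / sqrt(1 + 2 * beta^2)

c(statistic_over_n = lhs, stochastic_limit = rhs)   # the two values should roughly agree
```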


4 Local Bahadur slopes

As was done in Milošević et al. [15], we now assume that ${\cal G}=\{G(x;\vartheta)\}$ is a family of distribution functions (DFs) with densities $g(x;\vartheta)$, such that $\vartheta=0$ corresponds to the standard normal DF $\Phi$ and density $\varphi$, and each of the distributions for $\vartheta\neq 0$ is non-normal. Moreover, we assume that the regularity assumptions WD in [17] are satisfied. If $X,X_{1},X_{2},\ldots$ are i.i.d. random variables with DF $G(\cdot;\vartheta)$, we have to consider the stochastic limit figuring in Theorem 3.1 as a function of $\vartheta$ and expand this function at $\vartheta=0$. To this end, let

\gamma=\frac{\beta^{2}}{2},\qquad\delta=\frac{\beta^{2}}{2(1+\beta^{2})}. (4.1)

Then, putting

\mu(\vartheta) = \int xg(x;\vartheta)\,\text{d}x,
\sigma^{2}(\vartheta) = \int x^{2}g(x;\vartheta)\,\text{d}x-\mu^{2}(\vartheta),

Theorem 3.1 yields

\frac{T_{n,\beta}}{n}\stackrel{\mathbb{P}_{\vartheta}}{\longrightarrow}b_{T_{\beta}}(\vartheta),

where $\stackrel{\mathbb{P}_{\vartheta}}{\longrightarrow}$ denotes convergence in probability under the true parameter $\vartheta$, and

b_{T_{\beta}}(\vartheta) = \iint\exp\left(-\frac{\gamma(x-y)^{2}}{\sigma^{2}(\vartheta)}\right)g(x;\vartheta)g(y;\vartheta)\,\text{d}x\,\text{d}y-\frac{2}{\sqrt{1+\beta^{2}}}\int\exp\left(-\frac{\delta(x-\mu(\vartheta))^{2}}{\sigma^{2}(\vartheta)}\right)g(x;\vartheta)\,\text{d}x+\frac{1}{\sqrt{1+2\beta^{2}}}. (4.3)

Here and in what follows, each unspecified integral is over \mathbb{R}.

Notice that $b_{T_{\beta}}(0)=0$. We have to find the quadratic (first non-vanishing) term in the Taylor expansion of $b_{T_{\beta}}$ around zero, i.e., we look for some (local index) $\Delta_{\beta}>0$ such that

b_{T_{\beta}}(\vartheta)=\Delta_{\beta}\,\vartheta^{2}+o(\vartheta^{2})\quad\text{as}\ \vartheta\to 0.

Writing $g^{\prime}_{\vartheta}(x;\vartheta)$, $g^{\prime\prime}_{\vartheta}(x;\vartheta)$ for the derivatives of $g(x;\vartheta)$ with respect to $\vartheta$, we have

g(x;\vartheta)=\varphi(x)+\vartheta\cdot g^{\prime}_{\vartheta}(x;0)+\frac{\vartheta^{2}}{2}g^{\prime\prime}_{\vartheta}(x;0)+O(\vartheta^{3})

and thus, since $\mu(0)=0$ and $\sigma^{2}(0)=1$,

\mu(\vartheta) = \vartheta\int xg^{\prime}_{\vartheta}(x;0)\,\text{d}x+\frac{\vartheta^{2}}{2}\int xg^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x+O(\vartheta^{3}),
\sigma^{2}(\vartheta) = 1+\vartheta\int x^{2}g^{\prime}_{\vartheta}(x;0)\,\text{d}x+\frac{\vartheta^{2}}{2}\int x^{2}g^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x-\mu(\vartheta)^{2}+O(\vartheta^{3}).

Consequently, putting

\mu_{1}:=\mu^{\prime}(0),\quad\mu_{2}:=\mu^{\prime\prime}(0),\quad\sigma_{1}:=(\sigma^{2})^{\prime}(0),\quad\sigma_{2}:=(\sigma^{2})^{\prime\prime}(0) (4.4)

for the sake of brevity, it follows that

\mu_{1} = \int xg^{\prime}_{\vartheta}(x;0)\,\text{d}x,\quad\mu_{2}=\int xg^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x,
\sigma_{1} = \int x^{2}g^{\prime}_{\vartheta}(x;0)\,\text{d}x,\quad\sigma_{2}=\int x^{2}g^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x-2\mu^{\prime}(0)^{2}.

To tackle the double integral that figures in (4.3), notice that

g(x;\vartheta)g(y;\vartheta) = \varphi(x)\varphi(y)+\vartheta\big[g^{\prime}_{\vartheta}(x;0)\varphi(y)+g^{\prime}_{\vartheta}(y;0)\varphi(x)\big]+\vartheta^{2}\Big[\frac{1}{2}g^{\prime\prime}_{\vartheta}(x;0)\varphi(y)+\frac{1}{2}g^{\prime\prime}_{\vartheta}(y;0)\varphi(x)+g^{\prime}_{\vartheta}(x;0)g^{\prime}_{\vartheta}(y;0)\Big]+O(\vartheta^{3}).

Moreover, it follows from a geometric series expansion that

\frac{1}{\sigma^{2}(\vartheta)}=1-\vartheta\sigma_{1}+\vartheta^{2}\Big[\sigma_{1}^{2}-\frac{\sigma_{2}}{2}\Big]+O(\vartheta^{3}) (4.5)

(say). From an expansion of the exponential function, we thus obtain

\exp\left(-\frac{\gamma(x-y)^{2}}{\sigma^{2}(\vartheta)}\right) = {\rm e}^{-\gamma(x-y)^{2}}\Big[1+\vartheta\sigma_{1}\gamma(x-y)^{2}+\vartheta^{2}\Big\{\frac{1}{2}\,\sigma_{1}^{2}\gamma^{2}(x-y)^{4}-\Big(\sigma_{1}^{2}-\frac{\sigma_{2}}{2}\Big)\gamma(x-y)^{2}\Big\}\Big]+O(\vartheta^{3}).

Using

\iint{\rm e}^{-\gamma(x-y)^{2}}(x-y)^{2k}\varphi(x)\varphi(y)\,\text{d}x\,\text{d}y=\frac{4^{k}\Gamma(k+1/2)}{\sqrt{\pi}(4\gamma+1)^{k+1/2}},\quad k=0,1,2,
\int{\rm e}^{-\gamma(x-y)^{2}}\varphi(x)\,\text{d}x=\frac{1}{\sqrt{1+\beta^{2}}}\cdot{\rm e}^{-\delta y^{2}},
\int{\rm e}^{-\gamma(x-y)^{2}}(x-y)^{2}\varphi(y)\,\text{d}y={\rm e}^{-\delta x^{2}}\cdot\frac{x^{2}+\beta^{2}+1}{(1+\beta^{2})^{5/2}},

and putting

D_{0}=\iint{\rm e}^{-\gamma(x-y)^{2}}g^{\prime}_{\vartheta}(x;0)g^{\prime}_{\vartheta}(y;0)\,\text{d}x\,\text{d}y, (4.6)
J_{1,k}=\int{\rm e}^{-\delta x^{2}}x^{k}g^{\prime}_{\vartheta}(x;0)\,\text{d}x,\quad k=0,1,2;\qquad J_{2}=\int{\rm e}^{-\delta x^{2}}g^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x, (4.7)

some algebra gives

\iint\exp\left(-\frac{\gamma(x-y)^{2}}{\sigma^{2}(\vartheta)}\right)g(x;\vartheta)g(y;\vartheta)\,\text{d}x\,\text{d}y
= \frac{1}{\sqrt{1+2\beta^{2}}}+\vartheta\bigg\{\frac{2J_{1,0}}{\sqrt{1+\beta^{2}}}+\frac{2\sigma_{1}\gamma}{(1+2\beta^{2})^{3/2}}\bigg\}
+ \vartheta^{2}\bigg\{\frac{J_{2}}{\sqrt{1+\beta^{2}}}+D_{0}+\frac{2\sigma_{1}\gamma\big(J_{1,2}+(\beta^{2}+1)J_{1,0}\big)}{(1+\beta^{2})^{5/2}}+\frac{\beta^{2}\big((2\beta^{2}+1)\sigma_{2}-(\beta^{2}+2)\sigma_{1}^{2}\big)}{2(1+2\beta^{2})^{5/2}}\bigg\}.

As for the remaining integral figuring in (4.3), we again use (4.5). Neglecting terms that are of order $O(\vartheta^{3})$, straightforward but tedious calculations give

\exp\left(-\frac{\delta(x-\mu(\vartheta))^{2}}{\sigma^{2}(\vartheta)}\right)={\rm e}^{-\delta x^{2}}\cdot\bigg\{1+\delta\vartheta U(x)+\frac{\delta\vartheta^{2}}{2}V(x)\bigg\}+O(\vartheta^{3}),

where – recalling (4.4) –

U(x) = \sigma_{1}x^{2}+2\mu_{1}x,
V(x) = \delta\sigma_{1}^{2}x^{4}+4\delta\mu_{1}\sigma_{1}x^{3}+\big(4\delta\mu_{1}^{2}-2\sigma_{1}^{2}+\sigma_{2}\big)x^{2}-(4\mu_{1}\sigma_{1}-2\mu_{2})x-2\mu_{1}^{2}.

Thus,

\int\exp\left(-\frac{\delta(x-\mu(\vartheta))^{2}}{\sigma^{2}(\vartheta)}\right)g(x;\vartheta)\,\text{d}x
= \int{\rm e}^{-\delta x^{2}}\Big(1+\delta\vartheta U(x)+\frac{\delta\vartheta^{2}}{2}V(x)\Big)\varphi(x)\,\text{d}x
+ \vartheta\int{\rm e}^{-\delta x^{2}}\Big(1+\delta\vartheta U(x)+\frac{\delta\vartheta^{2}}{2}V(x)\Big)g^{\prime}_{\vartheta}(x;0)\,\text{d}x
+ \frac{\vartheta^{2}}{2}\int{\rm e}^{-\delta x^{2}}\Big(1+\delta\vartheta U(x)+\frac{\delta\vartheta^{2}}{2}V(x)\Big)g^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x+O(\vartheta^{3})
= I_{1}(\vartheta)+\vartheta I_{2}(\vartheta)+\frac{\vartheta^{2}}{2}I_{3}(\vartheta)+O(\vartheta^{3})

(say). We have

I_{1}(\vartheta) = \frac{1}{(1+2\delta)^{1/2}}+\vartheta\cdot\frac{\delta\sigma_{1}}{(1+2\delta)^{3/2}}+\vartheta^{2}\bigg[\frac{3\delta^{2}\sigma_{1}^{2}}{2(1+2\delta)^{5/2}}+\frac{\delta(4\delta\mu_{1}^{2}-2\sigma_{1}^{2}+\sigma_{2})}{2(1+2\delta)^{3/2}}-\frac{\delta\mu_{1}^{2}}{(1+2\delta)^{1/2}}\bigg].

Furthermore,

I_{2}(\vartheta) = \int{\rm e}^{-\delta x^{2}}g^{\prime}_{\vartheta}(x;0)\,\text{d}x+\delta\vartheta\int{\rm e}^{-\delta x^{2}}\big(\sigma_{1}^{2}x^{2}+2\mu_{1}x\big)g^{\prime}_{\vartheta}(x;0)\,\text{d}x+O(\vartheta^{2}),
I_{3}(\vartheta) = \int{\rm e}^{-\delta x^{2}}g^{\prime\prime}_{\vartheta}(x;0)\,\text{d}x+O(\vartheta).

Recalling (4.7), we thus obtain – apart from a term which is $O(\vartheta^{3})$ –

\int\exp\left(-\frac{\delta(x-\mu(\vartheta))^{2}}{\sigma^{2}(\vartheta)}\right)g(x;\vartheta)\,\text{d}x
= \frac{1}{(1+2\delta)^{1/2}}+\vartheta\bigg[\frac{\delta\sigma_{1}}{(1+2\delta)^{3/2}}+J_{1,0}\bigg]
+ \vartheta^{2}\bigg[\frac{J_{2}}{2}+\delta\sigma_{1}^{2}J_{1,2}+2\delta\mu_{1}J_{1,1}-\frac{\delta\left(\left(\delta+\frac{1}{2}\right)(\sigma_{2}-2\mu_{1}^{2})-\left(\frac{\delta}{2}+1\right)\sigma_{1}^{2}\right)}{(2\delta+1)^{5/2}}\bigg].

Upon combining the two expansions above and recalling (4.1), $b_{T_{\beta}}(\vartheta)$ figuring in (4.3) takes the form

b_{T_{\beta}}(\vartheta)=\Delta_{\beta}\vartheta^{2}+O(\vartheta^{3})\quad\text{as}\ \vartheta\to 0,

where

\Delta_{\beta} = D_{0}+\frac{\beta^{2}}{(\beta^{2}+1)^{5/2}}\Big(\big((J_{1,0}-J_{1,2})\sigma_{1}-2J_{1,1}\mu_{1}\big)\beta^{2}+J_{1,0}\sigma_{1}-2J_{1,1}\mu_{1}\Big)+\frac{\beta^{2}}{(2\beta^{2}+1)^{5/2}}\Big(\Big(2\mu_{1}^{2}+\frac{3}{4}\sigma_{1}^{2}\Big)\beta^{2}+\mu_{1}^{2}\Big),

and $D_{0}$ and $J_{1,0},J_{1,1},J_{1,2}$ are defined in (4.6) and (4.7), respectively.
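To make the formula concrete, the following R sketch (our own illustration, not taken from the paper) evaluates $\mu_{1}$, $\sigma_{1}$, $J_{1,0},J_{1,1},J_{1,2}$, $D_{0}$ and hence $\Delta_{\beta}$ by numerical integration for the Lehmann alternative $g_{1}$ of Section 6; differentiating $g_{1}(x;\vartheta)=(1+\vartheta)\Phi^{\vartheta}(x)\varphi(x)$ at $\vartheta=0$ gives $g^{\prime}_{\vartheta}(x;0)=\varphi(x)\big(1+\log\Phi(x)\big)$, which is used below.

```r
# Local index Delta_beta for the Lehmann alternative, via numerical integration
beta <- 1
gam  <- beta^2 / 2                      # gamma from (4.1)
del  <- beta^2 / (2 * (1 + beta^2))     # delta from (4.1)
gp   <- function(x) dnorm(x) * (1 + pnorm(x, log.p = TRUE))   # g'_theta(x; 0)

mu1 <- integrate(function(x) x   * gp(x), -Inf, Inf)$value    # mu_1
s1  <- integrate(function(x) x^2 * gp(x), -Inf, Inf)$value    # sigma_1
J   <- sapply(0:2, function(k)                                # J_{1,0}, J_{1,1}, J_{1,2}
  integrate(function(x) exp(-del * x^2) * x^k * gp(x), -Inf, Inf)$value)
D0  <- integrate(Vectorize(function(y)                        # D_0 from (4.6)
  gp(y) * integrate(function(x) exp(-gam * (x - y)^2) * gp(x), -Inf, Inf)$value),
  -Inf, Inf)$value

Delta_beta <- D0 +
  beta^2 / (beta^2 + 1)^(5/2) *
    (((J[1] - J[3]) * s1 - 2 * J[2] * mu1) * beta^2 + J[1] * s1 - 2 * J[2] * mu1) +
  beta^2 / (2 * beta^2 + 1)^(5/2) * ((2 * mu1^2 + 0.75 * s1^2) * beta^2 + mu1^2)
Delta_beta
```

Dividing $\Delta_{\beta}$ by the largest eigenvalue $\lambda_{1}(\beta)$ obtained in Section 5 then yields the local approximate Bahadur slope; cf. (5.2).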

5 Approximations to solutions of the eigenvalue problem

We now turn to the conditions (2.1) and (2.2). The limit null distribution of $T_{n,\beta}$, as $n\to\infty$, is given by the distribution of

T_{\beta}:=\int_{-\infty}^{\infty}Z^{2}(t)\,\varphi_{\beta}(t)\,{\rm d}t.

Here, $Z$ is a centred Gaussian random element of the Fréchet space of continuous real-valued functions having covariance kernel $K(s,t)=\mathbb{E}[Z(s)Z(t)]$, where

K(s,t)=\exp\left(-\frac{(s-t)^{2}}{2}\right)-\left(1+st+\frac{(st)^{2}}{2}\right)\exp\left(-\frac{s^{2}+t^{2}}{2}\right),\quad s,t\in\mathbb{R} (5.1)

(see Theorem 2.1 and Theorem 2.2 of [12]). In fact, $Z$ may also be regarded as a Gaussian random element of the separable Hilbert space $L^{2}$ (say) of (equivalence classes of) functions that are square integrable with respect to $\varphi_{\beta}(t)\,{\rm d}t$. The distribution of $T_{\beta}$ is that of $\sum_{j=1}^{\infty}\lambda_{j}(\beta)N_{j}^{2}$, where $N_{1},N_{2},\ldots$ is a sequence of i.i.d. standard normal random variables, and $\lambda_{1}(\beta),\lambda_{2}(\beta),\ldots$ is the sequence of positive eigenvalues of the integral operator ${\cal K}$ on $L^{2}$ defined by

\mathcal{K}:L^{2}\rightarrow L^{2},\quad f\mapsto\mathcal{K}f(s)=\int_{-\infty}^{\infty}K(s,t)f(t)\varphi_{\beta}(t)\,\text{d}t,\quad s\in\mathbb{R}.

Since $S_{n}$ figuring in (2.1) equals $\sqrt{T_{n,\beta}}$, the function $G$ is the distribution function of $\widetilde{Z}:=\big(\sum_{j=1}^{\infty}\lambda_{j}(\beta)N_{j}^{2}\big)^{1/2}$. From [21], we thus have

\log(1-G(x))=\log\mathbb{P}(\widetilde{Z}>x)=\log\mathbb{P}(\widetilde{Z}^{2}>x^{2})\sim-\frac{x^{2}}{2\lambda_{1}(\beta)}\quad\text{as}\ x\to\infty,

where $\lambda_{1}(\beta)$ denotes the largest eigenvalue. Hence, the approximate Bahadur slope of the Epps–Pulley test statistic is given by

c^{*}_{T_{\beta}}(\vartheta)=\frac{b_{T_{\beta}}(\vartheta)}{\lambda_{1}(\beta)}. (5.2)

Thus, one has to tackle the so-called eigenvalue problem, i.e., to find positive values $\lambda$ and functions $f$ such that $\mathcal{K}f=\lambda f$ or, in other words, to solve the integral equation

\int_{-\infty}^{\infty}K(s,t)f(t)\varphi_{\beta}(t)\,\text{d}t=\lambda f(s),\quad s\in\mathbb{R}. (5.3)

Since explicit solutions of such integral equations are only available in exceptional cases (for non-classical goodness-of-fit test statistics, see [10] and [11]), we employ a stochastic approximation method. This method is related to the quadrature method found in the classical literature on numerical mathematics (see [2], Chapter 3), and it also appears in machine learning theory (see [20]). For the approximation of spectra of Hilbert–Schmidt operators, see [13]. To be specific, let $Y$ be a random variable having density $\varphi_{\beta}$. Then (5.3) reads

\lambda f(s)=\mathbb{E}\big[K(s,Y)f(Y)\big],\quad s\in\mathbb{R}. (5.4)

An empirical counterpart to (5.4) emerges if we let $y_{1},y_{2},\ldots,y_{N}$, $N\in\mathbb{N}$, be independent realizations of $Y$. An approximation of the expected value in (5.4) is then

\mathbb{E}\big[K(s,Y)f(Y)\big]\approx\frac{1}{N}\sum_{j=1}^{N}K(s,y_{j})f(y_{j}),\quad s\in\mathbb{R}. (5.5)

If we evaluate (5.5) at the points $y_{1},\ldots,y_{N}$, the result is

\lambda f(y_{i})=\frac{1}{N}\sum_{j=1}^{N}K(y_{i},y_{j})f(y_{j}),\quad i=1,\ldots,N, (5.6)

which is a system of $N$ linear equations. Writing $v=(f(y_{1}),\ldots,f(y_{N}))\in\mathbb{R}^{N}$ and $\widetilde{K}=\big(K(y_{i},y_{j})/N\big)_{i,j=1,\ldots,N}\in\mathbb{R}^{N\times N}$, we can rewrite (5.6) according to

\widetilde{K}v=\lambda v

in matrix form, from which the (approximated) eigenvalues $\lambda_{1},\ldots,\lambda_{N}$ can be computed explicitly. Note that for each eigenvalue $\lambda_{j}$ we have an eigenvector $v_{j}\in\mathbb{R}^{N}$ (say), the components of which are the (approximated) values of the eigenfunction $f_{j}$ (say) evaluated at $y_{1},\ldots,y_{N}$.

The simulation of eigenvalues was performed in the statistical computing language R, see [19]. As parameters for the simulation we chose $N=1000$, and we considered the tuning parameters $\beta\in\{0.25,0.5,0.75,1,2,3,5,10\}$. Each entry in Table 1 stands for the mean of 10 simulation runs; a minimal code sketch of the procedure is given after the table.

$\lambda\backslash\beta$ 0.25 0.5 0.75 1 2 3 5 10
$\lambda_{1}$ 0.00040 0.01065 0.03829 0.07507 0.15207 0.16149 0.13552 0.08791
$\lambda_{2}$ 0.00003 0.00304 0.01735 0.04454 0.12921 0.14577 0.12606 0.08178
$\lambda_{3}$ 0.00000 0.00021 0.00220 0.00846 0.04894 0.07676 0.08703 0.06879
$\lambda_{4}$ 0.00000 0.00004 0.00076 0.00417 0.03966 0.06642 0.07997 0.06459
$\lambda_{5}$ 0.00000 0.00000 0.00011 0.00098 0.01692 0.03755 0.05678 0.05518
Table 1: Approximate first five eigenvalues of $\mathcal{K}$ for different weight functions $\varphi_{\beta}$; each entry is the mean of 10 simulation runs
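A minimal R sketch of the matrix approximation described above could look as follows; it is our own illustration, it performs a single simulation run rather than the average over 10 runs reported in Table 1, and the function names are our own.

```r
# Covariance kernel (5.1) of the limiting Gaussian process Z
K <- function(s, t) {
  exp(-(s - t)^2 / 2) - (1 + s * t + (s * t)^2 / 2) * exp(-(s^2 + t^2) / 2)
}

# Quadrature-type approximation of the eigenvalue problem (5.3) via the matrix of (5.6)
approx_eigenvalues <- function(beta, N = 1000) {
  y    <- rnorm(N, mean = 0, sd = beta)   # independent draws from phi_beta
  Ktil <- outer(y, y, K) / N              # matrix (K(y_i, y_j)/N)
  sort(eigen(Ktil, symmetric = TRUE, only.values = TRUE)$values, decreasing = TRUE)
}

set.seed(3)
round(approx_eigenvalues(beta = 1)[1:5], 5)   # compare with the beta = 1 column of Table 1
```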

6 Alternatives

As in Milošević et al. [15], we consider the following close alternatives:

  • a Lehmann alternative with density

    g_{1}(x;\vartheta)=(1+\vartheta)\Phi^{\vartheta}(x)\varphi(x);
  • a first Ley-Paindaveine alternative with density (see [14])

    g_{2}(x;\vartheta)=\varphi(x)\,{\rm e}^{-\vartheta(1-\Phi(x))}(1+\vartheta\Phi(x));
  • a second Ley-Paindaveine alternative with density (see [14])

    g_{3}(x;\vartheta)=\varphi(x)\big(1-\vartheta\pi\cos(\pi\Phi(x))\big);
  • a contamination alternative (with ${\rm N}(\mu,\sigma^{2})$ for several pairs $(\mu,\sigma^{2})\neq(0,1)$) with density

    g_{4}^{[\mu,\sigma^{2}]}(x;\vartheta)=(1-\vartheta)\varphi(x)+\frac{\vartheta}{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right).

As in Milošević et al. [15], we computed the local (as $\vartheta\to 0$) relative approximate Bahadur efficiencies with respect to the likelihood ratio test (LRT). The LRT is the best test in terms of exact Bahadur efficiency, and it is often used as a benchmark. Table 2 displays the local approximate Bahadur efficiencies of $T_{n,\beta}$ with respect to the LRT, for each of the six alternatives considered in [15], and for $\beta\in\{0.25,0.5,0.75,1,2,3,5,10\}$. A comparison with Table 1 of [15] shows that the Epps–Pulley test with $\beta=0.5$ dominates the Kolmogorov–Smirnov test for each of the six alternatives, and for $\beta=1$, $\beta=2$ and $\beta=3$, it outperforms the tests of Cramér–von Mises, the Watson variation of this test, and the Watson–Darling variation of the Kolmogorov–Smirnov test, respectively. If $\beta=0.75$, the Epps–Pulley test dominates the Anderson–Darling test for each of the alternatives, with the exception of the final contamination alternative. In conclusion, the test of Epps and Pulley should receive more attention as a test for normality.

Alt.$\backslash\beta$ 0.25 0.5 0.75 1 2 3 5 10
Lehmann 0.996 0.895 0.854 0.743 0.514 0.406 0.328 0.267
1st Ley-Paindaveine 0.947 0.944 0.998 0.937 0.745 0.612 0.507 0.417
2nd Ley-Paindaveine 0.824 0.872 0.986 0.981 0.881 0.754 0.641 0.533
Contamination with N(1,1) 0.760 0.649 0.592 0.499 0.328 0.255 0.205 0.166
Contamination with N(0.5,1) 0.945 0.824 0.766 0.654 0.438 0.343 0.276 0.224
Contamination with N(0,0.5) 0.084 0.267 0.474 0.587 0.675 0.606 0.526 0.442
Table 2: Approximate local Bahadur efficiency of $T_{n,\beta}$ with respect to the LRT

References

  • [1] R. Bahadur. Rates of convergence of estimates and test statistics. The Annals of Mathematical Statistics, 38(2):303–324, 1967.
  • [2] C. T. H. Baker. The numerical treatment of integral equations. Oxford University Press, 1977.
  • [3] L. Baringhaus, R. Danschke, and N. Henze. Recent and classical tests for normality - a comparative study. Communications in Statistics - Simulation and Computation, 18:363–379, 1989.
  • [4] L. Baringhaus, B. Ebner, and N. Henze. The limit distribution of weighted $L^{2}$-goodness-of-fit statistics under fixed alternatives, with applications. Annals of the Institute of Statistical Mathematics, 69(5):969–995, 2017.
  • [5] L. Baringhaus and N. Henze. A consistent test for multivariate normality based on the empirical characteristic function. Metrika, 35(1):339–348, Dec 1988.
  • [6] S. Csörgő. Consistency of some tests for multivariate normality. Metrika, 36:107–116, 1989.
  • [7] B. Ebner and N. Henze. Tests for multivariate normality—a critical review with emphasis on weighted $L^{2}$-statistics. TEST, 29(4):845–892, Dec 2020.
  • [8] B. Ebner, N. Henze, and Y. Y. Nikitin. Integral distribution-free statistics of $l_{p}$-type and their asymptotic comparison. Computational Statistics & Data Analysis, 53(9):3426–3438, 2009.
  • [9] T. W. Epps and L. B. Pulley. A test for normality based on the empirical characteristic function. Biometrika, 70(3):723–726, 1983.
  • [10] N. Henze and Y. Y. Nikitin. A new approach to goodness-of-fit testing based on the integrated empirical process. Journal of Nonparametric Statistics, 12(3):391–416, 2000.
  • [11] N. Henze and Y. Y. Nikitin. Watson-type goodness-of-fit tests based on the integrated empirical process. Mathematical Methods of Statistics, 11(2):183–202, 2002.
  • [12] N. Henze and T. Wagner. A new approach to the BHEP tests for multivariate normality. Journal of Multivariate Analysis, 62(1):1–23, 1997.
  • [13] V. Koltchinskii and E. Giné. Random matrix approximation of spectra of integral operators. Bernoulli, 6(1):113–167, 2000.
  • [14] C. Ley and D. Paindaveine. Le Cam optimal tests for symmetry against Ferreira and Steel’s general skewed distributions. Journal of Nonparametric Statistics, 21(8):943–967, 2009.
  • [15] B. Milošević, Y. Y. Nikitin, and M. Obradović. Bahadur efficiency of EDF based normality tests when parameters are estimated. ArXiv e-prints, arXiv:2106.07437, 2021.
  • [16] Y. Nikitin. Asymptotic Efficiency of Nonparametric Tests. Cambridge University Press, 1995.
  • [17] Y. Y. Nikitin and I. Peaucelle. Efficiency and local optimality of nonparametric tests based on $u$- and $v$-statistics. METRON, 62(2):185–200, 2004.
  • [18] Y. Y. Nikitin and A. V. Tchirina. Lilliefors test for exponentiality: Large deviations, asymptotic efficiency, and conditions of local optimality. Mathematical Methods of Statistics, 16(1):16–24, 2007.
  • [19] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2021.
  • [20] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
  • [21] V. M. Zolotarev. Concerning a certain probability problem. Theory of Probability and its Applications, 6(2), 1961.

B. Ebner and N. Henze,
Institute of Stochastics,
Karlsruhe Institute of Technology (KIT),
Englerstr. 2, D-76133 Karlsruhe.
E-mail: Bruno.Ebner@kit.edu
E-mail: Norbert.Henze@kit.edu