Bahadur efficiencies of the Epps–Pulley test for normality
Abstract
The test for normality suggested by Epps and Pulley [9] is a serious competitor to tests based on the empirical distribution function. In contrast to the latter procedures, it has been generalized to obtain a genuinely affine invariant and universally consistent test for normality in any dimension. We obtain approximate Bahadur efficiencies for the test of Epps and Pulley, thus complementing recent results of Milošević et al. (see [15]). For certain values of a tuning parameter that is inherent in the Epps–Pulley test, this test outperforms each of its competitors considered in [15], over the whole range of six close alternatives to normality.
In memoriam Yakov Yu. Nikitin (1947–2020)
1 Introduction
The purpose of this article is to derive Bahadur efficiencies for the test of normality proposed by Epps and Pulley [9], thus complementing recent results of Milošević et al. [15], who confined their study to tests of normality based on the empirical distribution function. To be specific, suppose $X_1, X_2, \ldots$ is a sequence of independent and identically distributed (i.i.d.) copies of a random variable $X$ that has an absolutely continuous distribution with respect to Lebesgue measure. To test the hypothesis $H_0$ that the distribution of $X$ is some unspecified non-degenerate normal distribution, Epps and Pulley [9] proposed to use the test statistic
\[
T_{n,\beta} = n \int_{-\infty}^{\infty} \big| \psi_n(t) - \mathrm{e}^{-t^2/2} \big|^2\, \varphi_\beta(t)\, \mathrm{d}t.
\]
Here, $\psi_n(t) = n^{-1} \sum_{j=1}^{n} \exp(\mathrm{i} t Y_{n,j})$ is the empirical characteristic function of the so-called scaled residuals $Y_{n,j} = (X_j - \bar{X}_n)/S_n$, $j = 1, \ldots, n$, where $\bar{X}_n$ and $S_n^2$ are the sample mean and the sample variance of $X_1, \ldots, X_n$, respectively, and $\beta > 0$ is a so-called tuning parameter. Moreover,
\[
\varphi_\beta(t) = \frac{1}{\beta\sqrt{2\pi}}\, \exp\!\Big( -\frac{t^2}{2\beta^2} \Big), \quad t \in \mathbb{R},
\]
is the density of the centred normal distribution with variance $\beta^2$. A closed-form expression for $T_{n,\beta}$ that is amenable to computational purposes is
\[
T_{n,\beta} = \frac{1}{n} \sum_{j,k=1}^{n} \exp\!\Big( -\frac{\beta^2 (Y_{n,j}-Y_{n,k})^2}{2} \Big) - \frac{2}{\sqrt{1+\beta^2}} \sum_{j=1}^{n} \exp\!\Big( -\frac{\beta^2 Y_{n,j}^2}{2(1+\beta^2)} \Big) + \frac{n}{\sqrt{1+2\beta^2}}. \tag{1.1}
\]
Epps and Pulley obtained neither the limit null distribution of $T_{n,\beta}$ as $n \to \infty$ nor the consistency of a test for normality that rejects $H_0$ for large values of $T_{n,\beta}$. Their procedure, however, turned out to be a serious competitor to the classical tests of Shapiro–Wilk, Shapiro–Francia and Anderson–Darling in simulation studies (see [3]). In the special case $\beta = 1$, Baringhaus and Henze [5] generalized the approach of Epps and Pulley to obtain a genuine test of multivariate normality, and they derived the limit null distribution of the statistic. Moreover, they proved the consistency of the test of Epps and Pulley against each alternative to normality having a finite second moment. The latter restriction was removed by S. Csörgő [6]. By an approach different from that adopted in [5], Henze and Wagner [12] obtained both the limit null distribution and the limit distribution of the statistic under contiguous alternatives to normality. Under fixed alternatives to normality, the suitably centred and scaled statistic has a normal limit distribution, as elaborated in [4] in much greater generality for weighted $L^2$-statistics. For more information on the test, especially on the role of the tuning parameter $\beta$, see Section 2.2 of [7].
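For readers who wish to experiment with the statistic, the closed form (1.1) is straightforward to evaluate numerically. The following Python/NumPy sketch is an illustration added here (not part of the original paper); it assumes the standard BHEP-type closed form of $T_{n,\beta}$ with sample variance taken with denominator $n$.

```python
import numpy as np

def epps_pulley(x, beta=1.0):
    """Evaluate the Epps-Pulley statistic T_{n,beta} via the closed form (1.1).

    Scaled residuals Y_j = (X_j - mean) / S_n, where S_n^2 is the sample
    variance with denominator n (NumPy's default, ddof=0).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    y = (x - x.mean()) / x.std()                       # scaled residuals
    b2 = beta * beta
    d = np.subtract.outer(y, y)                        # matrix of Y_j - Y_k
    double_sum = np.exp(-b2 * d**2 / 2.0).sum() / n
    single_sum = np.exp(-b2 * y**2 / (2.0 * (1.0 + b2))).sum()
    return (double_sum
            - 2.0 / np.sqrt(1.0 + b2) * single_sum
            + n / np.sqrt(1.0 + 2.0 * b2))

rng = np.random.default_rng(0)
t_normal = epps_pulley(rng.standard_normal(200), beta=1.0)
t_expon = epps_pulley(rng.exponential(size=200), beta=1.0)
```

Since $T_{n,\beta}$ is $n$ times a squared weighted $L^2$ distance, its value is nonnegative, and markedly larger values are expected under skewed alternatives such as the exponential distribution.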
Notice that $T_{n,\beta}$ is invariant with respect to affine transformations $x \mapsto ax + b$ of the data, where $a \neq 0$ and $b \in \mathbb{R}$. Hence, under $H_0$, both the finite-sample and the asymptotic distribution of $T_{n,\beta}$ do not depend on the mean and the variance of the underlying normal distribution. Under $H_0$, we will thus assume that $X$ has the standard normal distribution. The rest of the paper unfolds as follows: In Section 2, we revisit the notion of approximate Bahadur efficiency. Sections 3 and 4 deal with stochastic limits and local Bahadur slopes, and Section 5 tackles an eigenvalue problem connected with the limit null distribution of the test statistic. The final Section 6 contains results regarding local approximate Bahadur efficiencies of the Epps–Pulley test for the six close alternatives considered in [15] and a wide spectrum of values of the tuning parameter $\beta$.
2 Approximate Bahadur efficiency
There are several options to compare different tests for the same testing problem as the sample size tends to infinity, see [16]. One of these options is asymptotic efficiency due to Bahadur (see [1]). This notion of asymptotic efficiency requires knowledge of the large deviation function of the test statistic. Apart from the notable exception given in [18], such knowledge, however, is hitherto not available for statistics that contain estimated parameters, like $T_{n,\beta}$ given in (1.1). To circumvent this drawback, one usually employs the so-called approximate Bahadur efficiency, which only requires results on the tail behavior of the limit distribution of the test statistic under the null hypothesis. To be more specific with respect to the title of this paper, let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables, where the distribution of $X_1$ depends on a real-valued parameter $\theta \in \Theta$, where $\Theta$ denotes the parameter space, and only the case $\theta = 0$ corresponds to the case that the distribution of $X_1$ is standard normal. Suppose $T_n = T_n(X_1, \ldots, X_n)$ is a sequence of test statistics of the hypothesis $H_0$: $\theta = 0$ against the alternative $H_1$: $\theta \neq 0$. Furthermore, suppose that rejection of $H_0$ is for large values of $T_n$. The sequence $(T_n)$ is called a standard sequence, if the following conditions hold (see, e.g., [16], p. 10, or [8], p. 3427):
• There is a continuous distribution function $F$ such that, for each $t$,
\[
\lim_{n \to \infty} P_{\theta = 0}(T_n \le t) = F(t). \tag{2.1}
\]
• There is a constant $a$, $0 < a < \infty$, such that
\[
\log\big(1 - F(t)\big) = -\frac{a t^2}{2}\,\big(1 + o(1)\big) \quad \text{as } t \to \infty. \tag{2.2}
\]
• There is a real-valued function $b$ on $\Theta \setminus \{0\}$, with $0 < b(\theta) < \infty$, such that, for each $\theta \neq 0$,
\[
\frac{T_n}{\sqrt{n}} \stackrel{P_\theta}{\longrightarrow} b(\theta). \tag{2.3}
\]
Then the so-called approximate Bahadur slope
\[
c^*(\theta) = a\, b^2(\theta), \quad \theta \in \Theta \setminus \{0\},
\]
is a measure of approximate Bahadur efficiency. Usually, it is true that $b(\theta) \to 0$ as $\theta \to 0$. In this case, the coefficient of the leading quadratic term in the expansion of $c^*(\theta)$ around $\theta = 0$ is called the local (approximate) index of the sequence $(T_n)$. We will see that the sequence $(T_n)$, where $T_n = T_{n,\beta}^{1/2}$, is a standard sequence. To this end, we will derive the stochastic limit of $T_{n,\beta}/n$ for a general alternative in Section 3. In Section 4, we will specialize this stochastic limit for local alternatives, and we will derive the local index for the Epps–Pulley test statistic.
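As a toy illustration of these definitions (added here, not taken from the paper; the constant $\gamma$ is an assumed placeholder), suppose (2.2) holds with tail constant $a$ and the stochastic limit in (2.3) behaves linearly near the null. Then the slope and the local index follow by direct substitution:

```latex
% Toy example: assume b(theta) = gamma * theta + o(theta) as theta -> 0.
% Substituting into the approximate Bahadur slope c^*(theta) = a b^2(theta):
c^*(\theta) \;=\; a\,b^2(\theta)
           \;=\; a\,\bigl(\gamma\theta + o(\theta)\bigr)^{2}
           \;=\; a\,\gamma^{2}\,\theta^{2}\,\bigl(1 + o(1)\bigr),
\qquad \theta \to 0,
% so the local (approximate) index equals a * gamma^2.
```

This is the pattern used below: Section 4 supplies the quadratic expansion of the stochastic limit, and Section 5 supplies the tail constant $a$.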
3 Stochastic limit of $T_{n,\beta}/n$
To calculate the asymptotic Bahadur efficiency of the test of Epps and Pulley, we need the following result.
Theorem 3.1.
Suppose that $\mathbb{E}(X^2) < \infty$. Then
\[
\frac{T_{n,\beta}}{n} \stackrel{P}{\longrightarrow} \Delta_\beta = \int_{-\infty}^{\infty} \big| \psi(t) - \mathrm{e}^{-t^2/2} \big|^2\, \varphi_\beta(t)\, \mathrm{d}t.
\]
Here, $\stackrel{P}{\longrightarrow}$ denotes convergence in probability, and $\psi(t) = \mathbb{E}\, \mathrm{e}^{\mathrm{i} t Y}$, $t \in \mathbb{R}$, where $Y = (X - \mu)/\sigma$, $\mu = \mathbb{E}(X)$ and $\sigma^2 = \mathbb{V}(X)$.
Proof. From (1.1), we have
(say). By symmetry, it follows that
Since $\bar{X}_n \to \mu$ and $S_n^2 \to \sigma^2$ almost surely as $n \to \infty$ by the strong law of large numbers, it follows from Lebesgue's dominated convergence theorem that
and thus the expectation of $T_{n,\beta}/n$ converges to the stochastic limit figuring in Theorem 3.1. Likewise, the variance of $T_{n,\beta}/n$ converges to zero.
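The convergence asserted in Theorem 3.1 can be illustrated by simulation. The sketch below is an illustration added here (not part of the paper); it relies on the same closed form of the statistic as above, and on the fact that $T_{n,\beta}/n$ tends to a positive limit $\Delta_\beta$ under a fixed non-normal alternative, while it tends to zero under normality.

```python
import numpy as np

def epps_pulley(x, beta=1.0):
    # Closed form (1.1) of the Epps-Pulley statistic (BHEP-type form).
    x = np.asarray(x, dtype=float)
    n = x.size
    y = (x - x.mean()) / x.std()
    b2 = beta * beta
    d = np.subtract.outer(y, y)
    return (np.exp(-b2 * d**2 / 2.0).sum() / n
            - 2.0 / np.sqrt(1.0 + b2)
              * np.exp(-b2 * y**2 / (2.0 * (1.0 + b2))).sum()
            + n / np.sqrt(1.0 + 2.0 * b2))

rng = np.random.default_rng(42)
# Under an exponential alternative, T_{n,beta}/n stabilizes near Delta_beta > 0;
# under normality, T_{n,beta}/n tends to zero.
ratios_exp = [epps_pulley(rng.exponential(size=n)) / n for n in (500, 2000)]
ratio_norm = epps_pulley(rng.standard_normal(2000)) / 2000
```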
4 Local Bahadur slopes
As was done in Milošević et al. [15], we now assume that $\{F(\cdot\,;\theta)\}$ is a family of distribution functions (DFs) with densities $f(\cdot\,;\theta)$, such that $\theta = 0$ corresponds to the standard normal DF $\Phi$ and density $\varphi$, and each of the distributions for $\theta \neq 0$ is non-normal. Moreover, we assume that the regularity assumptions WD in [17] are satisfied. If $X_1, X_2, \ldots$ are i.i.d. random variables with DF $F(\cdot\,;\theta)$, we have to consider the stochastic limit figuring in Theorem 3.1 as a function of $\theta$ and expand this function at $\theta = 0$. To this end, let
(4.1)
Then, putting
Theorem 3.1 yields
where denotes convergence in probability under the true parameter , and
(4.3)
Here and in what follows, each unspecified integral is over .
Notice that $\Delta_\beta(0) = 0$. We have to find the quadratic (first non-vanishing) term in the Taylor expansion of $\Delta_\beta(\theta)$ around zero, i.e., we look for some constant (the local index) such that
Writing $f'(x), f''(x), \ldots$ for derivatives of $f(x;\theta)$ with respect to $\theta$, evaluated at $\theta = 0$, we have
and thus – since and –
Consequently, putting
(4.4)
for the sake of brevity, it follows that
To tackle the integral that figures in (4.3), notice that
Moreover, it follows from a geometric series expansion that
(4.5)
(say). From an expansion of the exponential function, we thus obtain
Using
and putting
(4.6)
(4.7)
some algebra gives
5 Approximations to solutions of the eigenvalue problem
We now turn to the conditions (2.1) and (2.2). The limit null distribution of $T_{n,\beta}$, as $n \to \infty$, is given by the distribution of
\[
T_\beta = \int_{-\infty}^{\infty} Z^2(t)\, \varphi_\beta(t)\, \mathrm{d}t.
\]
Here, $Z$ is a centred Gaussian random element of the Fréchet space $C(\mathbb{R})$ of continuous real-valued functions having covariance kernel $K(s,t)$, where
\[
K(s,t) = \exp\!\Big( -\frac{(s-t)^2}{2} \Big) - \Big( 1 + st + \frac{(st)^2}{2} \Big) \exp\!\Big( -\frac{s^2 + t^2}{2} \Big) \tag{5.1}
\]
(see Theorem 2.1 and Theorem 2.2 of [12]). In fact, $Z$ may also be regarded as a Gaussian random element of the separable Hilbert space $\mathrm{L}^2 = \mathrm{L}^2(\mathbb{R}, \mathcal{B}, \varphi_\beta(t)\,\mathrm{d}t)$ (say) of (equivalence classes of) functions that are square integrable with respect to $\varphi_\beta(t)\,\mathrm{d}t$. The distribution of $T_\beta$ is that of $\sum_{j \ge 1} \lambda_j N_j^2$, where $N_1, N_2, \ldots$ is a sequence of i.i.d. standard normal random variables, and $\lambda_1 \ge \lambda_2 \ge \ldots$ is the sequence of positive eigenvalues of the integral operator $\mathcal{A}$ on $\mathrm{L}^2$ defined by
\[
(\mathcal{A}f)(s) = \int_{-\infty}^{\infty} K(s,t)\, f(t)\, \varphi_\beta(t)\, \mathrm{d}t.
\]
Since $T_n$ figuring in (2.1) equals $T_{n,\beta}^{1/2}$, the function $F$ is the distribution function of $T_\beta^{1/2}$. From [21], we thus have
\[
\log\big(1 - F(t)\big) = -\frac{t^2}{2\lambda_1(\beta)}\,\big(1 + o(1)\big) \quad \text{as } t \to \infty,
\]
where $\lambda_1(\beta)$ denotes the largest eigenvalue of $\mathcal{A}$. Hence, the approximate Bahadur slope of the Epps–Pulley test statistic is given by
\[
c^*_\beta(\theta) = \frac{\Delta_\beta(\theta)}{\lambda_1(\beta)}. \tag{5.2}
\]
Thus, one has to tackle the so-called eigenvalue problem, i.e., to find positive values $\lambda$ and functions $f$ such that $\mathcal{A}f = \lambda f$ or, in other words, to solve the integral equation
\[
\int_{-\infty}^{\infty} K(s,t)\, f(t)\, \varphi_\beta(t)\, \mathrm{d}t = \lambda f(s), \quad s \in \mathbb{R}. \tag{5.3}
\]
Since explicit solutions of such integral equations are only available in exceptional cases (for non-classical goodness-of-fit test statistics, see [10] and [11]), we employ a stochastic approximation method. This method is related to the quadrature method in the classical literature on numerical mathematics (see [2], Chapter 3), and it can also be found in machine learning theory (see [20]). For the approximation of spectra of Hilbert–Schmidt operators, see [13]. To be specific, let $W$ be a random variable having density $\varphi_\beta$. Then (5.3) reads
\[
\lambda f(s) = \mathbb{E}\big[ K(s, W)\, f(W) \big], \quad s \in \mathbb{R}. \tag{5.4}
\]
An empirical counterpart to (5.4) emerges if we let $w_1, \ldots, w_m$, $m \in \mathbb{N}$, be independent realizations of $W$. An approximation of the expected value in (5.4) is then
\[
\lambda f(s) \approx \frac{1}{m} \sum_{j=1}^{m} K(s, w_j)\, f(w_j). \tag{5.5}
\]
If we evaluate (5.5) at the points $w_1, \ldots, w_m$, the result is
\[
\lambda f(w_i) = \frac{1}{m} \sum_{j=1}^{m} K(w_i, w_j)\, f(w_j), \quad i = 1, \ldots, m, \tag{5.6}
\]
which is a system of linear equations. Writing $\mathrm{K} = \big( K(w_i, w_j) \big)_{i,j = 1, \ldots, m}$ and $\mathrm{f} = \big( f(w_1), \ldots, f(w_m) \big)^\top$, we can rewrite (5.6) according to
\[
\frac{1}{m}\, \mathrm{K}\, \mathrm{f} = \lambda\, \mathrm{f}
\]
in matrix form, from which the (approximated) eigenvalues can be computed explicitly. Note that for each eigenvalue $\lambda_\ell$ we have an eigenvector $\mathrm{f}_\ell$ (say), the components of which are the (approximated) values of the eigenfunction $f_\ell$ (say) computed at $w_1, \ldots, w_m$.
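The matrix approximation just described is easy to implement. The following Python/NumPy sketch is an illustration added here (not the authors' R code); the coded kernel is the standard BHEP limit kernel that we believe corresponds to (5.1), and the quadrature points are drawn from the $\mathrm{N}(0, \beta^2)$ density $\varphi_\beta$.

```python
import numpy as np

def kernel(s, t):
    # Assumed covariance kernel K(s, t) of the limit process (BHEP form).
    st = s * t
    return (np.exp(-(s - t)**2 / 2.0)
            - (1.0 + st + st**2 / 2.0) * np.exp(-(s**2 + t**2) / 2.0))

def approx_eigenvalues(beta=1.0, m=2000, seed=1):
    """Approximate the eigenvalues of the integral equation via (1/m) K f = lambda f."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, beta, size=m)           # w_1, ..., w_m with density phi_beta
    K = kernel(w[:, None], w[None, :]) / m      # matrix (1/m) * (K(w_i, w_j))
    return np.linalg.eigvalsh(K)[::-1]          # symmetric matrix -> real, descending

lam = approx_eigenvalues(beta=1.0)
```

For $\beta = 1$ and moderate $m$, the leading eigenvalues should be comparable to the corresponding column of Table 1; averaging over several independent runs, as done in the paper, reduces the Monte Carlo error.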
The simulation of eigenvalues was performed in the statistical computing language R, see [19]. As parameters for the simulation we chose , and we considered the tuning parameters $\beta \in \{0.25, 0.5, 0.75, 1, 2, 3, 5, 10\}$. Each entry in Table 1 stands for the mean of 10 simulation runs.
Table 1: Approximated eigenvalues (each entry is the mean of 10 simulation runs).

| | $\beta=0.25$ | 0.5 | 0.75 | 1 | 2 | 3 | 5 | 10 |
|---|---|---|---|---|---|---|---|---|
| $\lambda_1$ | 0.00040 | 0.01065 | 0.03829 | 0.07507 | 0.15207 | 0.16149 | 0.13552 | 0.08791 |
| $\lambda_2$ | 0.00003 | 0.00304 | 0.01735 | 0.04454 | 0.12921 | 0.14577 | 0.12606 | 0.08178 |
| $\lambda_3$ | 0.00000 | 0.00021 | 0.00220 | 0.00846 | 0.04894 | 0.07676 | 0.08703 | 0.06879 |
| $\lambda_4$ | 0.00000 | 0.00004 | 0.00076 | 0.00417 | 0.03966 | 0.06642 | 0.07997 | 0.06459 |
| $\lambda_5$ | 0.00000 | 0.00000 | 0.00011 | 0.00098 | 0.01692 | 0.03755 | 0.05678 | 0.05518 |
6 Alternatives
As in Milošević et al. ([15]), we consider the following close alternatives:
• a Lehmann alternative with density
\[
(1+\theta)\, \Phi(x)^{\theta}\, \varphi(x), \quad x \in \mathbb{R};
\]
• a first Ley–Paindaveine alternative with density (see [14])
• a second Ley–Paindaveine alternative with density (see [14])
• a contamination alternative (with $\mathrm{N}(\mu, \sigma^2)$ for the pairs $(\mu, \sigma^2) \in \{(1,1), (0.5,1), (0,0.5)\}$ figuring in Table 2) with density
\[
(1-\theta)\, \varphi(x) + \frac{\theta}{\sigma}\, \varphi\!\Big( \frac{x-\mu}{\sigma} \Big), \quad x \in \mathbb{R}.
\]
As in Milošević et al. ([15]), we computed the local (as $\theta \to 0$) relative approximate Bahadur efficiencies with respect to the likelihood ratio test (LRT). The LRT is the best test regarding exact Bahadur efficiency, and it is often used as a benchmark test. Table 2 displays the local approximate Bahadur efficiencies of $T_{n,\beta}$ with respect to the LRT, for each of the six alternatives considered in [15], and for $\beta \in \{0.25, 0.5, 0.75, 1, 2, 3, 5, 10\}$. A comparison with Table 1 of [15] shows that the Epps–Pulley test dominates the Kolmogorov–Smirnov test for each of the six alternatives, and, for suitable values of $\beta$, it outperforms the tests of Cramér–von Mises, the Watson variation of this test, and the Watson–Darling variation of the Kolmogorov–Smirnov test, respectively. For suitable $\beta$, the Epps–Pulley test dominates the Anderson–Darling test for each of the alternatives with the exception of the final contamination alternative. As a conclusion, the test of Epps and Pulley should receive more attention as a test for normality.
Alt. | 0.25 | 0.5 | 0.75 | 1 | 2 | 3 | 5 | 10 |
---|---|---|---|---|---|---|---|---|
Lehmann | 0.996 | 0.895 | 0.854 | 0.743 | 0.514 | 0.406 | 0.328 | 0.267 |
1st Ley-Paindaveine | 0.947 | 0.944 | 0.998 | 0.937 | 0.745 | 0.612 | 0.507 | 0.417 |
2nd Ley-Paindaveine | 0.824 | 0.872 | 0.986 | 0.981 | 0.881 | 0.754 | 0.641 | 0.533 |
Contamination with N(1,1) | 0.760 | 0.649 | 0.592 | 0.499 | 0.328 | 0.255 | 0.205 | 0.166 |
Contamination with N(0.5,1) | 0.945 | 0.824 | 0.766 | 0.654 | 0.438 | 0.343 | 0.276 | 0.224 |
Contamination with N(0,0.5) | 0.084 | 0.267 | 0.474 | 0.587 | 0.675 | 0.606 | 0.526 | 0.442 |
References
- [1] R. R. Bahadur. Rates of convergence of estimates and test statistics. The Annals of Mathematical Statistics, 38(2):303–324, 1967.
- [2] C. T. H. Baker. The numerical treatment of integral equations. Oxford University Press, 1977.
- [3] L. Baringhaus, R. Danschke, and N. Henze. Recent and classical tests for normality - a comparative study. Communications in Statistics - Simulation and Computation, 18:363–379, 1989.
- [4] L. Baringhaus, B. Ebner, and N. Henze. The limit distribution of weighted $L^2$-goodness-of-fit statistics under fixed alternatives, with applications. Annals of the Institute of Statistical Mathematics, 69(5):969–995, 2017.
- [5] L. Baringhaus and N. Henze. A consistent test for multivariate normality based on the empirical characteristic function. Metrika, 35(1):339–348, Dec 1988.
- [6] S. Csörgő. Consistency of some tests for multivariate normality. Metrika, 36:107–116, 1989.
- [7] B. Ebner and N. Henze. Tests for multivariate normality—a critical review with emphasis on weighted $L^2$-statistics. TEST, 29(4):845–892, Dec 2020.
- [8] B. Ebner, N. Henze, and Y. Y. Nikitin. Integral distribution-free statistics of $L_p$-type and their asymptotic comparison. Computational Statistics & Data Analysis, 53(9):3426–3438, 2009.
- [9] T. W. Epps and L. B. Pulley. A test for normality based on the empirical characteristic function. Biometrika, 70(3):723–726, 1983.
- [10] N. Henze and Y. Y. Nikitin. A new approach to goodness-of-fit testing based on the integrated empirical process. Journal of Nonparametric Statistics, 12(3):391–416, 2000.
- [11] N. Henze and Y. Y. Nikitin. Watson-type goodness-of-fit tests based on the integrated empirical process. Mathematical Methods of Statistics, 11(2):183–202, 2002.
- [12] N. Henze and T. Wagner. A new approach to the BHEP tests for multivariate normality. Journal of Multivariate Analysis, 62(1):1–23, 1997.
- [13] V. Koltchinskii and E. Giné. Random matrix approximation of spectra of integral operators. Bernoulli, 6(1):113–167, 2000.
- [14] C. Ley and D. Paindaveine. Le Cam optimal tests for symmetry against Ferreira and Steel's general skewed distributions. Journal of Nonparametric Statistics, 21(8):943–967, 2009.
- [15] B. Milošević, Y. Y. Nikitin, and M. Obradović. Bahadur efficiency of EDF based normality tests when parameters are estimated. ArXiv e-prints, arXiv:2106.07437, 2021.
- [16] Y. Nikitin. Asymptotic Efficiency of Nonparametric Tests. Cambridge University Press, 1995.
- [17] Y. Y. Nikitin and I. Peaucelle. Efficiency and local optimality of nonparametric tests based on $U$- and $V$-statistics. METRON, 62(2):185–200, 2004.
- [18] Y. Y. Nikitin and A. V. Tchirina. Lilliefors test for exponentiality: Large deviations, asymptotic efficiency, and conditions of local optimality. Mathematical Methods of Statistics, 16(1):16–24, 2007.
- [19] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2021.
- [20] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
- [21] V. M. Zolotarev. Concerning a certain probability problem. Theory of Probability and its Applications, 6(2), 1961.
B. Ebner and N. Henze,
Institute of Stochastics,
Karlsruhe Institute of Technology (KIT),
Englerstr. 2, D-76133 Karlsruhe.
E-mail: Bruno.Ebner@kit.edu
E-mail: Norbert.Henze@kit.edu