
An approximation to peak detection power using Gaussian random field theory

Yu Zhao1    Dan Cheng3    Armin Schwartzman1,2
(1Division of Biostatistics,
Herbert Wertheim School of Public Health and Human Longevity Science,
University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
2Halıcıoğlu Data Science Institute,
University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
3School of Mathematical and Statistical Sciences,
Arizona State University, 900 S. Palm Walk, Tempe, AZ 85281, USA
)
Abstract

We study power approximation formulas for peak detection using Gaussian random field theory. The approximation, based on the expected number of local maxima above the threshold $u$, $\mathbb{E}[M_u]$, is proved to work well under three asymptotic scenarios: small domain, large threshold, and sharp signal. An adjusted version of $\mathbb{E}[M_u]$ is also proposed to improve accuracy when the expected number of local maxima $\mathbb{E}[M_{-\infty}]$ exceeds 1.

Cheng and Schwartzman (2018) developed explicit formulas for $\mathbb{E}[M_u]$ for smooth isotropic Gaussian random fields with zero mean. In this paper, these formulas are extended to allow for rotationally symmetric mean functions, making them suitable for power calculations. We also apply our formulas to 2D and 3D simulated datasets; the 3D simulation is based on a group analysis of fMRI data from the Human Connectome Project, in order to measure performance in a realistic setting.

Key words: Power calculations, peak detection, Gaussian random field, image analysis.

1 Introduction

Detection of peaks (local maxima) is an important topic in image analysis. For example, a fundamental goal in fMRI analysis is to identify the local hotspots of brain activity (see, for example, Genovese et al., 2002 and Heller et al., 2006), which are typically captured by peaks in the fMRI signal. The detection of such peaks can be posed as a statistical testing problem intended to test whether the underlying signal has a peak at a given location. This is challenging because such tests are conducted only at locations of observed peaks, which depend on the data. Therefore, the height distribution of the observed peak is conditional on a peak being observed at that location. This is a nonstandard problem. Solutions exist using random field theory (RFT). RFT is a statistical framework that can be used to perform topological inference and modeling. RFT-based peak detection has been studied in Cheng and Schwartzman (2017) and Schwartzman and Telschow (2019), which provide the peak height distribution for isotropic noise under the complete null hypothesis of no signal anywhere.

In general, for any statistical testing problem, accurate power calculations help researchers decide the minimum sample size required for an informative test, and thus reduce cost. Power calculation formulas exist for common univariate tests, such as z-tests and t-tests. However, particular challenges arise when we perform power calculations in peak detection settings. Due to the nature of imaging data, the number and location of the signal peaks are unknown. Moreover, the power is affected by other spatial aspects of the problem, such as the shape of the peak function and the spatial autocorrelation of the noise. Given these difficulties, deriving a power formula for peak detection requires some extra effort.

A formal definition of power in peak detection is necessary to perform power calculations. In Cheng and Schwartzman (2017) and Durnez et al. (2016), the authors explored approaches to control the false discovery rate (FDR). For the entire domain, average peakwise power, i.e. power averaged over all non-null voxels, is a natural choice for these approaches. For a local domain where a single peak exists, the power can be defined as the probability of successfully detecting that peak. Following this idea, we describe the null and alternative hypotheses and the definition of detection power. We do so informally here for didactic purposes and present formal definitions in Section 2.

Consider a local domain where a single peak may exist, and consider the hypotheses

$$\begin{aligned}
H_0&:\ \text{``the signal is equal to 0 in the local domain''}\quad\text{vs.}\\
H_1&:\ \text{``the signal has at least one positive peak in the local domain.''}
\end{aligned}$$

Suppose we observe a random field to be used as a test statistic at every location, typically as the result of statistical modeling of the data. For a fixed threshold $u$, the existence of observed peaks with height greater than $u$ would lead to rejecting the null hypothesis. Therefore, we define the type I error and power as the probability that there exists at least one local maximum above $u$ under $H_0$ and $H_1$, respectively:

$$\begin{aligned}
\text{Type I error:}&\quad\mathbb{P}\{\exists\text{ a peak in the local domain with height}>u\text{ when }H_0\text{ is true}\}\\
\text{Power:}&\quad\mathbb{P}\{\exists\text{ a peak in the local domain with height}>u\text{ when }H_1\text{ is true}\}
\end{aligned}$$ (1)

Formulas for the type I error have been developed for stationary fields in 1D and isotropic fields in 2D and 3D (Cheng and Schwartzman, 2015, Cheng and Schwartzman, 2017 and Cheng and Schwartzman, 2018). However, there is no formula to calculate power. In order to get an appropriate estimate of power, we need to know the peak height distribution for non-centered random fields (those whose mean function is not 0). Generally speaking, it is very difficult to calculate the peak height distribution, especially when the random field has non-zero mean. Durnez et al. (2016) suggest using a Gaussian distribution to describe the non-null peaks and a truncated Gaussian distribution to approximate the overshoot distribution. This approach is easy to implement but not very accurate, because the peak height distribution is in reality always skewed and not close to any Gaussian distribution.

In this article, we propose to approximate the probability of an observed peak exceeding the detection threshold $u$ by calculating the expected number of peaks above the threshold $u$. We show that the approximation, which is also an upper bound, works well under certain scenarios. For the entire domain, we can approximate the average peakwise power by taking the arithmetic mean of the approximation proposed in this paper over non-null voxels.

The proposed approximation makes the problem more tractable, but in general it does not have an explicit form. In order to make it applicable in practice, we further simplify the formula under the isotropy assumption and derive its explicit form in 1D, 2D, and 3D. The explicit results are validated through 2D and 3D computer simulations carried out in MATLAB. The simulation covers multiple scenarios obtained by modifying the parameters used to generate the data, and the performance of the power approximation and its conservative adjustment under these scenarios is discussed.

Finally, to assess the real-data performance of our power approximation method, we apply it to a 3D simulation induced by a real brain imaging dataset, where the parameters are estimated from the Human Connectome Project (Van Essen et al., 2012) fMRI data. By testing the method in a realistic setting, we also demonstrate how effect size and other parameters affect the power.

The paper is organized as follows. We first present in Section 2 the problem setup and theoretical results in certain scenarios. In Section 3, we derive the explicit formulas under isotropy. Simulations in 2D are conducted in Section 4. Details regarding how to apply our formula in an application setting are discussed in Section 5. The methodology is applied to a 3D real dataset in Section 6.

2 Power Approximation

2.1 Setup

Let $Y(s)=\sigma(s)Z(s)+\mu(s)$, where $Z=\{Z(s),s\in D\}$, representing the noise, is a centered (zero-mean) smooth unit-variance Gaussian random field on an $N$-dimensional non-empty domain $D\subset\mathbb{R}^N$, $\sigma(s)$ is the standard deviation of the noise, and $\mu(s)$ is the mean function. Let $X(s)=Y(s)/\sigma(s)=Z(s)+\theta(s)$, where the ratio $\theta(s)=\mu(s)/\sigma(s)$ is the standardized mean function, which we assume to be $C^2$. Here $C^3$ is a sufficient smoothness condition for $Z$, as clarified in Assumption 1 below.

Let

$$\begin{split}X_i(s)&=\frac{\partial X(s)}{\partial s_i},\quad\nabla X(s)=(X_1(s),\ldots,X_N(s)),\\ X_{ij}(s)&=\frac{\partial^2X(s)}{\partial s_i\partial s_j},\quad\nabla^2X(s)=(X_{ij}(s))_{1\leq i,j\leq N},\\ Z_i(s)&=\frac{\partial Z(s)}{\partial s_i},\quad\nabla Z(s)=(Z_1(s),\ldots,Z_N(s)),\\ Z_{ij}(s)&=\frac{\partial^2Z(s)}{\partial s_i\partial s_j},\quad\nabla^2Z(s)=(Z_{ij}(s))_{1\leq i,j\leq N}.\end{split}$$

We will make use of the following assumptions:

Assumption 1.

$Z\in C^2(D)$ almost surely and its second derivatives satisfy the mean-square Hölder condition: for any $s_0\in D$, there exist positive constants $L$, $\eta$ and $\delta$ such that

$$\mathbb{E}(Z_{ij}(s)-Z_{ij}(t))^2\leq L^2\|s-t\|^{2\eta},\quad\forall\, t,s\in U_{s_0}(\delta),\ i,j=1,\ldots,N,$$

where $U_{s_0}(\delta)=s_0\oplus(-\delta/2,\delta/2)^N$ is the $N$-dimensional open cube of side length $\delta$ centered at $s_0$. This condition is satisfied, for example, if $Z$ is $C^3(D)$.

Assumption 2.

For every pair $(t,s)\in D\times D$ with $s\neq t$, the Gaussian random vector

$$(Z(s),\,\nabla Z(s),\,Z_{ij}(s),\,Z(t),\,\nabla Z(t),\,Z_{ij}(t),\ 1\leq i\leq j\leq N)$$

is non-degenerate, i.e. its covariance matrix has full rank.

2.2 Peak detection

Following the notation in the problem setup, the null and alternative hypotheses can be written as:

$$\begin{aligned}
H_0&:\ \mu(s)=0\ \text{for all}\ s\in D\quad\text{vs.}\\
H_1&:\ \mu(s)>0,\ \nabla\mu(s)=0,\ \nabla^2\mu(s)\prec 0\ \text{for some}\ s\in D
\end{aligned}$$

The mean function $\mu(s)$ is not directly observed, so the hypothesis is tested based on the peak height of $X(s)$. For a peak detection procedure that aims to test this hypothesis, a threshold $u$ for the peak height of $X(s)$ needs to be set in advance. If a local maximum with height greater than $u$ is observed, we reject the null hypothesis due to the strong evidence against it. The probability that a peak of $X$ exceeds $u$,

$$\mathbb{P}\left(\exists\ s\in D\ \text{s.t.}\ X(s)>u\ \middle|\ \nabla X(s)=0\text{ and }\nabla^2X(s)\prec 0\right),$$ (2)

is the type I error under $H_0$ and the power under $H_1$. The threshold $u$ can be obtained from the peak height distribution under $H_0$. A formula for the peak height distribution of smooth isotropic Gaussian random fields was derived in Cheng and Schwartzman (2018), and it can also be obtained as a special case of the formulas presented in this paper. Usually, $u$ is set to a quantile of the null distribution of peak height so as to maintain the nominal type I error $\alpha$. More details about selecting the threshold are discussed in the real data example. Selecting $u$ is not the main focus of this paper, and our method can be applied to any choice of $u$.
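For concreteness, once peak heights under $H_0$ can be simulated (or their null distribution evaluated), the threshold can be taken as a null quantile. A minimal Python sketch (our illustration, not part of the paper's method):

```python
import numpy as np

def null_threshold(null_peak_heights, alpha=0.05):
    # u = (1 - alpha) quantile of peak heights under H0, so that a peak
    # exceeding u has peakwise type I error alpha.
    return float(np.quantile(null_peak_heights, 1.0 - alpha))
```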

2.3 Power approximation

Let $M_u$ be the number of local maxima of the random field $X$ above $u$ over the local domain $D$. The power defined in (2) can be represented as $\mathbb{P}[M_u\geq 1]$. We call this the power function, seen as a function of the threshold $u$. Note that

$$\mathbb{P}[M_u\geq 1]=\sum_{k=1}^\infty\mathbb{P}[M_u=k]\leq\sum_{k=1}^\infty k\,\mathbb{P}[M_u=k]=\mathbb{E}[M_u].$$ (3)

On the other hand,

$$\mathbb{E}[M_u]-\mathbb{P}[M_u\geq 1]=\sum_{k=2}^\infty(k-1)\mathbb{P}[M_u=k]\leq\frac12\sum_{k=2}^\infty k(k-1)\mathbb{P}[M_u=k]=\frac12\mathbb{E}[M_u(M_u-1)].$$ (4)

Thus, we have

$$\mathbb{E}[M_u]-\frac12\mathbb{E}[M_u(M_u-1)]\leq\mathbb{P}[M_u\geq 1]\leq\mathbb{E}[M_u].$$ (5)

This inequality tells us that for any fixed $u$, the power is bounded within an interval of length $\mathbb{E}[M_u(M_u-1)]/2$. Thus, $\mathbb{E}[M_u]$ is a good approximation of the power if one of the two conditions below is satisfied:

  1. The factorial moment $\mathbb{E}[M_u(M_u-1)]$ converges to 0 and $\mathbb{E}[M_u]$ does not.

  2. They both converge to 0 and $\mathbb{E}[M_u(M_u-1)]$ converges faster than $\mathbb{E}[M_u]$.

The convergence above refers to conditions on the signal and noise parameters. In the rest of this section, we introduce four interesting results. The first result can be useful for simplifying the power function and the other three results give different scenarios where one of the conditions above holds.

2.4 Adjusted $\mathbb{E}[M_u]$

We have provided justification through (5) for using $\mathbb{E}[M_u]$ to approximate power. However, $\mathbb{E}[M_u]$ alone might not be sufficient for power approximation, since it only gives an upper bound. Also, unlike power, $\mathbb{E}[M_u]$ sometimes exceeds 1. To correct for this, we define the adjusted $\mathbb{E}[M_u]$ as

$$\mathbb{E}[M_u]_{\rm adj}=\mathbb{E}[M_u]/\max(1,\mathbb{E}[M_{-\infty}]).$$ (6)

The adjusted $\mathbb{E}[M_u]$ is the same as $\mathbb{E}[M_u]$ when the expected number of local maxima $\mathbb{E}[M_{-\infty}]$ is less than or equal to 1. When $\mathbb{E}[M_{-\infty}]$ is greater than 1, we divide $\mathbb{E}[M_u]$ by $\mathbb{E}[M_{-\infty}]$ to make sure the result never exceeds 1. The adjusted $\mathbb{E}[M_u]$ is more conservative, and we conjecture that it is a lower bound on the power when there exists at least one local maximum in the domain $D$. In applications, a conservative estimate is often preferred so that the test is guaranteed to have enough power. Combining $\mathbb{E}[M_u]$ and $\mathbb{E}[M_u]/\mathbb{E}[M_{-\infty}]$, we can get an approximate range for the true power. We will compare $\mathbb{E}[M_u]$ and the adjusted $\mathbb{E}[M_u]$ in simulation studies.
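In code, the adjustment (6) is immediate; a one-line sketch (function name ours):

```python
def e_mu_adj(e_mu_u, e_m_neg_inf):
    # Adjusted E[M_u] from (6): divide by E[M_{-inf}] when it exceeds 1.
    return e_mu_u / max(1.0, e_m_neg_inf)
```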

2.5 Height equivariance

Our first result does not concern the approximation (5) yet, but it offers a simplification of the power function and $\mathbb{E}[M_u]$ that will be used later. The proposition below states that the power function and $\mathbb{E}[M_u]$ for peak detection are translation equivariant with respect to peak height.

Proposition 1.

Let $\theta(s)=h(s)+\theta_0$ be a peak signal with height $\theta_0$, where $h(s)$ is a unimodal mean function with maximum equal to 0 at $s_0$ in $D$. Then the power function for peak detection and $\mathbb{E}[M_u]$ can be written in the form $F(u-\theta_0)$, where $F(u)$ is the power function or $\mathbb{E}[M_u]$ at $\theta_0=0$.

Proof.

Let $\tilde\theta(s)=\theta(s)-\theta_0=h(s)$ and let $\tilde M_u$ be the number of local maxima of the random field $\tilde X(s)=Z(s)+\tilde\theta(s)$ above $u$ over $D$. Considering the definition of power, we have

$$F(u-\theta_0)=\mathbb{P}[\tilde M_{u-\theta_0}\geq 1]=\mathbb{P}[M_u\geq 1].$$

Given that $\mathbb{E}[\tilde M_{u-\theta_0}]=\mathbb{E}[M_u]$, it is also straightforward to show that $\mathbb{E}[M_u]$ is translation equivariant with respect to $\theta_0$. ∎

Next, we give three scenarios where the equality in (5) can be achieved asymptotically: small domain size, large threshold, and sharp signal.

2.6 Small domain

If the size of the local domain $D$ where a single peak exists is small enough, it can be shown that equality in (5) is achieved asymptotically.

Theorem 1.

Consider a local domain $D_\epsilon=U(s_0,\epsilon)$ for any fixed $s_0\in D$, where $U(s_0,\epsilon)=s_0\oplus(-\epsilon/2,\epsilon/2)^N$ is the $N$-dimensional open cube of side $\epsilon$ centered at $s_0$. For sufficiently small $\epsilon$ and fixed threshold $u$,

$$\mathbb{P}[M_u\geq 1]=\mathbb{E}[M_u](1-o(1))=\mathbb{E}[M_u]_{\rm adj}(1-o(1)).$$ (7)
Proof.

The proof is based on the proof of Lemma 3 in Piterbarg (1996) and Lemma 4.1 in Cheng and Schwartzman (2015).

$$\begin{split}\mathbb{E}[M_u(M_u-1)]=&\int_{D_\epsilon}\int_{D_\epsilon}\int_u^\infty\int_u^\infty\mathbb{E}\left[|\det\nabla^2X(s)||\det\nabla^2X(t)|\,\middle|\,\begin{matrix}X(s)=x_1,\ X(t)=x_2\\ \nabla X(s)=\nabla X(t)=0\end{matrix}\right]\\ &\qquad\times P_{X(s),X(t),\nabla X(s),\nabla X(t)}(x_1,x_2,0,0)\,dx_1\,dx_2\,ds\,dt.\end{split}$$ (8)

Let

$$\mathbb{E}_1(s,t)=\mathbb{E}\left[|\det\nabla^2X(s)||\det\nabla^2X(t)|\,\middle|\,\begin{matrix}X(s)=x\\ \nabla X(s)=\nabla X(t)=0\end{matrix}\right].$$

Replacing one of the integration limits in (8) by $-\infty$, we have

$$\mathbb{E}[M_u(M_u-1)]\leq\int_{D_\epsilon}\int_{D_\epsilon}P_{\nabla X(s),\nabla X(t)}(0,0)\,ds\,dt\int_u^\infty\mathbb{E}_1(s,t)\,P_{X(s)}\left(x\,\middle|\,\nabla X(s)=\nabla X(t)=0\right)dx.$$

Then we can take the Taylor expansion

$$\nabla X(t)=\nabla X(s)+\nabla^2X(s)(t-s)+\|t-s\|^{1+\alpha}Y_{s,t},$$

where $Y_{s,t}=(Y_{s,t}^1,\ldots,Y_{s,t}^N)^T$ is a Gaussian vector field. Note that the determinant of $\nabla^2X(s)$ is equal to the determinant of

$$\begin{pmatrix}1&-(t_1-s_1)&\ldots&-(t_N-s_N)\\ 0&&&\\ \vdots&&\nabla^2X(s)&\\ 0&&&\end{pmatrix}.$$ (9)

For $i=2,\ldots,N+1$, multiply the $i$th column of this matrix by $(t_{i-1}-s_{i-1})/\|t-s\|^2$, take the sum of all such columns, and add the result to the first column. Since $\nabla X(s)=\nabla X(t)=0$, the expansion above gives $\nabla^2X(s)(t-s)=-\|t-s\|^{1+\alpha}Y_{s,t}$, and we obtain the matrix below, which has the same determinant as (9):

$$\begin{pmatrix}0&-(t_1-s_1)&\ldots&-(t_N-s_N)\\ -\|t-s\|^{-1+\alpha}Y_{s,t}^1&&&\\ \vdots&&\nabla^2X(s)&\\ -\|t-s\|^{-1+\alpha}Y_{s,t}^N&&&\end{pmatrix}$$

Let $r=\max_{1\leq i\leq N}|t_i-s_i|$ and

$$A_{s,t}=\begin{pmatrix}0&-(t_1-s_1)/r&\ldots&-(t_N-s_N)/r\\ Y_{s,t}^1&&&\\ \vdots&&\nabla^2X(s)&\\ Y_{s,t}^N&&&\end{pmatrix}.$$

So we have

$$\mathbb{E}_1(s,t)\leq\|t-s\|^\alpha\,\mathbb{E}_2(s,t),$$

where

$$\mathbb{E}_2(s,t)=\mathbb{E}\left[|\det A_{s,t}||\det\nabla^2X(t)|\,\middle|\,\begin{matrix}X(s)=x,\ \nabla X(s)=0\\ \nabla^2X(s)(t-s)=-\|t-s\|^{1+\alpha}Y_{s,t}\end{matrix}\right].$$

Using the inequality of arithmetic and geometric means, we can bound the determinants:

$$\begin{split}|\det\nabla^2X(t)|&\leq N^{2N-2}\sum_{i,j}|X_{ij}(t)|^N,\\ |\det A_{s,t}|&\leq(N+1)^{2N}\sum_{i,j}|a_{ij}|^{N+1},\end{split}$$

where $a_{ij}$ is the $(i,j)$ entry of $A_{s,t}$. Applying the inequality again,

$$|\det\nabla^2X(t)||\det A_{s,t}|\leq\frac12 N^{2N-2}(N+1)^{2N+1}\left(\sum_{i,j}|X_{ij}(t)|^{2N}+\sum_{i,j}|a_{ij}|^{2N+2}\right).$$

For any Gaussian variable $X$ and integer $N\geq 0$, the following inequality holds:

$$\mathbb{E}[X^{2N}]\leq 2^{2N}\left(\mathbb{E}[X]^{2N}+C_N\,{\rm Var}(X)^N\right),$$

where $C_N$ is a constant depending on $N$. Next, we can focus on the conditional expectations and conditional variances of $X_{ij}(t)$ and $Y_{s,t}$.

By Assumptions 1 and 2, and the fact that the conditional variance of a Gaussian variable is less than or equal to its unconditional variance, we can conclude that the conditional variances of $X_{ij}(t)$ and $Y_{s,t}$ are bounded above by some constant.

Summarizing the results above,

$$\sup_{s,t\in D_\epsilon,\,s\neq t}|\mathbb{E}_2(s,t)|\leq C_1$$

for some constant $C_1>0$, and

$$\mathbb{E}_1(s,t)\leq\|t-s\|^\alpha\,\mathbb{E}_2(s,t)\leq C_1\|t-s\|^\alpha.$$

Combining the results above, for a fixed threshold $u$,

$$\begin{aligned}
&\int_u^\infty\mathbb{E}_1(s,t)\,P_{X(s)}\left(x\,\middle|\,\nabla X(s)=\nabla X(t)=0\right)dx\\
&\leq C_1\|t-s\|^\alpha\int_u^\infty P_{X(s)}\left(x\,\middle|\,\nabla X(s)=\nabla X(t)=0\right)dx\\
&=C_1\|t-s\|^\alpha\int_u^\infty\exp(-(Ax-B)^2)\,dx\quad\text{for some constants }A,\ B\\
&=C_2\|t-s\|^\alpha
\end{aligned}$$

for some constant $C_2>0$.

Next, by the proof of Lemma 4.1 in Cheng and Schwartzman (2015),

$$p_{\nabla X(s),\nabla X(t)}(0,0)\leq C_3\|t-s\|^{-N}$$

for some constant $C_3>0$.

Therefore, there exists $C_4>0$ such that

$$\mathbb{E}[M_u(M_u-1)]\leq C_4\int_{D_\epsilon}\int_{D_\epsilon}\frac{1}{\|t-s\|^{N-\alpha}}\,dt\,ds=o(\epsilon^N).$$

For $\mathbb{E}[M_u]$, by the Kac-Rice formula in Adler and Taylor (2007),

$$\mathbb{E}[M_u]=\int_{D_\epsilon}p_{\nabla X(s)}(0)\,\mathbb{E}\left[|\det\nabla^2X(s)|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\mathbbm{1}_{\{X(s)>u\}}\,\middle|\,\nabla X(s)=0\right]ds.$$

Denote the integrand by $g(s)$. The function $g(s)$ is continuous and positive over the compact domain $D_\epsilon$. Thus $\inf_{s\in D_\epsilon}g(s)\geq g_0>0$, implying

$$\mathbb{E}[M_u]\geq g_0\,\epsilon^N.$$

Then (7) is an immediate consequence of (5).

For $\mathbb{E}[M_{-\infty}]$, by the Kac-Rice formula,

$$\mathbb{E}[M_{-\infty}]=\int_{D_\epsilon}p_{\nabla X(s)}(0)\,\mathbb{E}\left[|\det\nabla^2X(s)|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\,\middle|\,\nabla X(s)=0\right]ds.$$

The integrand is also continuous and positive over the compact domain $D_\epsilon$, indicating that $\mathbb{E}[M_{-\infty}]=o(1)$ for small $\epsilon$. Thus we have

$$\mathbb{E}[M_u]_{\rm adj}=\mathbb{E}[M_u]/\max(1,\mathbb{E}[M_{-\infty}])=\mathbb{E}[M_u]/\max(1,o(1))=\mathbb{E}[M_u]$$

for sufficiently small $\epsilon$. ∎

2.7 Large threshold

For a large threshold $u$, the following asymptotic result shows that the power can be precisely approximated by $\mathbb{E}[M_u]$.

Theorem 2.

For any fixed domain $D$, as $u\to\infty$,

$$\mathbb{P}[M_u\geq 1]=\mathbb{E}[M_u](1-o(e^{-\alpha u^2})),$$ (10)

where the error term $o(e^{-\alpha u^2})$ is non-negative and $\alpha>0$ is some constant.

Proof.

By Lemma 3 of Piterbarg (1996), as $u\to\infty$, the factorial moment is super-exponentially small. That is, there exists $\alpha>0$ such that

$$\mathbb{E}[M_u(M_u-1)]=o\left(e^{-\frac{u^2}{2}-\alpha u^2}\right).$$

Also,

$$\mathbb{E}[M_u]\geq\mathbb{P}[M_u\geq 1]\geq\mathbb{P}\left[\sup_{s\in D}X(s)\geq u\right]=O\left(e^{-\frac{u^2}{2}}\right).$$

Thus, the factorial moment decays exponentially faster than $\mathbb{E}[M_u]$. The result is an immediate consequence of (5). ∎

Notice that the threshold $u$ does not affect the value of $\mathbb{E}[M_{-\infty}]$, which is part of the adjusted $\mathbb{E}[M_u]$. By (10),

$$\mathbb{P}[M_u\geq 1]=\mathbb{E}[M_u]_{\rm adj}\,(1-o(e^{-\alpha u^2}))\max(1,\mathbb{E}[M_{-\infty}]).$$

If $\mathbb{E}[M_{-\infty}]>1$, the adjusted $\mathbb{E}[M_u]$ might be overly conservative for a large threshold $u$. Therefore, we only recommend $\mathbb{E}[M_u]$ for this scenario.

2.8 Sharp signal

The following theorem provides an asymptotic power approximation when the signal is sharp. Interestingly, while the power function is generally non-Gaussian, it becomes closer to Gaussian as the signal peaks become sharper.

Theorem 3.

Let $\theta(s)=ah(s)+\theta_0$, where $h(s)$ is a unimodal mean function with maximum equal to 0 at $s_0$, $a>0$, and $\theta_0$ represents the height. For any fixed threshold $u$, as $a\to\infty$,

$$\mathbb{P}[M_u\geq 1]=\mathbb{E}[M_u]+o(1)=\mathbb{E}[M_u]_{\rm adj}+o(1)=\Phi(\theta_0-u)(1+o(1)),$$ (11)

where $\Phi(x)$ is the CDF of the standard Gaussian distribution.

Proof.

By Lemma A.1 of Cheng and Schwartzman (2017), as $a\to\infty$,

$$\mathbb{P}(M_{-\infty}=1)\geq 1-O(\exp(-ca^2)),$$

where $c>0$ is some constant. Therefore $M_{-\infty}\overset{p}{\to}1$.

Since $M_u\leq M_{-\infty}$ and both take only non-negative integer values, $|M_u(M_u-1)|$ and $|M_{-\infty}(M_{-\infty}-1)|$ are bounded above by $|M(M-1)|$, where $M$ is the number of critical points of the random field $X$. Applying the Kac-Rice formula,

$$\mathbb{E}[M(M-1)]=\int_D\int_D\mathbb{E}\left[|\det\nabla^2X(s)||\det\nabla^2X(t)|\,\middle|\,\nabla X(s)=\nabla X(t)=0\right]P_{\nabla X(s),\nabla X(t)}(0,0)\,ds\,dt.$$

Denote the integrand by $g(s,t,a)$. The function $g(s,t,a)$ is continuous and positive over the compact domain $D$, and $M(M-1)\overset{p}{\to}0$ as $a\to\infty$. Thus there exists $g_0>0$ such that $\mathbb{E}[M(M-1)]\leq g_0$. Then, by the dominated convergence theorem,

$$\mathbb{E}[M_u(M_u-1)]\to 0$$

as $a\to\infty$. Since $M_{-\infty}\overset{p}{\to}1$, the adjusted $\mathbb{E}[M_u]$ satisfies

$$\mathbb{E}[M_u]_{\rm adj}=\mathbb{E}[M_u]/\max(1,\mathbb{E}[M_{-\infty}])=\mathbb{E}[M_u](1+o(1))=\mathbb{E}[M_u]+o(1).$$

To calculate $\mathbb{E}[M_u]$, apply the Kac-Rice formula:

$$\begin{aligned}\mathbb{E}[M_u]=&\int_D p_{\nabla X(s)}(0)\,\mathbb{E}\left[|\det\nabla^2X(s)|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\mathbbm{1}_{\{X(s)>u\}}\,\middle|\,\nabla X(s)=0\right]ds\\ =&\int_D\frac{1}{(2\pi)^{N/2}\sqrt{\det(\Lambda)}}\exp\left(-a^2(\nabla h(s))^T\Lambda^{-1}\nabla h(s)/2\right)\\ &\qquad\times\mathbb{E}\left[|\det(\nabla^2Z(s)+a\nabla^2h(s))|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\mathbbm{1}_{\{X(s)>u\}}\,\middle|\,\nabla X(s)=0\right]ds,\end{aligned}$$ (12)

where $\Lambda$ is the covariance matrix of $\nabla Z(s)$. Let $f(s)=(\nabla h(s))^T\Lambda^{-1}\nabla h(s)/2$, which attains its minimum 0 only at $s_0$. Similar to the proof of A.4 in Cheng and Schwartzman (2017), as $a\to\infty$, (12) can be approximated by applying Laplace's method:

$$\begin{aligned}\mathbb{E}[M_u]=&\frac{\det(a\nabla^2h(s_0))}{(2\pi)^{N/2}\sqrt{\det(\Lambda)}}\left(\frac{(2\pi)^N\det(\Lambda)}{a^{2N}\det(\nabla^2h(s_0))}\right)^{1/2}\Phi(\theta_0-u)+O(a^{-2})\\ =&\ \Phi(\theta_0-u)+O(a^{-2}).\end{aligned}$$

This finishes the proof. ∎

3 Explicit formulas

We have shown that the power for peak detection can be approximated by the expected number of local maxima above $u$, $\mathbb{E}[M_u]$, under certain scenarios such as a small domain or a large threshold. Although we can apply the Kac-Rice formula to calculate $\mathbb{E}[M_u]$, it remains difficult to evaluate explicitly for $N>1$ without making further assumptions. In this section, we focus on computing $\mathbb{E}[M_u]$ and show that a general formula can be obtained if the noise field is isotropic. Furthermore, explicit formulas for $N=1,2,3$ are derived for application purposes.

3.1 Isotropic Gaussian fields

Suppose $Z$ is a zero-mean unit-variance isotropic random field. We can write the covariance function of $Z$ as $\mathbb{E}\{Z(s)Z(t)\}=\rho(\|s-t\|^2)$ for an appropriate function $\rho(\cdot):[0,\infty)\rightarrow\mathbb{R}$. Denote

$$\rho'=\rho'(0),\quad\rho''=\rho''(0),\quad\kappa=-\rho'/\sqrt{\rho''},$$ (13)

where $\rho'$ and $\rho''$ are the first and second derivatives of the function $\rho$ at 0, respectively.

The following lemma comes from Cheng and Schwartzman (2018).

Lemma 1.

For each $s\in\mathbb{R}^N$ and $i,j,k,l\in\{1,\ldots,N\}$,

$$\begin{split}\mathbb{E}\{Z_i(s)Z(s)\}&=\mathbb{E}\{Z_i(s)Z_{jk}(s)\}=0,\\ \mathbb{E}\{Z_i(s)Z_j(s)\}&=-\mathbb{E}\{Z_{ij}(s)Z(s)\}=-2\rho'\delta_{ij},\\ \mathbb{E}\{Z_{ij}(s)Z_{kl}(s)\}&=4\rho''(\delta_{ij}\delta_{kl}+\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}),\end{split}$$

where $\rho'$ and $\rho''$ are defined in (13) and $\delta_{ij}$ is the Kronecker delta function.

In particular, it follows from Lemma 1 that ${\rm Var}(Z_i(s))=-2\rho'$ and ${\rm Var}(Z_{ii}(s))=12\rho''$ for any $i\in\{1,\ldots,N\}$, implying $\rho'<0$ and $\rho''>0$, and hence $\kappa>0$.

We can use theoretical results on Gaussian Orthogonally Invariant (GOI) matrices to make the calculation of $\mathbb{E}[M_u]$ easier. GOI matrices were first introduced in Schwartzman et al. (2008), and used for the first time in the context of random fields in Cheng and Schwartzman (2018). This is a class of Gaussian random matrices that are invariant under orthogonal transformations, and they are useful for computing the expected number of critical points of isotropic Gaussian fields. We call an $N\times N$ random matrix $G=(G_{ij})_{1\leq i,j\leq N}$ GOI with covariance parameter $c$, denoted by ${\rm GOI}(c)$, if it is symmetric and all entries are centered Gaussian variables such that

$$\mathbb{E}[G_{ij}G_{kl}]=\frac12(\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk})+c\,\delta_{ij}\delta_{kl}.$$ (14)
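As an aside, the covariance structure (14) can be realized constructively when $c\geq 0$: take a GOE-type symmetric matrix (off-diagonal variance $1/2$, diagonal variance $1$) and add a shared Gaussian multiple of the identity. A minimal Python sketch (ours, for illustration; it assumes $c\geq 0$):

```python
import numpy as np

def sample_goi(N, c, rng=None):
    # One draw of a GOI(c) matrix: symmetric Gaussian with
    # E[G_ij G_kl] = (d_ik d_jl + d_il d_jk)/2 + c d_ij d_kl  (assumes c >= 0).
    rng = np.random.default_rng() if rng is None else rng
    A = rng.normal(scale=np.sqrt(0.5), size=(N, N))
    G = (A + A.T) / np.sqrt(2.0)                 # off-diagonal var 1/2, diagonal var 1
    G += np.sqrt(c) * rng.normal() * np.eye(N)   # shared term adds c to all diagonal covariances
    return G
```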

The following lemma is Lemma 3.4 from Cheng and Schwartzman (2018).

Lemma 2.

Let the assumptions in Lemma 1 hold. Let $\widetilde G$ and $G$ be ${\rm GOI}(1/2)$ and ${\rm GOI}((1-\kappa^2)/2)$ matrices, respectively, and let $I_N$ denote the $N\times N$ identity matrix.

(i) The distribution of $\nabla^2Z(s)$ is the same as that of $\sqrt{8\rho''}\,\widetilde G$.

(ii) The distribution of $(\nabla^2Z(s)\,|\,Z(s)=z)$ is the same as that of $\sqrt{8\rho''}\big[G-\big(\kappa z/\sqrt2\big)I_N\big]$.

Lemma 2 gives the distribution and conditional distribution of the Hessian matrix of the centered random field $Z(s)$. Next, we establish the corresponding result for the non-centered random field $X(s)=Z(s)+\theta(s)$.

Lemma 3.

Let $\widetilde G$ and $G$ be ${\rm GOI}(1/2)$ and ${\rm GOI}((1-\kappa^2)/2)$ matrices, respectively.

(i) The distribution of $\nabla^2X(s)$ is the same as that of

$$\sqrt{8\rho''}\,\widetilde G+\nabla^2\theta(s).$$

(ii) The distribution of $(\nabla^2X(s)\,|\,X(s)=x)$ is the same as that of

$$\sqrt{8\rho''}\left[G-\frac{\kappa(x-\theta(s))}{\sqrt2}I_N\right]+\nabla^2\theta(s).$$

Proof.

Part (i) is a direct consequence of Lemma 2. For part (ii), note that $(\nabla^2X(s)\,|\,X(s)=x)$ is equivalent to $(\nabla^2Z(s)\,|\,Z(s)=x-\theta(s))+\nabla^2\theta(s)$, and the result follows immediately from Lemma 2. ∎

3.2 General formula under isotropy

Theorem 4.

Let $X(s)=Z(s)+\theta(s)$, where $Z(s)$ is a smooth zero-mean unit-variance isotropic Gaussian random field satisfying Assumptions 1 and 2, and let $\theta(s)$ be a smooth $C^3$ mean function such that $\nabla^2\theta(s)$ is a non-singular matrix with ordered eigenvalues $\theta''_1(s)\leq\cdots\leq\theta''_N(s)$ at all critical points $s$. Then for any domain $D$,

$$\mathbb{E}[M_u]=\bigg(\frac{2\rho''}{-\pi\rho'}\bigg)^{N/2}\int_D e^{\frac{\|\nabla\theta(s)\|^2}{4\rho'}}\int_u^\infty\phi(x-\theta(s))\,\mathbb{E}\left[|\det({\rm Matrix}(s))|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]dx\,ds,$$ (15)

where $\phi(x)$ is the PDF of the standard Gaussian distribution, ${\rm Matrix}(s)=G-\kappa(x-\theta(s))I_N/\sqrt2+{\rm diag}\{\theta''_1(s),\ldots,\theta''_N(s)\}/\sqrt{8\rho''}$, $G$ is a ${\rm GOI}((1-\kappa^2)/2)$ matrix as in Lemma 3, and $\mathbbm{1}_{\{\cdot\}}$ denotes the indicator function.

Proof.

By the Kac-Rice formula

$$\begin{aligned}\mathbb{E}[M_u]&=\int_D p_{\nabla X(s)}(0)\,\mathbb{E}\left[|\det\nabla^2X(s)|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\mathbbm{1}_{\{X(s)>u\}}\,\middle|\,\nabla X(s)=0\right]ds\\ &=\int_D p_{\nabla Z(s)+\nabla\theta(s)}(0)\,\mathbb{E}\left[|\det\nabla^2X(s)|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\mathbbm{1}_{\{X(s)>u\}}\,\middle|\,\nabla X(s)=0\right]ds\\ &=\int_D\frac{1}{(2\pi)^{N/2}(-2\rho')^{N/2}}e^{\frac{\|\nabla\theta(s)\|^2}{4\rho'}}\,\mathbb{E}\left[|\det\nabla^2X(s)|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\mathbbm{1}_{\{X(s)>u\}}\,\middle|\,\nabla X(s)=0\right]ds\\ &=\int_D\frac{(8\rho'')^{N/2}}{(2\pi)^{N/2}(-2\rho')^{N/2}}e^{\frac{\|\nabla\theta(s)\|^2}{4\rho'}}\int_u^\infty\phi(x-\theta(s))\,\mathbb{E}\left[|\det({\rm Matrix}(s))|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]dx\,ds\\ &=\int_D\bigg(\frac{2\rho''}{-\pi\rho'}\bigg)^{N/2}e^{\frac{\|\nabla\theta(s)\|^2}{4\rho'}}\int_u^\infty\phi(x-\theta(s))\,\mathbb{E}\left[|\det({\rm Matrix}(s))|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]dx\,ds\\ &=\bigg(\frac{2\rho''}{-\pi\rho'}\bigg)^{N/2}\int_D e^{\frac{\|\nabla\theta(s)\|^2}{4\rho'}}\int_u^\infty\phi(x-\theta(s))\,\mathbb{E}\left[|\det({\rm Matrix}(s))|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]dx\,ds.\end{aligned}$$

Next, we show the derivation from the third to the fourth line in the equation above. Since we assume $\nabla^2\theta(s)$ is a non-singular matrix at all critical points, there exists an orthonormal matrix, denoted by $A(s)$, such that $A(s)^T\nabla^2\theta(s)A(s)={\rm diag}\{\theta''_1(s),\theta''_2(s),\ldots,\theta''_N(s)\}$, where $\theta''_1(s)\leq\cdots\leq\theta''_N(s)$ are the ordered eigenvalues of $\nabla^2\theta(s)$. On the other hand, GOI matrices are invariant under orthonormal transformations. By Lemma 3, the conditional expectation $\mathbb{E}[|\det(\nabla^2X(s))|\mathbbm{1}_{\{\nabla^2X(s)\prec 0\}}\,|\,X(s)=x]$ is therefore

$$\begin{aligned}&=\mathbb{E}\left[\left|\det\left(\sqrt{8\rho''}\left[G-\frac{\kappa(x-\theta(s))}{\sqrt2}I_N\right]+\nabla^2\theta(s)\right)\right|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]\\ &=\mathbb{E}\left[\left|\det\left(\sqrt{8\rho''}\left[G-\frac{\kappa(x-\theta(s))}{\sqrt2}I_N\right]+A(s)^T\nabla^2\theta(s)A(s)\right)\right|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]\\ &=(\sqrt{8\rho''})^N\,\mathbb{E}\left[\left|\det\left(G-\frac{\kappa(x-\theta(s))}{\sqrt2}I_N+A(s)^T\nabla^2\theta(s)A(s)/\sqrt{8\rho''}\right)\right|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right]\\ &=(\sqrt{8\rho''})^N\,\mathbb{E}\left[\left|\det\left(G-\frac{\kappa(x-\theta(s))}{\sqrt2}I_N+{\rm diag}\{\theta''_1(s),\theta''_2(s),\ldots,\theta''_N(s)\}/\sqrt{8\rho''}\right)\right|\mathbbm{1}_{\{{\rm Matrix}(s)\prec 0\}}\right].\end{aligned}$$ (16)

∎

The expression (15) can be simplified further if we assume in addition that the mean function $\theta(s)$ is a rotationally symmetric paraboloid centered at $s_0$. In this case, the Hessian of $\theta(s)$ is the identity matrix multiplied by a constant, i.e.

$$\theta''=\theta''_1(s)=\theta''_2(s)=\cdots=\theta''_N(s).$$

Then we can write the mean function as $\theta(s)=\theta_0+\theta''\|s-s_0\|^2/2$. Define

$$\eta=\frac{\theta''}{2\kappa\sqrt{\rho''}}=\frac{\theta''}{-2\rho'}=\frac{\theta''}{{\rm Var}(Z_1(s))}$$ (17)

and

$$H(\tilde x)=\mathbb{E}_{{\rm GOI}((1-\kappa^2)/2)}^N\left[\prod_{j=1}^N\left|\lambda_j-\frac{\kappa\tilde x}{\sqrt2}\right|\mathbbm{1}_{\{\lambda_N<\frac{\kappa\tilde x}{\sqrt2}\}}\right].$$ (18)

Then $\mathbb{E}[M_u]$ can be simplified as

$$\mathbb{E}[M_u]=\bigg(\frac{2\rho''}{-\pi\rho'}\bigg)^{N/2}\int_D e^{\frac{\theta''^2\|s-s_0\|^2}{4\rho'}}\int_{\tilde u(s)}^\infty\phi(\tilde x+\eta)\,H(\tilde x)\,d\tilde x\,ds,$$ (19)

where we make the change of variable $\tilde x=x-\theta(s)-\eta$ and set $\tilde u(s)=u-\theta(s)-\eta$. Note that the parameter $\kappa$ depends on the correlation structure of $Z(s)$.
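For intuition, the expectation (18) can be approximated by Monte Carlo using the `sample_goi` sketch above: draw ${\rm GOI}((1-\kappa^2)/2)$ matrices, take their eigenvalues, and average the integrand. A sketch (ours; it assumes $\kappa\leq 1$ so that the covariance parameter $(1-\kappa^2)/2$ is non-negative, as the additive construction requires):

```python
import numpy as np

def H_mc(x_tilde, kappa, N, n_draws=20000, rng=None):
    # Monte Carlo version of (18): average of prod_j |lambda_j - k x/sqrt(2)|
    # over draws whose largest eigenvalue is below k x/sqrt(2).
    rng = np.random.default_rng(0) if rng is None else rng
    b = kappa * x_tilde / np.sqrt(2.0)
    c = (1.0 - kappa**2) / 2.0                           # GOI covariance parameter
    total = 0.0
    for _ in range(n_draws):
        lam = np.linalg.eigvalsh(sample_goi(N, c, rng))  # ascending eigenvalues
        if lam[-1] < b:
            total += np.prod(np.abs(lam - b))
    return total / n_draws
```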

3.3 Explicit formulas in 1D, 2D and 3D

In (19), a general formula for $\mathbb{E}[M_u]$ under isotropy was derived. To make the formula easier to apply in practice, we present the following results for computing it in 1D, 2D, and 3D. When $N=1$, the derivation is simple enough that we do not need additional assumptions on the mean function $\theta(s)$ beyond those in Theorem 4, and it follows directly from the Kac-Rice formula. When $N=2$ and $3$, we assume the mean function $\theta(s)$ is a rotationally symmetric paraboloid centered at $s_0$. $\mathbb{E}[M_u]$ is calculated by first obtaining explicit formulas for $H(\tilde x)$, and then plugging $H$ into (19).

Proposition 2.

Let $N=1$ and $X(s)=Z(s)+\theta(s)$, where $Z(s)$ is a smooth zero-mean unit-variance Gaussian process and $\theta(s)$ is a smooth mean function. Assume additionally that $Z(s)$ is stationary. Then

$$\mathbb{E}[M_u]=\int_D\frac{\sqrt{-2\rho'(3-\kappa^2)}}{\kappa}\,\phi\left(\frac{\theta'(s)}{\sqrt{-2\rho'}}\right)\int_u^\infty\phi(x-\theta(s))\,\psi\left(\frac{\kappa[x-\theta(s)-\eta(s)]}{\sqrt{3-\kappa^2}}\right)dx\,ds,$$ (20)

where the function $\psi$ is defined as

$$\psi(x)=\int_{-\infty}^x\Phi(y)\,dy=\phi(x)+x\,\Phi(x),\quad x\in\mathbb{R}.$$
Proof.

Since we assume that $Z(s)$ is stationary, $Z'(s)$ is independent of $Z(s)$ and $Z''(s)$, and $\rho'=-{\rm Var}(Z'(s))/2=\mathbb{E}[Z(s)Z''(s)]/2$ and $\rho''={\rm Var}(Z''(s))/12$ do not depend on $s$. Therefore,

$${\rm Var}(X(s))=1,\quad{\rm Var}(X'(s))=-{\rm Cov}[X(s),X''(s)]=-2\rho'\quad{\rm and}\quad{\rm Var}(X''(s))=12\rho''.$$

Note that, by the formula for conditional Gaussian distributions,

$$X''(s)\,\big|\,X(s)=x\ \sim\ N\!\left(\theta''(s)+2\rho'(x-\theta(s)),\ 12\rho''-4\rho'^2\right).$$

By the Kac-Rice formula

$$\begin{aligned}\mathbb{E}[M_u]&=\int_D p_{X'(s)}(0)\,\mathbb{E}[|X''(s)|\mathbbm{1}_{\{X(s)>u\}}\mathbbm{1}_{\{X''(s)<0\}}\,|\,X'(s)=0]\,ds\\ &=\int_D p_{X'(s)}(0)\int_u^\infty\phi(x-\theta(s))\int_{-\infty}^0(-x'')\frac{1}{\sqrt{12\rho''-4\rho'^2}}\,\phi\left[\frac{x''-\theta''(s)-2\rho'(x-\theta(s))}{\sqrt{12\rho''-4\rho'^2}}\right]dx''\,dx\,ds\\ &=\int_D p_{X'(s)}(0)\,\sqrt{12\rho''-4\rho'^2}\int_u^\infty\phi(x-\theta(s))\,\psi\left(\frac{-2\rho'(x-\theta(s))-\theta''(s)}{\sqrt{12\rho''-4\rho'^2}}\right)dx\,ds\\ &=\int_D\frac{\sqrt{12\rho''-4\rho'^2}}{\sqrt{-2\rho'}}\,\phi\left(\frac{\theta'(s)}{\sqrt{-2\rho'}}\right)\int_u^\infty\phi(x-\theta(s))\,\psi\left(\frac{-2\rho'(x-\theta(s))-\theta''(s)}{\sqrt{12\rho''-4\rho'^2}}\right)dx\,ds.\end{aligned}$$

The step from the second to the third line uses the fact that

$$\int_{-\infty}^0(-x)\frac1b\,\phi\left(\frac{x+a}{b}\right)dx=\int_{-\infty}^0\Phi\left(\frac{x+a}{b}\right)dx=b\int_{-\infty}^{a/b}\Phi(y)\,dy=b\,\psi\left(\frac ab\right).$$

Recalling the parameters $\kappa$ from (13) and $\eta$ from (17), we can rewrite $\mathbb{E}[M_u]$ as

$$\int_D\frac{\sqrt{-2\rho'(3-\kappa^2)}}{\kappa}\,\phi\left(\frac{\theta'(s)}{\sqrt{-2\rho'}}\right)\int_u^\infty\phi(x-\theta(s))\,\psi\left(\frac{\kappa[x-\theta(s)-\eta(s)]}{\sqrt{3-\kappa^2}}\right)dx\,ds.$$

This finishes the proof. ∎

Note that when $N=1$,

$$H(\tilde x)=\phi\left(\frac{\kappa\tilde x}{\sqrt{3-\kappa^2}}\right)+\frac{\kappa\tilde x}{\sqrt{3-\kappa^2}}\,\Phi\left(\frac{\kappa\tilde x}{\sqrt{3-\kappa^2}}\right)=\psi\left(\frac{\kappa\tilde x}{\sqrt{3-\kappa^2}}\right).$$ (21)
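As an illustration, (20) is straightforward to evaluate numerically once $\psi$ is coded. A Python sketch (ours; the function names and the nested quadrature are our assumptions, not the paper's implementation):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def psi(x):
    # psi(x) = phi(x) + x * Phi(x), the antiderivative of the standard normal CDF
    return norm.pdf(x) + x * norm.cdf(x)

def e_mu_1d(u, theta, dtheta, ddtheta, rho1, rho2, a, b):
    # Numerical evaluation of (20) on D = [a, b].
    # rho1 = rho'(0) < 0 and rho2 = rho''(0) > 0, as in (13).
    kappa = -rho1 / np.sqrt(rho2)
    def inner(s):
        eta_s = ddtheta(s) / (-2.0 * rho1)                 # eta(s), cf. (17)
        pref = np.sqrt(-2.0 * rho1 * (3.0 - kappa**2)) / kappa \
               * norm.pdf(dtheta(s) / np.sqrt(-2.0 * rho1))
        val, _ = quad(lambda x: norm.pdf(x - theta(s))
                      * psi(kappa * (x - theta(s) - eta_s) / np.sqrt(3.0 - kappa**2)),
                      u, np.inf)
        return pref * val
    val, _ = quad(inner, a, b)
    return val
```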

We need the following lemmas to calculate $H(\tilde x)$ explicitly when $N=2$ and $N=3$. They follow from direct calculation using integration by parts.

Lemma 4.

Let $N=2$. For constants $a>-\frac12$ and $b\in\mathbb{R}$,

$$\begin{split}&\int_{\mathbb{R}^2}\exp\bigg\{-\frac12\sum_{i=1}^2\lambda_i^2-\frac a2\Big(\sum_{i=1}^2\lambda_i\Big)^2\bigg\}\left(\prod_{i=1}^2|\lambda_i-b|\right)|\lambda_1-\lambda_2|\,\mathbbm{1}_{\{\lambda_1<\lambda_2<b\}}\,d\lambda_1\,d\lambda_2\\ &=\frac{\sqrt{2\pi}}{\sqrt{1+a}}e^{-\frac{1+2a}{2(1+a)}b^2}\Phi\left(\frac{1+2a}{\sqrt{1+a}}b\right)+\left(2b^2-\frac{1+4a}{1+2a}\right)\frac{\sqrt\pi}{\sqrt{1+2a}}\Phi\big(\sqrt{2(1+2a)}\,b\big)+\frac{b}{1+2a}e^{-(1+2a)b^2}.\end{split}$$ (22)
Lemma 5.

Let $N=3$. For constants $a>0$ and $b\in\mathbb{R}$,

$$\begin{aligned}&\int_{\mathbb{R}^3}\exp\bigg\{-\frac12\sum_{i=1}^3\lambda_i^2+\frac{a}{2(2+3a)}\Big(\sum_{i=1}^3\lambda_i\Big)^2\bigg\}\left(\prod_{i=1}^3|\lambda_i-b|\right)\prod_{1\leq i<j\leq 3}|\lambda_i-\lambda_j|\,\mathbbm{1}_{\{\lambda_1<\lambda_2<\lambda_3<b\}}\,d\lambda_1\,d\lambda_2\,d\lambda_3\\ &=\left[\frac{a^3+6a^2+12a+24}{2(a+2)^2}b^2+\frac{2a^3+3a^2+6a}{4(a+2)}+\frac32\right]\frac{1}{\sqrt{\pi(a+2)}}e^{-\frac{b^2}{a+2}}\Phi\left(\frac{2\sqrt2\,b}{\sqrt{(a+2)(3a+2)}}\right)\\ &\quad+\left[\frac{a+1}{2}b^2+\frac{a^2-a}{2}-1\right]\frac{1}{\sqrt{\pi(a+1)}}e^{-\frac{b^2}{a+1}}\Phi\left(\frac{\sqrt2\,b}{\sqrt{(a+1)(3a+2)}}\right)\\ &\quad+\left(a+6+\frac{3a^3+12a^2+28a}{2(a+2)}\right)\frac{b}{2\pi(a+2)\sqrt{3a+2}}\,e^{-\frac{3b^2}{3a+2}}\\ &\quad+b\left[b^2+\frac{3(a-1)}{2}\right]\left[\Phi_{\Sigma_1}(0,b)+\Phi_{\Sigma_2}(0,b)\right],\end{aligned}$$

where

$$\Sigma_1=\begin{pmatrix}\frac32&-1\\ -1&\frac{a+2}{2}\end{pmatrix},\quad\Sigma_2=\begin{pmatrix}\frac32&-\frac12\\ -\frac12&\frac{a+1}{2}\end{pmatrix}.$$
Proposition 3.

Let $N=2$ and let the assumptions in Theorem 4 hold. Then the function $H$ defined in (18) can be written explicitly as

$$H(\tilde x)=\frac{\sqrt{2\pi}}{\sqrt{3-\kappa^2}}\,\phi\left(\frac{\kappa\tilde x}{\sqrt{3-\kappa^2}}\right)\Phi\left(\frac{\kappa\tilde x}{\sqrt{(2-\kappa^2)(3-\kappa^2)}}\right)+\frac{\kappa^2}{2}(\tilde x^2-1)\,\Phi\left(\frac{\kappa\tilde x}{\sqrt{2-\kappa^2}}\right)+\frac{\kappa\sqrt{2-\kappa^2}\,\tilde x}{2}\,\phi\left(\frac{\kappa\tilde x}{\sqrt{2-\kappa^2}}\right).$$ (23)
Proof.

By Lemma 4 above with $a=(\kappa^2-1)/(4-2\kappa^2)$ and $b=\kappa\tilde x/\sqrt2$, and Lemma 2.2 in Cheng and Schwartzman (2018),

$$\begin{aligned}&\mathbb{E}_{{\rm GOI}((1-\kappa^2)/2)}^N\left[\prod_{j=1}^N\left|\lambda_j-\frac{\kappa\tilde x}{\sqrt2}\right|\mathbbm{1}_{\{\lambda_N<\frac{\kappa\tilde x}{\sqrt2}\}}\right]\\ &=\frac{1}{2\sqrt{\pi(2-\kappa^2)}}\int_{\mathbb{R}^2}\exp\bigg\{-\frac12(\lambda_1^2+\lambda_2^2)+\frac{1-\kappa^2}{4(2-\kappa^2)}(\lambda_1+\lambda_2)^2\bigg\}(\lambda_2-\lambda_1)\left|\lambda_1-\frac{\kappa\tilde x}{\sqrt2}\right|\left|\lambda_2-\frac{\kappa\tilde x}{\sqrt2}\right|\mathbbm{1}_{\{\lambda_1<\lambda_2<\frac{\kappa\tilde x}{\sqrt2}\}}\,d\lambda_1\,d\lambda_2\\ &=\frac{1}{\sqrt{3-\kappa^2}}e^{-\frac{\kappa^2\tilde x^2}{2(3-\kappa^2)}}\Phi\left(\frac{\kappa\tilde x}{\sqrt{(2-\kappa^2)(3-\kappa^2)}}\right)+\frac{\kappa^2}{2}(\tilde x^2-1)\,\Phi\left(\frac{\kappa\tilde x}{\sqrt{2-\kappa^2}}\right)+\frac{\kappa\sqrt{2-\kappa^2}\,\tilde x}{2\sqrt{2\pi}}e^{-\frac{\kappa^2\tilde x^2}{2(2-\kappa^2)}}.\end{aligned}$$ (24)

This simplifies to (23). ∎
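For reference, (23) translates directly into code; this sketch (ours) can be cross-checked against the Monte Carlo estimator `H_mc` above with $N=2$:

```python
import numpy as np
from scipy.stats import norm

def H_2d(x_tilde, kappa):
    # Closed-form H(x~) for N = 2, term by term as in (23).
    k, x = kappa, x_tilde
    t1 = np.sqrt(2.0 * np.pi) / np.sqrt(3.0 - k**2) \
         * norm.pdf(k * x / np.sqrt(3.0 - k**2)) \
         * norm.cdf(k * x / np.sqrt((2.0 - k**2) * (3.0 - k**2)))
    t2 = 0.5 * k**2 * (x**2 - 1.0) * norm.cdf(k * x / np.sqrt(2.0 - k**2))
    t3 = 0.5 * k * np.sqrt(2.0 - k**2) * x * norm.pdf(k * x / np.sqrt(2.0 - k**2))
    return t1 + t2 + t3
```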

Proposition 4.

Let $N=3$ and let the assumptions in Theorem 4 hold. Then the function $H$ defined in (18) can be written explicitly as

$$\begin{aligned}H(\tilde x)&=\Bigg[\frac{\kappa^2\left[(1-\kappa^2)^3+6(1-\kappa^2)^2+12(1-\kappa^2)+24\right]}{4(3-\kappa^2)^2}\tilde x^2\\ &\qquad+\frac{2(1-\kappa^2)^3+3(1-\kappa^2)^2+6(1-\kappa^2)}{4(3-\kappa^2)}+\frac32\Bigg]\frac{\phi\big(\frac{\kappa\tilde x}{\sqrt{3-\kappa^2}}\big)}{\sqrt{\pi(3-\kappa^2)}}\,\Phi\left(\frac{2\kappa\tilde x}{\sqrt{(3-\kappa^2)(5-3\kappa^2)}}\right)\\ &\quad+\left[\frac{\kappa^2(2-\kappa^2)}{4}\tilde x^2-\frac{\kappa^2(1-\kappa^2)}{2}-1\right]\frac{\phi\big(\frac{\kappa\tilde x}{\sqrt{2-\kappa^2}}\big)}{\sqrt{\pi(2-\kappa^2)}}\,\Phi\left(\frac{\kappa\tilde x}{\sqrt{(2-\kappa^2)(5-3\kappa^2)}}\right)\\ &\quad+\left[7-\kappa^2+\frac{(1-\kappa^2)\left[3(1-\kappa^2)^2+12(1-\kappa^2)+28\right]}{2(3-\kappa^2)}\right]\frac{\kappa\tilde x\,\phi\big(\sqrt{\frac{3}{5-3\kappa^2}}\,\kappa\tilde x\big)}{2\sqrt2\,\pi(3-\kappa^2)\sqrt{5-3\kappa^2}}\\ &\quad+\frac{\kappa^3}{2\sqrt2}\,\tilde x(\tilde x^2-3)\left[\Phi_{\Sigma_1}(0,\kappa\tilde x/\sqrt2)+\Phi_{\Sigma_2}(0,\kappa\tilde x/\sqrt2)\right],\end{aligned}$$

where

$$\Sigma_1=\begin{pmatrix}\frac32&-1\\ -1&\frac{3-\kappa^2}{2}\end{pmatrix},\quad\Sigma_2=\begin{pmatrix}\frac32&-\frac12\\ -\frac12&\frac{2-\kappa^2}{2}\end{pmatrix}.$$
Proof.

This is a direct result of Lemma 5 above with $a=1-\kappa^2$ and $b=\kappa\tilde x/\sqrt2$. ∎

Note that for $N=2$ and $N=3$, we still need to compute an integral over the domain (see (19)) to get $\mathbb{E}[M_u]$. Although we cannot derive an explicit form for the entire formula, it can be evaluated in applications with the help of numerical algorithms.
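As one way to carry out this numerical step, (19) for $N=2$ on a disk reduces to a double quadrature in polar coordinates. A sketch (ours; it reuses `H_2d` above and assumes the paraboloid mean $\theta(s)=\theta_0+\theta''\|s-s_0\|^2/2$ with $\theta''<0$):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def e_mu_2d(u, theta0, theta2, rho1, rho2, R):
    # Evaluate (19) for N = 2 over a disk of radius R centered at s0,
    # with theta(s) = theta0 + theta2 * ||s - s0||^2 / 2  (theta2 = theta'').
    kappa = -rho1 / np.sqrt(rho2)
    eta = theta2 / (-2.0 * rho1)                 # eta as in (17)
    pref = 2.0 * rho2 / (-np.pi * rho1)          # (2 rho'' / (-pi rho'))^{N/2} with N = 2
    def radial(r):
        theta_s = theta0 + theta2 * r**2 / 2.0
        u_t = u - theta_s - eta                  # u~(s)
        val, _ = quad(lambda xt: norm.pdf(xt + eta) * H_2d(xt, kappa), u_t, np.inf)
        return np.exp(theta2**2 * r**2 / (4.0 * rho1)) * val * r
    val, _ = quad(radial, 0.0, R)
    return pref * 2.0 * np.pi * val              # polar coordinates: ds = 2*pi*r dr
```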

3.4 Isotropic unimodal mean function

We have calculated the explicit formulas assuming the mean function is a concave paraboloid, which is a strong assumption. However, in the general setting where the unimodal mean function is rotationally symmetric of any shape, we can apply a multivariate Taylor expansion at the peak and use the second-order approximation to estimate power. For example, suppose the shape of the mean function is proportional to a rotationally symmetric Gaussian density:

$$\theta(s)=\theta_0\exp\left(-\frac{\|s-s_0\|^2}{2\xi^2}\right),$$ (25)

where $s_0$ is the center of the mean function and $\xi$ is the signal bandwidth. The Taylor expansion at the center is

$$\theta(s)=\theta_0-\frac{\theta_0}{2\xi^2}\|s-s_0\|^2+o(\|s-s_0\|^2).$$ (26)

When the domain size is small, we neglect the remainder term and use the quadratic approximation as the mean function. With a quadratic mean function, it becomes convenient to compute $\mathbb{E}[M_u]$. We will evaluate the performance of this approach for different domain sizes in the simulation study.
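In code, the second-order approximation (26) amounts to replacing the Gaussian bump by a paraboloid with curvature $\theta''=-\theta_0/\xi^2$; a one-line sketch (ours) that feeds directly into a routine such as `e_mu_2d` above:

```python
def quadratic_approx_params(theta0, xi):
    # (26): theta(s) ~ theta0 + theta2 * ||s - s0||^2 / 2 with theta2 = -theta0 / xi^2.
    return theta0, -theta0 / xi**2   # (peak height, curvature theta'')
```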

4 Simulations

In Section 2 above, we discussed power approximation under different scenarios. We showed that the factorial moment $\mathbb{E}[M_u(M_u-1)]$ decays faster than $\mathbb{E}[M_u]$ under some circumstances, so that we can use $\mathbb{E}[M_u]$ or the adjusted $\mathbb{E}[M_u]$ to approximate power. In this section, a simulation study is conducted to validate each scenario as well as to visualize the power function, $\mathbb{E}[M_u]$, and the adjusted $\mathbb{E}[M_u]$. The simulation also gives a better sense of how to apply them to real data.

4.1 Paraboloidal mean function

We generate $B=100{,}000$ centered, unit-variance, smooth isotropic 2D Gaussian random fields over a grid of size $50\times50$ pixels as $Z(s)$, each field obtained as the convolution of white Gaussian noise with a Gaussian kernel of spatial standard deviation 5, and normalized to standard deviation $\sigma=1$ (a code sketch of this construction is given after Table 1). For the mean function $\mu(s)$, we use a concave paraboloid centered at $s_0=(25,25)$. The equation of the paraboloid is

$$\theta(s)=\theta_0-\frac{\|s-s_0\|^2}{2\xi^2},$$ (27)

where $\xi$ controls the sharpness of the mean function: the smaller $\xi$, the sharper the paraboloid. $\theta_0$ controls the height of the signal. To maintain the rotationally symmetric property of $\theta(s)$, we only consider circles centered at $s_0$ as the domain $D$. The size of $D$ is measured by its radius ${\rm Rad}(D)$. The default value of each parameter is listed in Table 1 and used unless otherwise specified.

Parameter | Default value
${\rm Rad}(D)$ | 10
$\xi$ | 7
$\theta_0$ | 3
Table 1: 2D simulation: default value of each parameter.
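A minimal sketch of this noise-generation step (ours, in Python rather than the MATLAB used in the paper; the rescaling uses the squared-kernel mass, and boundary effects of the convolution are ignored):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_noise_field(size=50, kernel_sd=5.0, rng=None):
    # Z(s): white Gaussian noise smoothed by a Gaussian kernel (sd = 5 pixels),
    # rescaled so each pixel has approximately unit variance.
    rng = np.random.default_rng() if rng is None else rng
    # pointwise variance after smoothing = sum of squared kernel weights
    delta = np.zeros((size, size))
    delta[size // 2, size // 2] = 1.0
    kernel = gaussian_filter(delta, sigma=kernel_sd)
    scale = np.sqrt((kernel ** 2).sum())
    z = gaussian_filter(rng.standard_normal((size, size)), sigma=kernel_sd)
    return z / scale
```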

The first two panels of Figure 1 display an instance of $\theta(s)$ and $Z(s)$, respectively. The third panel displays the resulting sum $X(s)$, computed from the signal-plus-noise model.

In the simulation, we validate and visualize the scenarios presented above, and check the effect of different choices of parameters on the power function, $\mathbb{E}[M_u]$ and the adjusted $\mathbb{E}[M_u]$. Four different scenarios are considered, as discussed in Section 2:

  1. Height equivariance

  2. Small domain size ${\rm Rad}(D)$

  3. Large threshold $u$

  4. Sharp signal (small $\xi$)

Figure 1: 2D simulation: a single instance of $\theta(s)$, $Z(s)$ and the resulting $X(s)$. (a) Mean function $\theta(s)$; (b) Noise $Z(s)$; (c) Data $X(s)$.

For each simulated random field, we record the heights of its local maxima, if at least one exists, and then for any threshold $u$ we calculate the empirical estimates of the detection power (1) and of $\mathbb{E}[M_u]$:

$$\begin{aligned}\hat{\mathbb{P}}[M_u\geq 1]&=\frac1B\sum_{i=1}^B\mathbbm{1}(\exists\text{ a peak in }D\text{ with height}>u\text{ for the }i\text{th simulated sample}),\quad(28)\\ \hat{\mathbb{E}}[M_u]&=\frac1B\sum_{i=1}^B\#\{\text{peaks in }D\text{ with height}>u\text{ for the }i\text{th simulated sample}\}.\quad(29)\end{aligned}$$
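A sketch of these two estimators (ours; it finds interior local maxima with a 3×3 maximum filter, a choice we assume rather than one specified in the paper):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def peak_heights(x):
    # Heights of interior local maxima of a 2D field (8-neighbour rule).
    interior = np.zeros_like(x, dtype=bool)
    interior[1:-1, 1:-1] = True
    is_peak = (x == maximum_filter(x, size=3)) & interior
    return x[is_peak]

def empirical_estimates(fields, u):
    # Monte Carlo estimates (28) and (29) over B simulated samples.
    n_detect, n_peaks = 0, 0
    for x in fields:
        h = peak_heights(x)
        n_detect += bool(np.any(h > u))
        n_peaks += int(np.sum(h > u))
    B = len(fields)
    return n_detect / B, n_peaks / B   # P_hat[M_u >= 1], E_hat[M_u]
```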

Figure 2 displays the power, $\mathbb{E}[M_u]$ and adjusted $\mathbb{E}[M_u]$ curves under the four scenarios. The first panel validates scenario 1 (height equivariance): as stipulated by Proposition 1, the power, $\mathbb{E}[M_u]$ and adjusted $\mathbb{E}[M_u]$ curves are parallel for different signal heights $\theta_0$ when the other parameters remain the same. In the second panel, both the $\mathbb{E}[M_u]$ and adjusted $\mathbb{E}[M_u]$ curves are close to the power curve under scenario 2 (small domain), which indicates that for a smaller domain (quantified by ${\rm Rad}(D)$), using $\mathbb{E}[M_u]$ and adjusted $\mathbb{E}[M_u]$ to approximate power becomes more accurate, as stipulated by Theorem 1. We can also see in all three panels that the $\mathbb{E}[M_u]$ curve converges to the power curve as $u$ increases, as stipulated by Theorem 2. The adjusted $\mathbb{E}[M_u]$ curve also converges to the power curve, but at a slower rate than $\mathbb{E}[M_u]$. The third panel shows that the power, $\mathbb{E}[M_u]$ and adjusted $\mathbb{E}[M_u]$ curves all converge to the Gaussian CDF for a sharp signal (small $\xi$), as stipulated by Theorem 3.

Figure 2: 2D simulation: power approximation using 𝔼[Mu] under four different scenarios (scenario 3 is displayed in all three panels) when the mean function is quadratic. (a) Height equivariance; (b) small domain size; (c) sharp signal.

4.2 Constant mean function

When the mean function θ(s) is constant, i.e., it does not depend on the location s, X(s) reduces to a centered isotropic Gaussian random field. Within the context of this paper, θ(s) = 0 can be seen as the null hypothesis, under which the power function becomes the probability of a Type I error. We use the peak height distribution under θ(s) = 0 (Cheng and Schwartzman, 2018) to choose the threshold so that the test attains the nominal Type I error level. The simulation results for θ(s) = 0 are displayed in Figure 3.

The Type I error approximation when the mean function is 0 behaves similarly to what we found for a quadratic mean function (scenario 4 is omitted since the shape parameter does not exist when the mean function is constant). The conclusion is that, for large u (which is required anyway to control the Type I error) or a small domain, the Type I error approximation is accurate.

Figure 3: 2D simulation: Type I error approximation using 𝔼[Mu]. (a) Height equivariance; (b) small domain size.

4.3 Gaussian mean function

The simulation results under a Gaussian mean function are displayed in Figure 4. For scenarios 2 and 3, the results are consistent with those under a quadratic mean. For scenario 1, since θ0 controls both the signal height and the sharpness, the power, 𝔼[Mu], and adjusted 𝔼[Mu] are no longer equivariant in θ0. For scenario 4, comparing Figure 4(c) with Figure 4(d) shows that, as the signal becomes sharper, the power, 𝔼[Mu], and adjusted 𝔼[Mu] curves converge to the Gaussian CDF only when the domain size (quantified by Rad(D)) is small. Otherwise, the limiting curve is a mixture of the Gaussian CDF and 𝔼[Mu] under a constant mean. This is due to the shape of the Gaussian density, which decays to 0 as the domain expands.

In conclusion, for a Gaussian mean function, we recommend applying our method to approximate power only when the domain size is small.

Figure 4: 2D simulation: power approximation using 𝔼[Mu] when the mean function is Gaussian. (a) Height equivariance; (b) small domain size; (c) sharp signal (Rad(D) = 3); (d) sharp signal (Rad(D) = 20).

5 Estimation from data

To use our power approximation formula in real peak detection problems, we need to estimate the spatial covariance function of the noise as well as the mean function from the data. In this section, we describe the 3D application setting and how to estimate these two functions. Consider an imaging dataset with n subjects, and let Yi(s) represent the signal plus noise for subject i:

Y_{i}(s) = \mu(s) + \sigma(s)\,\varepsilon_{i}(s)

where s = (s1, s2, s3)′ ∈ ℝ³, and the signal μ(s), standard deviation σ(s), and noise εi(s) are assumed to be smooth C³ functions. Computing the standardized mean over all n subjects yields the standardized random field

X(s) = \bar{Y}(s)/\mathrm{SE}(\bar{Y}(s)) = \sqrt{n}\,\mu(s)/\sigma(s) + \sqrt{n}\,\bar{\varepsilon}(s) \qquad (30)

This standardized random field X(s) has constant standard deviation 1. We treat √n μ(s)/σ(s) as the new signal and √n ε̄(s) as the new noise of the standardized field, and write θ(s) = √n μ(s)/σ(s) and Z(s) = √n ε̄(s). We propose the following methods to estimate the new signal and noise.
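A minimal sketch of this standardization, assuming the subject images are stacked along the first axis of a NumPy array (the function name is ours):

```python
import numpy as np

def standardize(Y):
    """Y has shape (n, ...) holding the images Y_i(s) for n subjects.
    Returns X(s) = Ybar(s) / SE(Ybar(s)) as in (30)."""
    n = Y.shape[0]
    ybar = Y.mean(axis=0)
    se = Y.std(axis=0, ddof=1) / np.sqrt(n)   # voxelwise standard error
    return ybar / se
```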

5.1 Estimation of the noise spatial covariance function

We consider the noise Z(s)Z(s) to be constructed by convolving Gaussian white noise with a kernel:

Z(s) = \int_{\mathbb{R}^{N}} K(t - s)\, dB(t) \qquad (31)

where K(·) is an N-dimensional kernel function and dB(t) is Gaussian white noise. We assume the kernel is rotationally symmetric so that the noise Z(s) is isotropic. Under model (31), the noise can be simulated once the kernel function has been estimated from the data.

It can be shown that the autocorrelation of Z(s)Z(s) is the convolution of the kernel with itself:

\mathrm{Cor}(Z(s), Z(s')) = \int_{\mathbb{R}^{N}} K(t-s)K(t-s')\,dt = \int_{\mathbb{R}^{N}} K(t-(s-s'))K(t)\,dt.

By the convolution theorem, convolution in the original domain corresponds to point-wise multiplication in the Fourier domain. The kernel function can therefore be estimated empirically as follows (a code sketch is given after the list):

  1. Determine a location s0 of interest (e.g., the center of the peak), and calculate the empirical correlation vectors between Y(s0) and Y(s), where s lies on the three orthogonal axes centered at s0 and belongs to a subdomain of interest.

  2. Average the three estimated correlation vectors (forcing the noise to be isotropic) and apply the Fourier transform.

  3. Take the square root of the Fourier coefficients; the estimated kernel function is then obtained by the inverse Fourier transform.
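A hedged 1D sketch of steps 2-3, assuming NumPy's FFT routines; clipping negative Fourier coefficients and normalizing the kernel are our own choices to keep the sketch numerically stable.

```python
import numpy as np

def estimate_kernel(corr):
    """Recover a 1D kernel profile from a symmetrized correlation vector
    (the output of step 2), centered at the middle of the array."""
    f = np.fft.rfft(np.fft.ifftshift(corr))     # step 2: Fourier transform
    f = np.sqrt(np.clip(f.real, 0.0, None))     # step 3: sqrt of coefficients
    k = np.fft.fftshift(np.fft.irfft(f, n=corr.size))
    return k / k.sum()                          # normalization (our choice)
```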

5.2 Estimation of the mean function

Our explicit formulas are derived assuming the Hessian of the mean function is a constant multiple of the identity matrix. We therefore seek a rotationally symmetric paraboloid θ̂(s) that best represents the mean function:

\theta(s) = \beta_{0} + \beta_{1}\|s\|^{2} + (\beta_{2}, \beta_{3}, \beta_{4}) \cdot s \qquad (32)

where the dot denotes the vector inner product in ℝ³. All quadratic terms share the same coefficient because of the rotational symmetry. To estimate (32), we fit a linear regression using all values of X(s) within the subdomain as the outcome.
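A minimal sketch of this least-squares fit, assuming NumPy (the helper name is ours):

```python
import numpy as np

def fit_paraboloid(x, coords):
    """Least-squares fit of model (32).
    x: field values X(s) in the subdomain, shape (m,);
    coords: the m corresponding 3D locations s, shape (m, 3)."""
    sq = np.sum(coords ** 2, axis=1)                     # ||s||^2
    A = np.column_stack([np.ones_like(sq), sq, coords])  # design matrix
    beta, *_ = np.linalg.lstsq(A, x, rcond=None)
    return beta                                          # (b0, b1, b2, b3, b4)
```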

6 A 3D real data example

As an application, we illustrate the methods in a group analysis of fMRI data from the Human Connectome Project (HCP) (Van Essen et al., 2012). The data are used here to obtain realistic 3D signal and noise parameters for 3D simulations and to evaluate the performance of our power approximation formulas. The dataset contains fMRI brain scans of 80 subjects; for each subject, the fMRI image has size 91×109×91 voxels. The mean and standard deviation of the data are displayed in Figures 5(a) and 5(b).

Figure 5: HCP data: mean, standard deviation of the data, and standardized mean of the smoothed data (transverse slice at the peak of the image along the third dimension). (a) Mean of the data; (b) standard deviation of the data; (c) standardized mean of the smoothed data. The blue box represents the subdomain of the peak and the red box represents the subdomain used to estimate the noise spatial covariance function.

6.1 Data preprocessing

Gaussian kernel smoothing is applied to the dataset to make the mean function unimodal around the peak and to increase the signal-to-noise ratio. The standard deviation of the smoothing kernel in this example is 1 voxel, which corresponds to a full width at half maximum (FWHM) of about 2.355 voxels. Figure 5(b) makes clear that the standard deviation of the noise is not constant across locations, so we use the transformation described in Section 5 to standardize the smoothed data before analyzing it. The standardized mean of the smoothed data, X(s), is displayed in Figure 5(c).

6.2 Estimation of the autocorrelation and mean functions

After standardizing the data, the next step is to estimate the mean and kernel functions using the methods described in Section 5. A 15×15×15 subdomain (the red box in Figure 5(c)) around the peak is used to estimate the kernel. Since we assume the noise is isotropic, the correlation around the peak should be strictly symmetric along each dimension; this is not always the case in real data. To address this, for each of the three dimensions we save the correlation Cor(X(s), X(s0)) around the peak s0 as a vector, create a second vector by flipping the first, and take the mean of the two, which is guaranteed to be symmetric. The empirical correlation after this symmetrization and the corresponding estimated kernel are displayed in Figure 6.

Figure 6: The empirical correlation after symmetrization and the estimated kernel from a subdomain of the HCP data.
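A one-line sketch of this symmetrization for a 1D correlation profile, assuming NumPy:

```python
import numpy as np

def symmetrize(corr):
    """Average a correlation profile with its mirror image so the result
    is exactly symmetric about the central (peak) entry."""
    return 0.5 * (corr + corr[::-1])
```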

We consider two approaches to estimating the mean function: nonparametric and parametric. The nonparametric estimate is the voxelwise average over subjects,

\hat{\theta}(s) = \frac{1}{n}\sum_{i=1}^{n} X_{i}(s) = \bar{X}(s). \qquad (33)

The parametric estimate is obtained by fitting the linear regression model (32), using all observed values of X(s) within a subdomain of size 6×6×6 (the blue box in Figure 5(c)) as the outcome and the corresponding location variables ‖s‖² and s as covariates. The least squares estimate of the mean is

\hat{\theta}(s) = 13.03 - 0.26\|s\|^{2} + (0.20, 0.11, 0.39) \cdot s \qquad (34)

We will compare the simulated power and 𝔼[Mu] when the mean function is estimated by the nonparametric approach (33) versus the parametric approach (34).

6.3 3D Simulation induced by data

We have carried out several simulation studies in a well-controlled 2D setting where the formulas are expected to work well, but ultimately we want to apply the formulas to real-life data, which are more complicated. Moreover, fMRI images are inherently 3D. For these reasons, a 3D simulation study induced by real data is needed to validate the performance of the formulas in a more realistic setting.

In the previous two subsections we studied the signal and noise of the HCP data. For the simulation, we generate 3D images using the estimated mean and kernel functions. The noise field is generated by convolving the estimated kernel (displayed in Figure 6) with Gaussian white noise; for each simulation setting, 10,000 such noise fields are generated.
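A minimal sketch of this noise generator, assuming SciPy's FFT-based convolution; rescaling to unit variance by the empirical standard deviation is our choice.

```python
import numpy as np
from scipy.signal import fftconvolve

def simulate_noise(kernel, shape, rng):
    """One isotropic 3D noise field: the estimated kernel convolved with
    Gaussian white noise, as in model (31), rescaled to unit variance."""
    w = rng.standard_normal(shape)
    z = fftconvolve(w, kernel, mode="same")
    return z / z.std()
```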

The signal in the standardized data is very strong (see Figure 5(c)). For illustrative purposes, we weaken the signal by scaling down the estimated mean function (34) while maintaining its shape. Signal strength is measured by effect size, and the amount of scaling is determined by the desired effect size. In a traditional t-test or z-test, Cohen's d values of 0.2, 0.5, and 0.8 (corresponding to the 0.58, 0.69, and 0.79 quantiles of the standard Gaussian distribution) are considered small, medium, and large effect sizes (Cohen, 1988). The peak height distribution under the null hypothesis (zero mean function), displayed in Figure 7, does not follow a Gaussian distribution. We therefore take the 0.58, 0.69, and 0.79 quantiles of the null distribution minus its mean as the small (0.16), medium (0.40), and large (0.65) effect sizes (the black dash-dot lines in Figure 7). For simplicity, we treat the peak height of the mean function as the effect size in this simulation, although this is not the most accurate definition of effect size in the peak detection setting; more details are discussed in Section 7.2. The threshold u for peak detection is chosen as the 0.99 quantile of the peak height distribution under the null (≈3.42), following Cheng and Schwartzman (2017) (the red dashed line in Figure 7).

Figure 7: 3D simulation induced by data: simulated peak height distribution under the null (zero mean) with the different levels of effect size and the threshold u.

As in the 2D simulation, the search domain D is a sphere centered at the true peak, and we use the radius of D to control the domain size. The signal sharpness is fixed since it is estimated from the data. The empirical power and 𝔼[Mu] are computed using (28) and (29).

To derive the explicit formulas for 𝔼[Mu], we assume the mean function is a rotationally symmetric paraboloid, and this assumption may introduce some bias in applications. Figure 8 compares the simulated power and 𝔼[Mu] when the mean function is estimated nonparametrically (33) versus parametrically (34); the quadratic approximation has only a small impact on power and 𝔼[Mu] in this example. Figure 9 validates our theoretical formula (19) for 𝔼[Mu] as well as the adjusted 𝔼[Mu]: the theoretical curves closely mirror the empirical ones. The figure also shows that the power approximation using 𝔼[Mu] is accurate for large u, as stipulated by Theorem 2. Power curves for three effect sizes, comparing large and small domain sizes, are displayed in Figure 10. The approximation works well for both small and large sample sizes, and 𝔼[Mu]adj provides a conservative determination of the sample size when 𝔼[Mu] exceeds 1. The power approximation using 𝔼[Mu] also improves as the domain size shrinks, as stipulated by Theorem 7.

Figure 8: 3D simulation induced by data (Rad(D) = 3, medium effect size): simulated power and 𝔼[Mu] when the mean function is obtained from raw data versus the quadratic estimate (34). Here 𝔼[Mu] and adjusted 𝔼[Mu] coincide since 𝔼[M−∞] < 1.

Figure 9: 3D simulation induced by data (Rad(D) = 6, medium effect size): simulated versus theoretical 𝔼[Mu] and adjusted 𝔼[Mu].

Figure 10: 3D simulation induced by data: power curves when the signal has small, medium, and large effect size, comparing large and small domain sizes. Panels (a)-(c): small, medium, and large effect size with a large domain; panels (d)-(f): the same effect sizes with a small domain.

7 Discussion

7.1 Explicit formulas and approximations

Calculating the power of peak detection (1) has been a difficult problem in random field theory due to the lack of a formula that computes it directly. In this paper, we have discussed the rationale for using 𝔼[Mu] and 𝔼[Mu]adj to approximate peak detection power under different scenarios, and we have derived formulas to compute 𝔼[Mu] assuming isotropy. Isotropy allows us to use the GOI matrix (Cheng and Schwartzman, 2018) as a tool to calculate 𝔼[Mu] via the Kac-Rice formula.

We also derived explicit formulas for H(x̃) (defined in (18)) for N = 1, 2, 3, assuming the mean function is a paraboloid. Computing H(x̃) involves the probability density function of the eigenvalues of GOI matrices; details can be found in the proofs of Propositions 2, 3, and 4. 𝔼[Mu] is then calculated by plugging H(x̃) into (19). The integral in (19), however, cannot be evaluated explicitly; in practice, one may evaluate it numerically. For higher dimensions (N > 3), an explicit form of H(x̃) remains out of reach because, as Propositions 2, 3, and 4 suggest, the integration becomes rapidly more complicated as N grows.

7.2 Effect size

We want to emphasize that the power depends on both the signal strength parameter θ0 and the shape parameter η. A traditional z-test or t-test of the single null hypothesis that the mean equals 0 has detection power depending on a single parameter, the effect size. Here, in contrast, the test is conditional on the point being a local maximum. Applying a simple z-test or t-test, one would reject the null hypothesis whenever the peak height θ0 exceeds a pre-specified threshold; this is inaccurate because the peak height does not follow a Gaussian or t distribution. To address this, the threshold can be determined from the null distribution of the peak height (Cheng and Schwartzman, 2018), controlling the Type I error at the nominal level. Even then, a power calculation based only on peak height is biased, since the true effect size depends on both the signal height and the curvature: the height of the peak affects the likelihood of exceeding the threshold, while the curvature affects the likelihood that such a peak exists in the domain. It follows that a sharp, high peak is easier to detect than a flat, low one, yielding larger detection power.

For an interpretation of the parameter η = θ′′/(−2ρ′), we consider two types of mean function: paraboloid and Gaussian. Suppose the noise results from the convolution of white noise with a Gaussian kernel with spatial std. dev. ν, giving the covariance function ρ(r) = exp(−r/(2ν²)) specified in Section 3.1; this is the same noise as simulated in Section 4. When the mean function is a paraboloid, consider θ(s) = −‖s‖²/(2ξ²) + θ0 as in (27). Then θ′′ = −1/ξ² and ρ′ = −1/(2ν²), yielding η = θ′′/(−2ρ′) = −ν²/ξ². Thus η is a shape parameter representing the relative sharpness of the mean function with respect to the curvature of the noise. When the mean function is Gaussian, consider θ(s) = a exp(−‖s‖²/(2τ²)); this expression arises, for example, when the signal results from the convolution of a delta function with a Gaussian kernel with spatial std. dev. τ. Then θ′′ = −a/τ² and ρ′ = −1/(2ν²), yielding η = θ′′/(−2ρ′) = −aν²/τ². Thus η is the height of the signal a, scaled by the ratio of the spatial extents of the noise and signal filters. In both cases, the parameter η, and hence the power, are invariant under isotropic scaling of the domain, much like the peak height distribution under the null hypothesis (Cheng and Schwartzman, 2020).
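As a concrete check of the paraboloid case (our own arithmetic, using the kernel std. dev. ν = 5 from the 2D simulation of Section 4 and the default ξ = 7 from Table 1): η = −ν²/ξ² = −25/49 ≈ −0.51.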

Figure 11: 2D simulation: power and 𝔼[Mu] for different θ0 and η (u = 3.92 and Rad(D) = 10). (a) Power; (b) 𝔼[Mu].

Figure 11 illustrates how θ0 and η affect power and 𝔼[Mu] in the 2D simulation of Section 4. As explained above, θ0 and η together determine the effect size. Although deriving an explicit expression for the effect size as a function of θ0 and η is difficult, we can show roughly how the two parameters relate to power. θ0, which can be viewed as a signal-to-noise ratio (SNR), plays the major role: with η held fixed, power increases monotonically in θ0, while with θ0 held fixed, power decreases monotonically in η. In this simulation, fitting a linear model of power on θ0 and η shows the impact of θ0 to be about 10 times stronger than that of η. The figure also shows that the effect of η on power is stronger for large θ0 than for small θ0.

7.3 Application to data

To use our formula to calculate power in practice, one needs to assume the peak has a certain form, such as a paraboloid or Gaussian. When such assumptions are not plausible, the resulting power estimates may be inaccurate.

Regarding the conjecture that 𝔼[Mu]adj is a lower bound for the power when at least one local maximum exists in the domain D, a general proof remains difficult, but the real data example suggests it holds in practice. In a real-life problem, one can consider both 𝔼[Mu] and 𝔼[Mu]adj to better understand the required sample size: we suggest using 𝔼[Mu] to approximate power when the sample size is small and 𝔼[Mu]adj when the sample size is large. 𝔼[Mu]adj also gives a more conservative power estimate than 𝔼[Mu], which helps guarantee that a future study is sufficiently powered. Because of its difficulty, we leave further study of 𝔼[Mu]adj for future work.

8 Acknowledgments

Y.Z., D.C. and A.S. were partially supported by NIH grant R01EB026859 and NSF grant 1811659. Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.

9 Appendix

References

  • Adler and Taylor (2007) R. J. Adler, J. E. Taylor, Random Fields and Geometry, New York: Springer, 2007.
  • Cheng and Schwartzman (2015) D. Cheng, A. Schwartzman, Distribution of the height of local maxima of Gaussian random fields, Extremes 18 (2015) 213–240.
  • Cheng and Schwartzman (2017) D. Cheng, A. Schwartzman, Multiple testing of local maxima for detection of peaks in random fields, Annals of Statistics 45(2) (2017) 529–556.
  • Cheng and Schwartzman (2018) D. Cheng, A. Schwartzman, Expected number and height distribution of critical points of smooth isotropic Gaussian random fields, Bernoulli 24(4B) (2018) 3422–3446.
  • Cheng and Schwartzman (2020) D. Cheng, A. Schwartzman, On critical points of Gaussian random fields under diffeomorphic transformations, Statistics & Probability Letters 158 (2020) 108672.
  • Cohen (1988) J. Cohen, Statistical Power Analysis for the Behavioral Sciences, Routledge, 1988.
  • Durnez et al. (2016) J. Durnez, J. Degryse, B. Moerkerke, R. Seurinck, V. Sochat, R. A. Poldrack, T. E. Nichols, Power and sample size calculations for fMRI studies based on the prevalence of active peaks (2016).
  • Genovese et al. (2002) C. R. Genovese, N. A. Lazar, T. Nichols, Thresholding of statistical maps in functional neuroimaging using the false discovery rate, NeuroImage 15 (2002) 870–878.
  • Heller et al. (2006) R. Heller, D. Stanley, D. Yekutieli, N. Rubin, Y. Benjamini, Cluster-based analysis of fMRI data, NeuroImage 33 (2006) 599–608.
  • Piterbarg (1996) V. I. Piterbarg, Rice's method for large excursions of Gaussian random fields, Technical Report No. 478, Center for Stochastic Processes, Univ. North Carolina (1996).
  • Schwartzman et al. (2008) A. Schwartzman, W. Mascarenhas, J. Taylor, Inference for eigenvalues and eigenvectors of Gaussian symmetric matrices, Annals of Statistics 36(6) (2008) 2886–2919.
  • Schwartzman and Telschow (2019) A. Schwartzman, F. Telschow, Peak p-values and false discovery rate inference in neuroimaging, NeuroImage 197 (2019) 402–413.
  • Van Essen et al. (2012) D. C. Van Essen, K. Ugurbil, E. Auerbach, D. Barch, T. E. J. Behrens, R. Bucholz, A. Chang, L. Chen, M. Corbetta, S. W. Curtiss, S. Della Penna, D. Feinberg, M. F. Glasser, N. Harel, A. C. Heath, L. Larson-Prior, D. Marcus, G. Michalareas, S. Moeller, R. Oostenveld, S. E. Petersen, F. Prior, B. L. Schlaggar, S. M. Smith, A. Z. Snyder, J. Xu, E. Yacoub, WU-Minn HCP Consortium, The Human Connectome Project: a data acquisition perspective, NeuroImage 62 (2012) 2222–2231.