Fourier analysis of spatial point processes

Junho Yang Email: junhoyang@stat.sinica.edu.tw Institute of Statistical Science, Academia Sinica Yongtao Guan Email: guanyongtao@cuhk.edu.cn Shenzhen Research Institute of Big Data, School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen)

Abstract

In this article, we develop comprehensive frequency domain methods for estimating and inferring the second-order structure of spatial point processes. The main tool is the discrete Fourier transform (DFT) of the point pattern and its tapered counterpart. Under second-order stationarity, we show that both the DFTs and the tapered DFTs are asymptotically jointly independent Gaussian even when the DFTs share the same limiting frequencies. Based on these results, we establish an $\alpha$ -mixing central limit theorem for a statistic formulated as a quadratic form of the tapered DFT. As applications, we derive the asymptotic distribution of the kernel spectral density estimator and establish a frequency domain inferential method for parametric stationary point processes. For the latter, the resulting model parameter estimator is computationally tractable and yields meaningful interpretations even in the case of model misspecification. We investigate the finite sample performance of our estimator through simulations, considering scenarios of both correctly specified and misspecified models.

Keywords and phrases: Bartlett’s spectrum, data tapering, discrete Fourier transform, inhomogeneous point processes, stationary point processes, Whittle likelihood.

1 Introduction

Spatial point patterns, which are collections of events in space, are increasingly common across various disciplines, including seismology (Zhuang et al. (2004)), epidemiology (Gabriel and Diggle (2009)), ecology (Warton and Shepherd (2010)), and network analysis (D’Angelo et al. (2022)). A common assumption when analyzing such point patterns is that the underlying spatial point process is second-order stationary or second-order intensity reweighted stationary (Baddeley et al. (2000)). Under these assumptions, a majority of estimation procedures for the second-order structure of a spatial point process can be conducted through well-established tools in the spatial domain, such as the pair correlation function and $K$ -function (Illian et al. (2008); Waagepetersen and Guan (2009)). For a comprehensive review of spatial domain approaches, we refer the readers to Møller and Waagepetersen (2004), Chapter 4.

However, considerably less attention has been devoted to estimations in the frequency domain. In his pioneering work, Bartlett (1964) defined the spectral density function of a second-order stationary point process in two-dimensional space and proposed using the periodogram, a squared modulus of the discrete Fourier transform (DFT), as an estimator of the spectral density. For practical implementations, Mugglestone and Renshaw (1996) provided a guide to using periodograms with illustrative examples. Rajala et al. (2023) derived detailed calculations for the first- and second-order moments of the DFTs and periodograms for fixed frequencies. Despite these advances, theoretical properties of DFTs and periodograms remain largely unexplored. For example, fundamental properties for the DFTs of time series, such as asymptotic uncorrelatedness and asymptotic joint normality of the DFTs (cf. Brillinger (1981), Chapters 4.3 and 4.4), are yet to be rigorously investigated in the spatial point process setting.

One inherent challenge in conducting spectral analysis of spatial point processes is that spatial point patterns are irregularly scattered. As such, theoretical tools designed for time series data or spatial gridded data cannot be readily extended to the spatial point process setting. One potential solution is to discretize the (spatial or temporal) point pattern using regular bins. This approach allows the application of classical spectral methods from the “regular” time series or random fields to the aggregated count data. For example, Cheysson and Lang (2022) developed a frequency domain parameter estimation method for the one-dimensional stationary binned Hawkes process (refer to Section 5.1 below for details on the Hawkes process). See also Shlomovich et al. (2022) for the use of the binned Hawkes process to estimate the parameters in the spatial domain. However, aggregating events may introduce additional errors, and there is no theoretical result for binned count processes beyond the stationary Hawkes process case.

In this article, instead of focusing on the discretized count data, we aim to present a new frequency domain approach for spatial point processes utilizing the “complete” information in the process. In Section 2, we cover relevant terminologies (Section 2.1), review the concept of the DFT and periodogram incorporating data tapering (Section 2.2), and provide features contained in the spectral density functions and periodograms with illustrations (Section 2.3). Building on these concepts, we show in Section 3.1 that under an increasing domain framework and an $\alpha$ -mixing condition, the DFTs are asymptotically jointly Gaussian, even when the DFTs share the same limiting frequencies. Therefore, our asymptotic results extend those of Rajala et al. (2023), who considered only fixed frequencies, enabling us to quantify statistics written in terms of the integrated periodogram (we will elaborate on this below).

A crucial aspect of showing asymptotic joint normality of the DFTs is utilizing spectral analysis tools for irregularly spaced spatial data. Matsuda and Yajima (2009) proposed a novel framework to define the DFT for irregularly spaced spatial data, where the observation locations are generated from a probability distribution on the observation domain. Intriguingly, the DFTs for spatial point processes and those for irregularly spaced spatial data exhibit similar structures. Therefore, tools developed for irregularly spaced spatial data, such as those by Bandyopadhyay and Lahiri (2009) and Subba Rao (2018), are also useful in the spatial point process setting.

Despite the aforementioned similarities, it is important to note that the stochastic mechanisms generating spatial point patterns and irregularly spaced spatial data are very different. The former considers the number of events in a fixed area being random, while the latter determines a (deterministic) number of sampling locations at which the random field is observed. Moreover, from a technical point of view, unlike the random field case, the spectral density function $f(\boldsymbol{\omega})$ of the spatial point process, as given in (2.5) below, is not absolutely integrable. Therefore, the interchange of summations in the expansions of covariances and cumulants of the DFTs is not straightforward. To reconcile the differences between the spatial point process and irregularly spaced spatial data settings, we introduce in Sections 3.1 and 4 several new assumptions tailored to the spatial point process setting. In Section 5, we verify these assumptions for four widely used point process models, namely the Hawkes process, Neyman-Scott point process, log-Gaussian Cox process, and determinantal point process.

Expanding on the theoretical properties of the DFTs for spatial point processes, we also consider parameter estimation. Our main interest lies in parameters expressed in terms of the spectral mean of the form $\int_{D}\phi(\boldsymbol{\omega})f(\boldsymbol{\omega})d\boldsymbol{\omega}$ , where $D\subset\mathbb{R}^{d}$ is a prespecified compact region and $\phi(\cdot)$ is a continuous function on $D$ . To estimate the spectral mean, we employ the integrated periodogram as defined in (4.1) below. Parameters and estimators of this nature were first considered in Parzen (1957) and have since garnered great attention in the time series literature, given that both the kernel spectral density and autocovariance estimator take this general form. In Section 4, we derive the central limit theorem (CLT) for the integrated periodogram under an $\alpha$ -mixing condition. We note that since the integrated periodogram is written as a quadratic form of the DFTs, one cannot directly use the standard techniques to show the CLT for $\alpha$ -mixed point processes, as reviewed in Biscio and Waagepetersen (2019), Section 1. Instead, we use a series of approximation techniques to prove the CLT for the integrated periodogram. See Appendices A.4 and F in the Supplementary Material for details. As a direct application, in Theorem 4.2, we derive the asymptotic distribution of the kernel spectral density estimator.

Another major application of the integrated periodogram is model parameter estimation. Whittle (1953) introduced the periodogram-based approximation of the Gaussian likelihood for stationary time series. Subsequently, the concept of Whittle likelihood was extended to lattice (Guyon (1982); Dahlhaus and Künsch (1987)) and irregularly spaced spatial data (Matsuda and Yajima (2009); Subba Rao (2018)). In Section 6, we develop a procedure to fit parametric spatial point process models based on the Whittle-type likelihood (hereafter, just Whittle likelihood) and obtain sampling properties of the resulting estimator. A noteworthy aspect of our estimator is that it not only estimates the true parameter when the model is correctly specified but also estimates the best fitting parameter when the model is misspecified, where “best” is defined in terms of the spectral divergence criterion. While misspecified first-order intensity models have been considered (e.g., Choiruddin et al. (2021)), as far as we are aware, our result is the first attempt to study both first- and second-order model misspecification for (stationary) spatial point processes. In Section 7, we compare the performances of our estimator and two existing estimation methods in the spatial domain through simulations.

Lastly, proofs, auxiliary results, and additional simulations can be found in the Supplementary Material, Yang and Guan (2024) (hereafter, just Appendix).

2 Spectral density functions for the second-order stationary point processes

2.1 Preliminaries

In this section, we introduce the notation used throughout the article and review terminologies related to the mathematical presentation of spatial point processes.

Let $d\in\mathbb{N}$ and let $\mathbb{R}$ and $\mathbb{C}$ be the real and complex fields, respectively. For a set $A$ , $n(A)$ denotes the cardinality of $A$ and $A^{n,\neq}$ ( $n\in\mathbb{N}$ ) denotes a set containing all $n$ -tuples of pairwise disjoint points in $A$ . For a vector $\boldsymbol{v}=(v_{1},\dots,v_{d})^{\top}\in\mathbb{C}^{d}$ , $|\boldsymbol{v}|=\sum_{j=1}^{d}|v_{j}|$ , $\|\boldsymbol{v}\|=\{\sum_{j=1}^{d}|v_{j}|^{2}\}^{1/2}$ , and $\|\boldsymbol{v}\|_{\infty}=\max_{1\leq j\leq d}|v_{j}|$ denote the $\ell_{1}$ norm, Euclidean norm, and maximum norm, respectively. For vectors $\boldsymbol{u}=(u_{1},\dots,u_{d})^{\top}$ and $\boldsymbol{v}=(v_{1},\dots,v_{d})^{\top}$ in $\mathbb{R}^{d}$ , $\boldsymbol{u}\cdot\boldsymbol{v}=(u_{1}v_{1},\dots,u_{d}v_{d})^{\top}$ and $\boldsymbol{u}/\boldsymbol{v}=(u_{1}/v_{1},\dots,u_{d}/v_{d})^{\top}$ , provided $v_{1},\dots,v_{d}\neq 0$ . Now we define functional spaces. For $p\in[1,\infty)$ and $k\in\mathbb{N}$ , $L^{p}(\mathbb{R}^{k})$ denotes the set of all measurable functions $g:\mathbb{R}^{k}\rightarrow\mathbb{C}$ such that $\int_{\mathbb{R}^{k}}|g|^{p}<\infty$ . For $g$ in either $L^{1}(\mathbb{R}^{k})$ or $L^{2}(\mathbb{R}^{k})$ , the Fourier transform and the inverse Fourier transform are respectively defined as

\mathcal{F}(g)(\cdot)=\int_{\mathbb{R}^{k}}g(\boldsymbol{x})\exp(i\boldsymbol{x}^{\top}\cdot)d\boldsymbol{x}\quad\text{and}\quad\mathcal{F}^{-1}(g)(\cdot)=(2\pi)^{-k}\int_{\mathbb{R}^{k}}g(\boldsymbol{x})\exp(-i\boldsymbol{x}^{\top}\cdot)d\boldsymbol{x}.

Throughout this article, let $X$ be a simple spatial point process defined on $\mathbb{R}^{d}$ . Then, the $n$ th-order intensity function (also known as the product density function) of $X$ , denoted as $\lambda_{n}:\mathbb{R}^{nd}\rightarrow[0,\infty)$ , satisfies the following identity

\mathbb{E}\bigg{[}\sum_{(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})\in X^{n}\cap(\mathbb{R}^{d})^{n,\neq}}q(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})\bigg{]}=\int q(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})\lambda_{n}(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})\prod_{j=1}^{n}d\boldsymbol{x}_{j}

(2.1)

for any positive measurable function $q:\mathbb{R}^{nd}\rightarrow[0,\infty)$ . Next, we define the cumulant intensity functions. For $n\in\mathbb{N}$ , let $S_{n}$ be the set of all partitions of $\{1,\dots,n\}$ and for $B=\{i_{1},\dots,i_{m}\}\subseteq\{1,\dots,n\}$ ( $m\leq n$ ), let $\lambda_{n(B)}(\boldsymbol{x}_{B})=\lambda_{m}(\boldsymbol{x}_{i_{1}},\dots,\boldsymbol{x}_{i_{m}})$ . Then, the $n$ th-order cumulant intensity function (cf. Brillinger (1981), Chapter 2.3) of $X$ is defined as

\gamma_{n}(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})=\sum_{\pi\in S_{n}}(n(\pi)-1)!(-1)^{n(\pi)-1}\prod_{B\in\pi}\lambda_{n(B)}(\boldsymbol{x}_{B}),\quad\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n}\in\mathbb{R}^{d}.

(2.2)

2.2 Spectral density function and its estimator

From now onwards, we assume that $X$ is a $k$ th-order stationary ( $k\geq 2$ ) point process. An extension to the nonstationary point process case will be discussed in Section I. Under the $k$ th-order stationarity, we can define the $n$ th-order reduced intensity functions as follows:

\lambda_{n}(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})=\lambda_{n,\text{red}}(\boldsymbol{x}_{1}-\boldsymbol{x}_{n},\dots,\boldsymbol{x}_{n-1}-\boldsymbol{x}_{n}),\quad n\in\{1,\dots,k\}.

(2.3)

The $n$ th-order reduced cumulant intensity function, denoted as $\gamma_{n,\text{red}}$ , is defined similarly, but replacing $\lambda_{n}$ with $\gamma_{n}$ in (2.3). In particular, when $n=1$ , we use the common notation $\lambda_{1,\text{red}}=\gamma_{1,\text{red}}=\lambda$ and refer to it as the (constant) first-order intensity.

Next, the complete covariance function of $X$ at two locations $\boldsymbol{x}_{1},\boldsymbol{x}_{2}\in\mathbb{R}^{d}$ (which are not necessarily distinct) in the sense of Bartlett (1964) is defined as

C(\boldsymbol{x}_{1}-\boldsymbol{x}_{2})=\lambda\delta(\boldsymbol{x}_{1}-\boldsymbol{x}_{2})+\gamma_{2,\text{red}}(\boldsymbol{x}_{1}-\boldsymbol{x}_{2}),

(2.4)

where $\delta(\cdot)$ is the Dirac-delta function. Heuristically, $C(\boldsymbol{x}_{1}-\boldsymbol{x}_{2})d\boldsymbol{x}_{1}d\boldsymbol{x}_{2}$ is the covariance density of $N_{X}(d\boldsymbol{x}_{1})$ and $N_{X}(d\boldsymbol{x}_{2})$ , where $N_{X}(\cdot)$ is the counting measure induced by $X$ and $d\boldsymbol{x}$ is an infinitesimal region in $\mathbb{R}^{d}$ that contains $\boldsymbol{x}$ . Provided that $\gamma_{2,\text{red}}\in L^{1}(\mathbb{R}^{d})$ , we can define the non-negative valued spectral density function of $X$ by the inverse Fourier transform of $C(\cdot)$ as

f(\boldsymbol{\omega})=(2\pi)^{-d}\int_{\mathbb{R}^{d}}C(\boldsymbol{x})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x}=(2\pi)^{-d}\lambda+\mathcal{F}^{-1}(\gamma_{2,\text{red}})(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(2.5)

Here, we use (2.4) in the second identity. See Daley and Vere-Jones (2003), Section 8.2 for the mathematical construction of Bartlett’s spectral density function.

To estimate the spectral density function, we assume that the point process $X$ is observed within a compact domain (window) $D_{n}\subset\mathbb{R}^{d}$ of the form

D_{n}=[-A_{1}/2,A_{1}/2]\times\dots\times[-A_{d}/2,A_{d}/2],\quad n\in\mathbb{N},

(2.6)

where for $i\in\{1,\dots,d\}$ , $\{A_{i}=A_{i}(n)\}_{n=1}^{\infty}$ is an increasing sequence of positive numbers. Now, we define the DFT of the observed point pattern that incorporates data tapering—a commonly used approach to mitigate the bias inherent in the periodogram (Tukey (1967)). Let $h(\cdot)$ be a non-negative data taper on $\mathbb{R}^{d}$ with compact support $[-1/2,1/2]^{d}$ . For a domain $D_{n}$ of form (2.6), let

H_{h,k}^{(n)}(\boldsymbol{\omega})=\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{k}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x},\quad k\in\mathbb{N},\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

(2.7)

where $h(\boldsymbol{x}/\boldsymbol{A})=h(x_{1}/A_{1},\dots,x_{d}/A_{d})$ . Let $H_{h,k}=\int_{[-1/2,1/2]^{d}}h(\boldsymbol{x})^{k}d\boldsymbol{x}$ , $k\in\mathbb{N}$ . Throughout the article, we assume $H_{h,k}>0$ , $k\in\mathbb{N}$ . Using this notation, the DFT incorporating data taper $h$ is defined as

\mathcal{J}_{h,n}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}}h(\boldsymbol{x}/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

(2.8)

where $|D_{n}|$ denotes the volume of $D_{n}$ . We note that by setting $h(\boldsymbol{x})=1$ on $[-1/2,1/2]^{d}$ , the tapered DFT above encompasses the non-tapered DFT

\mathcal{J}_{n}(\boldsymbol{\omega})=(2\pi)^{-d/2}|D_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(2.9)

Unless otherwise specified, we will use the term “DFT” to indicate the tapered DFT defined as in (2.8). Unlike the classical setting in time series or random fields, the DFT is not centered. By applying (2.1), it can be easily seen that $\mathbb{E}[\mathcal{J}_{h,n}(\boldsymbol{\omega})]=\lambda c_{h,n}(\boldsymbol{\omega})$ , where

c_{h,n}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}|^{-1/2}H_{h,1}^{(n)}(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

(2.10)

is the bias factor. Therefore, the centered DFT is defined as

J_{h,n}(\boldsymbol{\omega})=\mathcal{J}_{h,n}(\boldsymbol{\omega})-\mathbb{E}[\mathcal{J}_{h,n}(\boldsymbol{\omega})]=\mathcal{J}_{h,n}(\boldsymbol{\omega})-\lambda c_{h,n}(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(2.11)

Since the first-order intensity $\lambda$ is unknown in practice, we replace it with an estimator, and the feasible criterion of $J_{h,n}(\cdot)$ becomes

\widehat{J}_{h,n}(\boldsymbol{\omega})=\mathcal{J}_{h,n}(\boldsymbol{\omega})-\widehat{\lambda}_{h,n}c_{h,n}(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

(2.12)

where $\widehat{\lambda}_{h,n}=H_{h,1}^{-1}|D_{n}|^{-1}\sum_{\boldsymbol{x}\in X\cap D_{n}}h(\boldsymbol{x}/\boldsymbol{A})$ ( $n\in\mathbb{N}$ ) is an unbiased estimator of $\lambda$ .

Finally, we define the periodogram and its feasible criterion respectively by

\displaystyle I_{h,n}(\boldsymbol{\omega})=|J_{h,n}(\boldsymbol{\omega})|^{2}\quad\hbox{and}\quad\widehat{I}_{h,n}(\boldsymbol{\omega})=|\widehat{J}_{h,n}(\boldsymbol{\omega})|^{2},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(2.13)
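The definitions (2.9) to (2.13) can be sketched numerically for the non-tapered case $h=1$ on $[-1/2,1/2]^{d}$ , where $H_{h,1}=H_{h,2}=1$ and $H_{h,1}^{(n)}(\boldsymbol{\omega})=\prod_{j}2\sin(A_{j}\omega_{j}/2)/\omega_{j}$ in closed form. The following is a minimal illustration under these assumptions, not the computational method of Appendix H.2; the function name `periodogram_untapered` and its arguments are hypothetical.

```python
import numpy as np

def periodogram_untapered(points, side_lengths, freqs):
    """Feasible periodogram (2.13) for the taper h = 1 (illustrative sketch).

    points       : (N, d) array of events in D_n = prod_j [-A_j/2, A_j/2]
    side_lengths : (d,) array (A_1, ..., A_d)
    freqs        : (M, d) array of frequencies omega
    """
    points = np.asarray(points, dtype=float)
    A = np.asarray(side_lengths, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    d = A.size
    vol = A.prod()                                   # |D_n|
    norm = (2 * np.pi) ** (-d / 2) * vol ** (-0.5)   # H_{h,2} = 1 when h = 1

    # Non-centered DFT (2.9): sum of exp(-i x^T omega) over the observed events.
    dft = norm * np.exp(-1j * (points @ freqs.T)).sum(axis=0)

    # H_{h,1}^{(n)}(omega) = prod_j A_j * sinc(A_j omega_j / (2 pi)),
    # using np.sinc(t) = sin(pi t) / (pi t).
    H1 = np.prod(A * np.sinc(A * freqs / (2 * np.pi)), axis=1)
    c = norm * H1                                    # bias factor (2.10)

    lam_hat = points.shape[0] / vol                  # intensity estimator, H_{h,1} = 1
    J_hat = dft - lam_hat * c                        # feasible centered DFT (2.12)
    return np.abs(J_hat) ** 2                        # feasible periodogram (2.13)
```

With this sketch, the returned value at $\boldsymbol{\omega}=\textbf{0}$ is zero for any point pattern, because the estimated intensity times the bias factor reproduces the non-centered DFT exactly at the origin.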

2.3 Features of the spectral density functions and their estimators

To motivate the spectral approaches for spatial point processes, the top panel of Figure 1 displays four spatial point patterns on the observation domain $[-20,20]^{2}$ . These patterns are generated from four different stationary isotropic point process models, exhibiting clustering behaviors in realizations A and B but repulsive behaviors in realizations C and D. All four models share the same first-order intensity, set at 0.5. In the middle panel of Figure 1, we plot the pair correlation function (PCF; middle left) and spectral density function (middle right) for each process.

Figure 1: Top: Realizations of the four different stationary isotropic spatial point processes on the observation domain $[-20,20]^{2}$ . Middle left: Plot of the pair correlation function $g(\boldsymbol{x})-1$ against $\|\boldsymbol{x}\|\in[0,\infty)$ for each model. Middle right: Plot of the spectral density function $f(\boldsymbol{\omega})$ in log-scale against $\|\boldsymbol{\omega}\|\in[0,\infty)$ for each model. Bottom: Plot of the periodogram $\widehat{I}_{h,n}(\boldsymbol{\omega})$ .

We now investigate how the features of the spatial point patterns are reflected in the spectral density functions. Since the PCF $g(\boldsymbol{x})=\gamma_{2,\text{red}}(\boldsymbol{x})/\lambda^{2}+1$ , by using (2.5), we have

f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda=\lambda^{2}\mathcal{F}^{-1}(g-1)(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{2}.

(2.14)

Given the uniqueness of the Fourier transform, the information contained in the spectral density function can be fully recovered from the first-order intensity and the PCF, and vice versa. However, while the PCF $g(\boldsymbol{x})$ captures only “local” information of the point process at a certain lag $\boldsymbol{x}$ , the spectral density function encapsulates “global” information of the point process, including the first-order intensity (at high frequencies) and overall clustering/repulsive behavior (at low frequencies).

High frequency information. Assuming $g-1\in L^{1}(\mathbb{R}^{2})$ (which is equivalent to Assumption 3.2 for $\ell=2$ below), (2.14) implies $\lim_{\|\boldsymbol{\omega}\|\rightarrow\infty}f(\boldsymbol{\omega})=(2\pi)^{-2}\lambda$ . Therefore, at high frequencies, the spectral density function contains information about the first-order intensity.

Low frequency information. Note, $g(\boldsymbol{x})-1>0$ (resp. $<0$ ) implies the clustering (resp. repulsive) behavior at the fixed lag $\boldsymbol{x}\in\mathbb{R}^{2}$ . In the frequency domain, by using (2.14), we have $f(\boldsymbol{\omega})-(2\pi)^{-2}\lambda\approx f(\textbf{0})-(2\pi)^{-2}\lambda=(2\pi)^{-2}\lambda^{2}\int_{\mathbb{R}^{2}}(g(\boldsymbol{x})-1)d\boldsymbol{x}$ for $\|\boldsymbol{\omega}\|\approx 0$ . Therefore, the spectral density function evaluated at low frequencies above (resp. below) the asymptote indicates the “overall” clustering (resp. repulsive) behavior of the point process.
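The high and low frequency statements can be checked on a closed-form example. The sketch below assumes, purely for illustration, a Gaussian-shaped PCF $g(\boldsymbol{x})-1=a\exp(-\|\boldsymbol{x}\|^{2}/(4\sigma^{2}))$ in $d=2$ , for which (2.14) gives $f(\boldsymbol{\omega})=(2\pi)^{-2}\{\lambda+4\pi\sigma^{2}a\lambda^{2}\exp(-\sigma^{2}\|\boldsymbol{\omega}\|^{2})\}$ . The shape, the parameters `a` and `sigma`, and the function name are assumptions and are not the models used in Figure 1.

```python
import numpy as np

def spectral_density_gaussian_pcf(omega_norm, lam, a, sigma):
    """Spectral density in d = 2 under the assumed PCF shape
    g(x) - 1 = a * exp(-||x||^2 / (4 sigma^2)).

    omega_norm : array of frequency norms ||omega||
    lam        : first-order intensity lambda
    a, sigma   : hypothetical strength and range of the PCF bump
    """
    omega_norm = np.asarray(omega_norm, dtype=float)
    asymptote = lam / (2 * np.pi) ** 2
    # The Fourier integral of exp(-||x||^2 / (4 sigma^2)) over R^2 equals
    # 4 pi sigma^2 exp(-sigma^2 ||omega||^2); F^{-1} adds the factor (2 pi)^{-2}.
    bump = (lam ** 2 * a * 4 * np.pi * sigma ** 2
            * np.exp(-(sigma * omega_norm) ** 2) / (2 * np.pi) ** 2)
    return asymptote + bump

# a > 0 mimics clustering, a < 0 mimics repulsion (values chosen so f stays positive).
freq = np.array([0.0, 0.5, 2.0, 20.0])
f_clustered = spectral_density_gaussian_pcf(freq, lam=0.5, a=0.8, sigma=1.0)
f_repulsive = spectral_density_gaussian_pcf(freq, lam=0.5, a=-0.1, sigma=1.0)
```

In this example both curves approach $(2\pi)^{-2}\lambda$ at high frequencies, while the value at the origin lies above the asymptote when $a>0$ and below it when $a<0$ .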

Rate of convergence. Comparing the clustered or repulsive realizations, realization B is more clustered than A while realization D is more repulsive than C. This is reflected in the PCFs of the associated models: the PCF of model B is larger at small lag distances but drops more rapidly as the lag distance increases when compared to that of model A, while the PCF of model D is smaller than that of model C. In the frequency domain, one can also extract information on the quality of the clustering and repulsive behaviors. Since the decaying rate of the Fourier transform is related to the smoothness of the original function (cf. Folland (1999), Theorem 8.22), the faster (resp. slower) convergence of the spectral density to the asymptote implies a smoother (resp. rougher) PCF.

Properties of the periodogram. Lastly, we discuss properties of the periodogram as a raw estimator of the spectral density function. The bottom panel of Figure 1 plots the periodograms $\widehat{I}_{h,n}(\boldsymbol{\omega})$ for realizations A–D. Using a computational method described in Appendix H.2, it takes less than 0.06 seconds to evaluate the periodograms on a grid of frequencies $\{(2\pi k_{1}/40,2\pi k_{2}/40):k_{1},k_{2}\in\{-40,\dots,40\}\}$ for each model. We observe that for all realizations, the periodograms follow the trend of the corresponding spectral density functions. However, the periodograms are very noisy, indicating that uncorrelated noise fluctuations are added to the trend. We rigorously investigate theoretical properties of the periodograms in Theorem 3.1 below. To obtain a consistent estimator of the spectral density function, one can locally smooth the periodogram. Detailed computations of the smoothed periodogram with illustrations can be found in Appendix H.1 and the theoretical results for the smoothed periodogram are presented in Sections 3.2 and 4.

3 Asymptotic properties of the DFT and periodogram

3.1 Asymptotic results

In this section, we investigate asymptotic properties of the DFT and periodogram. To do so, we require the following sets of assumptions.

The first assumption is on the increasing-domain asymptotic framework.

Assumption 3.1.

Let $D_{n}$ ( $n\in\mathbb{N}$ ) be a sequence of increasing windows of form (2.6) with $\lim_{n\rightarrow\infty}|D_{n}|=\infty$ . Moreover, $D_{n}$ grows with the same speed in all coordinates of $\mathbb{R}^{d}$ :

A_{i}/A_{j}=O(1),\quad n\rightarrow\infty,\quad i,j\in\{1,\dots,d\}.

(3.1)

The next assumption is on the higher-order cumulants of $X$ .

Assumption 3.2.

Let $\ell\in\{2,3,\dots\}$ be fixed. For $n\in\{1,\dots,\ell\}$ , the cumulant density function $\gamma_{n}$ in (2.2) is well-defined and

\sup_{\boldsymbol{x}_{n}\in\mathbb{R}^{d}}\int_{\mathbb{R}^{d(n-1)}}\left|\gamma_{n}(\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n})\right|d\boldsymbol{x}_{1}\cdots d\boldsymbol{x}_{n-1}<\infty,\quad n\in\{2,\dots,\ell\}.

(3.2)

For an $\ell$ th-order stationary process, Assumption 3.2 can be equivalently expressed as $\gamma_{n,\text{red}}\in L^{1}(\mathbb{R}^{d(n-1)})$ , $n\in\{2,\dots,\ell\}$ .

The next assumption concerns the $\alpha$ -mixing coefficient of $X$ that was first introduced by Rosenblatt (1956). For compact and convex subsets $E_{i},E_{j}\subset\mathbb{R}^{d}$ , let $d(E_{i},E_{j})=\inf\{\|\boldsymbol{x}_{i}-\boldsymbol{x}_{j}\|_{\infty}:\boldsymbol{x}_{i}\in E_{i},\boldsymbol{x}_{j}\in E_{j}\}$ . Then, for $p,q,k\in(0,\infty)$ , the $\alpha$ -mixing coefficient of $X$ is defined as

\alpha_{p,q}(k)=\sup_{A_{i},A_{j},E_{i},E_{j}}\Bigl\{\left\|P(A_{i}\cap A_{j})-P(A_{i})P(A_{j})\right\|:A_{i}\in\mathcal{F}(E_{i}),A_{j}\in\mathcal{F}(E_{j}),\|E_{i}\|\leq p,\|E_{j}\|\leq q,d(E_{i},E_{j})\geq k\Bigr\},

(3.3)

where $\mathcal{F}(E)$ denotes the $\sigma$ -field generated by $X$ in $E\subset\mathbb{R}^{d}$ .

Assumption 3.3.

Let $\alpha_{p,q}(k)$ be the $\alpha$ -mixing coefficient of $X$ defined in (3.3). We assume one of the following two conditions.

(i) There exists $\varepsilon>0$ such that $\sup_{p\in(0,\infty)}\alpha_{p,p}(k)/\max(p,1)=O(k^{-d-\varepsilon})$ as $k\rightarrow\infty$ .

(ii) There exists $\varepsilon>2d$ such that $\sup_{p\in(0,\infty)}\alpha_{p,p}(k)/\max(p,1)=O(k^{-d-\varepsilon})$ as $k\rightarrow\infty$ .

The last set of assumptions is on the data taper.

Assumption 3.4.

The data taper $h(\boldsymbol{x})$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ , is non-negative and has a compact support on $[-1/2,1/2]^{d}$ . Moreover, $h$ satisfies one of the following two conditions below.

(i) $h$ is continuous on $[-1/2,1/2]^{d}$ .

(ii) Let $m\in\mathbb{N}$ be fixed. For $\boldsymbol{\alpha}\in\{0,1,\dots\}^{d}$ with $1\leq|\boldsymbol{\alpha}|\leq m$ , $\partial^{\boldsymbol{\alpha}}h$ exists and is continuous on $\mathbb{R}^{d}$ .

Assumption 3.4(i) encompasses the non-taper case, i.e., $h(\boldsymbol{x})=1$ on $[-1/2,1/2]^{d}$ . Assumption 3.4(ii) for $m=d+1$ is used to show the $|D_{n}|^{1/2}$ -asymptotic normality of the integrated periodogram. More precisely, it is needed to show that the Fourier transform of $h$ is absolutely integrable on $\mathbb{R}^{d}$ . Alternatively, one can use the following slightly different condition:

\partial^{\boldsymbol{\alpha}}h\quad\text{exists for any}\quad\boldsymbol{\alpha}=(\alpha_{1},\dots,\alpha_{d})\in\{0,1,2\}^{d}.

(3.4)

Our theoretical results remain unchanged when substituting Assumption 3.4(ii) for $m=d+1$ with (3.4). An example of a data taper that satisfies (3.4) is provided in (7.1). According to our simulation results in Section 7, the choice of data taper does not seem to affect the performance of the periodogram-based estimator.

Using the aforementioned sets of assumptions, we now establish the asymptotic joint distribution of the feasible criteria of the DFTs and periodograms. Recall (2.10) and (2.12). It is easily seen that $\widehat{J}_{h,n}(\textbf{0})=0$ . Consequently, we exclude the frequency at the origin. Next, we introduce the concept of asymptotically distant frequencies introduced by Bandyopadhyay and Lahiri (2009). For two sequences of frequencies $\{\boldsymbol{\omega}_{1,n}\}_{n=1}^{\infty}$ and $\{\boldsymbol{\omega}_{2,n}\}_{n=1}^{\infty}$ on $\mathbb{R}^{d}$ , we say $\{\boldsymbol{\omega}_{1,n}\}$ and $\{\boldsymbol{\omega}_{2,n}\}$ are asymptotically distant if

\lim_{n\rightarrow\infty}|D_{n}|^{1/d}\|\boldsymbol{\omega}_{1,n}-\boldsymbol{\omega}_{2,n}\|=\infty.
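To make the definition concrete, the sketch below evaluates the scaled separation $|D_{n}|^{1/d}\|\boldsymbol{\omega}_{1,n}-\boldsymbol{\omega}_{2,n}\|$ on square windows of growing side length $A$ . Two distinct fixed frequencies give a separation that grows linearly in $A$ , so they are asymptotically distant, whereas frequencies of the form $2\pi\boldsymbol{k}/A$ with fixed integer vectors $\boldsymbol{k}$ give a constant separation and are not. The helper name `scaled_separation` is hypothetical.

```python
import numpy as np

def scaled_separation(omega1, omega2, side_lengths):
    """Return |D_n|^{1/d} * ||omega1 - omega2|| for a rectangular window D_n."""
    A = np.asarray(side_lengths, dtype=float)
    diff = np.asarray(omega1, dtype=float) - np.asarray(omega2, dtype=float)
    return A.prod() ** (1.0 / A.size) * np.linalg.norm(diff)

sides = (10.0, 100.0, 1000.0)

# Fixed, distinct frequencies: the separation diverges with the window size.
fixed = [scaled_separation([1.0, 0.0], [1.5, 0.0], [A, A]) for A in sides]

# Frequencies 2*pi*k/A with fixed k: the separation stays at 2*pi*||k1 - k2||.
k1, k2 = np.array([1, 0]), np.array([3, 0])
shrinking = [scaled_separation(2 * np.pi * k1 / A, 2 * np.pi * k2 / A, [A, A])
             for A in sides]
```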

Now, we compute the limit of $\mathrm{cov}(\widehat{J}_{h,n}(\boldsymbol{\omega}_{1,n}),\widehat{J}_{h,n}(\boldsymbol{\omega}_{2,n}))$ and $\mathrm{cov}(\widehat{I}_{h,n}(\boldsymbol{\omega}_{1,n}),\widehat{I}_{h,n}(\boldsymbol{\omega}_{2,n}))$ for two asymptotically distant frequencies.

Theorem 3.1 (Asymptotic uncorrelatedness of the DFT and periodogram).

Let $X$ be a second-order stationary point process on $\mathbb{R}^{d}$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=2$ ), and 3.4(i) hold. Let $\{\boldsymbol{\omega}_{1,n}\}$ and $\{\boldsymbol{\omega}_{2,n}\}$ be sequences on $\mathbb{R}^{d}$ such that $\{\boldsymbol{\omega}_{1,n}\}$ , $\{\boldsymbol{\omega}_{2,n}\}$ , and $\{\textbf{0}\}$ are pairwise asymptotically distant. Moreover, let $\{\boldsymbol{\omega}_{n}\}$ be a sequence that is asymptotically distant from $\{\textbf{0}\}$ and converges to the fixed frequency $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Then,

\lim_{n\rightarrow\infty}\mathrm{cov}(\widehat{J}_{h,n}(\boldsymbol{\omega}_{1,n}),\widehat{J}_{h,n}(\boldsymbol{\omega}_{2,n}))=0

(3.5)

and

\lim_{n\rightarrow\infty}\mathrm{var}(\widehat{J}_{h,n}(\boldsymbol{\omega}_{n}))=f(\boldsymbol{\omega}).

(3.6)

If we further assume Assumption 3.2 for $\ell=4$ holds and $\{\boldsymbol{\omega}_{1,n}\}$ and $\{-\boldsymbol{\omega}_{2,n}\}$ are asymptotically distant, then

\lim_{n\rightarrow\infty}\mathrm{cov}(\widehat{I}_{h,n}(\boldsymbol{\omega}_{1,n}),\widehat{I}_{h,n}(\boldsymbol{\omega}_{2,n}))=0\quad\text{and}\quad\lim_{n\rightarrow\infty}\mathrm{var}(\widehat{I}_{h,n}(\boldsymbol{\omega}_{n}))=f(\boldsymbol{\omega})^{2}.

(3.7)

Proof.

See Appendix B.1. ∎

By using the aforementioned moment properties in conjunction with the $\alpha$ -mixing condition, we derive the asymptotic joint distribution of the DFTs and periodograms.

Theorem 3.2 (Asymptotic joint distribution of the DFTs and periodograms).

Let $X$ be a second-order stationary point process on $\mathbb{R}^{d}$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), 3.3(i), and 3.4(i) hold. For a fixed $r\in\mathbb{N}$ , $\{\boldsymbol{\omega}_{1,n}\}$ , …, $\{\boldsymbol{\omega}_{r,n}\}$ denote $r$ sequences on $\mathbb{R}^{d}$ that satisfy the following three conditions: for $(i,j)\in\{1,\dots,r\}^{2,\neq}$ ,

(1) $\lim_{n\rightarrow\infty}\boldsymbol{\omega}_{i,n}=\boldsymbol{\omega}_{i}\in\mathbb{R}^{d}$ .

(2) $\{\boldsymbol{\omega}_{i,n}\}$ is asymptotically distant from $\{\textbf{0}\}$ .

(3) $\{\boldsymbol{\omega}_{i,n}+\boldsymbol{\omega}_{j,n}\}$ and $\{\boldsymbol{\omega}_{i,n}-\boldsymbol{\omega}_{j,n}\}$ are asymptotically distant from $\{\textbf{0}\}$ .

Then, we have

\left(\frac{\widehat{J}_{h,n}(\boldsymbol{\omega}_{1,n})}{(\frac{1}{2}f(\boldsymbol{\omega}_{1}))^{1/2}},\dots,\frac{\widehat{J}_{h,n}(\boldsymbol{\omega}_{r,n})}{(\frac{1}{2}f(\boldsymbol{\omega}_{r}))^{1/2}}\right)\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}(Z_{1},\dots,Z_{r}),\quad n\rightarrow\infty,

where $\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}$ denotes weak convergence and $\{Z_{k}\}_{k=1}^{r}$ are independent standard normal random variables on $\mathbb{C}$ . Therefore, by using the continuous mapping theorem, we have

\left(\frac{\widehat{I}_{h,n}(\boldsymbol{\omega}_{1,n})}{\frac{1}{2}f(\boldsymbol{\omega}_{1})},\dots,\frac{\widehat{I}_{h,n}(\boldsymbol{\omega}_{r,n})}{\frac{1}{2}f(\boldsymbol{\omega}_{r})}\right)\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}(\chi^{2}_{1},\dots,\chi^{2}_{r}),\quad n\rightarrow\infty,

where $\{\chi^{2}_{k}\}_{k=1}^{r}$ are independent chi-squared random variables with degrees of freedom two.

Proof.

See Appendix B.2. ∎
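As a quick numerical sanity check of the scaling in Theorem 3.2, the sketch below samples directly from the limiting law rather than from a point pattern. It assumes that "standard normal on $\mathbb{C}$" means independent $N(0,1)$ real and imaginary parts, which is the reading under which $|Z_{k}|^{2}$ is chi-squared with two degrees of freedom. The value `f = 1.7` is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
f = 1.7            # hypothetical value of the spectral density f(omega)
n_draws = 200_000

# Assumed reading of "standard normal on C": real and imaginary parts iid N(0, 1).
z = rng.standard_normal(n_draws) + 1j * rng.standard_normal(n_draws)

# Limit of the periodogram: I / (f / 2) -> |Z|^2 ~ chi-squared(2), i.e. I ~ (f / 2) |Z|^2.
periodogram_limit = 0.5 * f * np.abs(z) ** 2

sample_mean = periodogram_limit.mean()   # should be close to f
sample_var = periodogram_limit.var()     # should be close to f ** 2, as in (3.7)
print(sample_mean, sample_var)
```

Under this reading the limiting periodogram has mean $f(\boldsymbol{\omega})$ and variance $f(\boldsymbol{\omega})^{2}$, which agrees with the variance limit in (3.7).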

Remark 3.1.

The limit frequencies $\boldsymbol{\omega}_{1},\dots,\boldsymbol{\omega}_{r}\in\mathbb{R}^{d}$ need not be distinct nor nonzero, as long as the sequences $\{\boldsymbol{\omega}_{1,n}\},\dots,\{\boldsymbol{\omega}_{r,n}\}$ satisfy the conditions (1)–(3) in Theorem 3.2.

3.2 Nonparametric kernel spectral density estimator

We observe from Theorem 3.1 that $\lim_{n\rightarrow\infty}\mathrm{var}(\widehat{I}_{h,n}(\boldsymbol{\omega}))=f(\boldsymbol{\omega})^{2}>0$ , $\boldsymbol{\omega}\in\mathbb{R}^{d}\backslash\{\textbf{0}\}$ . Therefore, the periodogram is an inconsistent estimator of the spectral density function. In this section, we obtain a consistent estimator of the spectral density function via periodogram smoothing.

Let $W:\mathbb{R}^{d}\rightarrow\mathbb{R}$ be a positive continuous and symmetric kernel function with compact support on $[-1/2,1/2]^{d}$ , satisfying $\int_{\mathbb{R}^{d}}W(\boldsymbol{x})d\boldsymbol{x}=1$ and $\int_{\mathbb{R}^{d}}W(\boldsymbol{x})^{2}d\boldsymbol{x}<\infty$ . For a bandwidth $\boldsymbol{b}=(b_{1},\dots,b_{d})^{\top}\in(0,\infty)^{d}$ , let $W_{\boldsymbol{b}}(\boldsymbol{x})=(b_{1}\cdots b_{d})^{-1}W(\boldsymbol{x}/\boldsymbol{b})$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ . For ease of presentation, we set $b_{1}=\cdots=b_{d}=b\in(0,\infty)$ . Thus, we write $W_{b}(\boldsymbol{x})=W_{\boldsymbol{b}}(\boldsymbol{x})=b^{-d}W(b^{-1}\boldsymbol{x})$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ . Now, we define the kernel spectral density estimator $\widehat{f}_{n,b}(\boldsymbol{\omega})$ by

\widehat{f}_{n,b}(\boldsymbol{\omega})=\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})\widehat{I}_{h,n}(\boldsymbol{x})d\boldsymbol{x},\quad n\in\mathbb{N},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(3.8)

Below, we show that $\widehat{f}_{n,b}$ consistently estimates the spectral density function.

Theorem 3.3.

Let $X$ be a second-order stationary point process on $\mathbb{R}^{d}$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), and 3.4(i) hold. Moreover, the bandwidth $b=b(n)$ is such that $\lim_{n\rightarrow\infty}b(n)+|D_{n}|^{-1}b(n)^{-d}=0$ . Then, for $\boldsymbol{\omega}\in\mathbb{R}^{d}$ ,

\widehat{f}_{n,b}(\boldsymbol{\omega})\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}f(\boldsymbol{\omega}),\quad n\rightarrow\infty,

where $\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}$ denotes convergence in probability.

Proof.

See Appendix B.3. ∎
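To illustrate the smoothing in (3.8), the following sketch evaluates a Riemann-sum version of the estimator in $d=1$. It is a minimal illustration under stated assumptions: the periodogram is replaced by a synthetic stand-in (a hypothetical spectral density times independent unit-mean exponential noise, loosely motivated by Theorem 3.2), and the triangular kernel and all numerical values are illustrative choices, not those of the paper.

```python
import numpy as np

def triangular_kernel(x):
    """Symmetric kernel supported on [-1/2, 1/2] that integrates to one."""
    return np.where(np.abs(x) <= 0.5, 2.0 * (1.0 - 2.0 * np.abs(x)), 0.0)

def kernel_spectral_estimate(omega, grid, periodogram, b):
    """Riemann-sum approximation of (3.8) in d = 1 on an equally spaced grid."""
    step = grid[1] - grid[0]
    weights = triangular_kernel((omega - grid) / b) / b   # W_b(omega - x)
    return np.sum(weights * periodogram) * step

def toy_density(x):
    """Hypothetical spectral density used only to generate synthetic data."""
    return (1.0 + 2.0 * np.exp(-x ** 2)) / (2.0 * np.pi)

rng = np.random.default_rng(1)
grid = np.linspace(-10.0, 10.0, 20001)
# Synthetic periodogram ordinates: mean f and standard deviation f at every grid point.
periodogram = toy_density(grid) * rng.exponential(size=grid.size)

omega0, b = 1.0, 0.5
f_hat = kernel_spectral_estimate(omega0, grid, periodogram, b)
f_true = toy_density(omega0)
print(f_hat, f_true)
```

The raw ordinates fluctuate on the scale of $f$ itself, whereas the smoothed value averages many ordinates inside the bandwidth window and is therefore much less variable.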

4 Estimation of the spectral mean statistics

Let $D$ be a prespecified compact region on $\mathbb{R}^{d}$ that does not depend on the index $n\in\mathbb{N}$ . For a real continuous function $\phi(\cdot)$ on $D$ , our goal is to estimate a parameter written in terms of the spectral mean on $D$ :

A(\phi)=\int_{D}\phi(\boldsymbol{\omega})f(\boldsymbol{\omega})d\boldsymbol{\omega}.

(4.1)

A natural estimator of $A(\phi)$ is the integrated periodogram

\widehat{A}_{h,n}(\phi)=\int_{D}\phi(\boldsymbol{\omega})\widehat{I}_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega},\quad n\in\mathbb{N}.

(4.2)

We briefly mention two examples of estimation problems for spatial point processes that fall under the above framework. First, let $\phi(\cdot)=\phi_{b}(\cdot)=W_{b}(\boldsymbol{\omega}-\cdot)$ , where $W_{b}$ is a kernel function with bandwidth $b\in(0,\infty)$ as in Section 3.2. Since $f(\boldsymbol{\omega})$ is locally constant in a small neighborhood of $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , as $b=b(n)\rightarrow 0$ , we have $A(\phi_{b})\approx f(\boldsymbol{\omega})$ . Thus, $\widehat{f}_{n,b}(\boldsymbol{\omega})=\widehat{A}_{h,n}(\phi_{b})$ , as in (3.8), is our non-parametric estimator of the spectral density. Second, let $\phi(\cdot)=\phi(\cdot;\boldsymbol{\theta})=f_{\boldsymbol{\theta}}^{-1}(\cdot)$ , where $\{f_{\boldsymbol{\theta}}\}$ is a family of spectral density functions with parameter $\boldsymbol{\theta}\in\Theta$ . Then, $A(\phi(\cdot;\boldsymbol{\theta}))+\int_{D}\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega})d\boldsymbol{\omega}$ denotes the spectral divergence between $f$ and $f_{\boldsymbol{\theta}}$ . An estimator of the spectral divergence is given by $\widehat{A}_{h,n}(\phi(\cdot;\boldsymbol{\theta}))+\int_{D}\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega})d\boldsymbol{\omega}$ , which we refer to as the Whittle likelihood. Please see Section 6 for further details on the Whittle likelihood and the resulting model parameter estimation.

To derive the asymptotic properties of the integrated periodogram, we note that the variance expression of $\widehat{A}_{h,n}(\phi)$ involves the fourth-order cumulant term of $X$ . To obtain a simple limiting variance expression of $\widehat{A}_{h,n}(\phi)$ , we assume that $X$ is fourth-order stationary, so that both $\lambda_{n,\text{red}}$ and $\gamma_{n,\text{red}}$ in (2.3) are well-defined for $n\in\{1,2,3,4\}$ . Following an argument similar to Bartlett (1964), we introduce the complete fourth-order reduced cumulant density function, denoted as $\kappa_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})$ . Heuristically, this function is defined as a cumulant density function of $N_{X}(d\boldsymbol{t}_{1})$ , $N_{X}(d\boldsymbol{t}_{2})$ , $N_{X}(d\boldsymbol{t}_{3})$ , and $N_{X}(d\boldsymbol{t}_{4})$ , where $\boldsymbol{t}_{1},\dots,\boldsymbol{t}_{4}\in\mathbb{R}^{d}$ may not necessarily be distinct. Explicitly, $\kappa_{4,\text{red}}(\cdot,\cdot,\cdot)$ can be written as a sum of reduced cumulant intensity functions of orders up to four. See (D.14) in the Appendix for a precise expression. Therefore, under Assumption 3.2 for $\ell=4$ , it can be easily seen that $\kappa_{4,\text{red}}\in L^{1}(\mathbb{R}^{3d})$ . In turn, the fourth-order spectral density of $X$ can be defined as the inverse Fourier transform of $\kappa_{4,\text{red}}$

f_{4}(\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2},\boldsymbol{\omega}_{3})=(2\pi)^{-3d}\int_{\mathbb{R}^{3d}}\exp\left(-i\sum_{i=1}^{3}\boldsymbol{\omega}_{i}^{\top}\boldsymbol{x}_{i}\right)\kappa_{4,\text{red}}(\boldsymbol{x}_{1},\boldsymbol{x}_{2},\boldsymbol{x}_{3})d\boldsymbol{x}_{1}d\boldsymbol{x}_{2}d\boldsymbol{x}_{3}

(4.3)

for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2},\boldsymbol{\omega}_{3}\in\mathbb{R}^{d}$ . We now introduce the following assumptions on $f$ and $f_{4}$ . For $\boldsymbol{\alpha}=(\alpha_{1},\dots,\alpha_{d})\in\{0,1,\dots\}^{d}$ , let $\partial^{\boldsymbol{\alpha}}=\left(\partial/\partial\omega_{1}\right)^{\alpha_{1}}\cdots\left(\partial/\partial\omega_{d}\right)^{\alpha_{d}}$ be the $\boldsymbol{\alpha}$ th-order partial derivative.

Assumption 4.1.

Suppose the spectral density function $f(\boldsymbol{\omega})$ of $X$ is well-defined for all $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Moreover, $f$ satisfies the following: (i) $f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda\in L^{1}(\mathbb{R}^{d})$ and (ii) for $\boldsymbol{\alpha}\in\{0,1,2\}^{d}$ with $|\boldsymbol{\alpha}|=2$ , $\partial^{\boldsymbol{\alpha}}f(\boldsymbol{\omega})$ exists for $\boldsymbol{\omega}\in\mathbb{R}^{d}$ and $\sup_{\boldsymbol{\omega}}\left|\partial^{\boldsymbol{\alpha}}f(\boldsymbol{\omega})\right|<\infty$ .

We make two remarks on the above assumptions. Firstly, we observe from (2.5) that $f$ is not absolutely integrable. Instead, given Assumption 4.1(i), the spectral density function, when appropriately “shifted”, admits the Fourier transformation

\gamma_{2,\text{red}}(\boldsymbol{x})=\mathcal{F}(f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda)=\int_{\mathbb{R}^{d}}\left\{f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda\right\}\exp(i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{\omega}.

(4.4)

Secondly, for some parametric models (e.g., the log-Gaussian Cox processes in Section 5.3 below), a closed-form expression for $f$ is not available, while $\gamma_{2,\text{red}}$ has an analytic form. In this case, a sufficient condition for Assumption 4.1(i) to hold is that $\gamma_{2,\text{red}}$ has continuous partial derivatives up to order $(d+1)$ , as per Folland (1999), Theorem 8.22 (see also page 257 of the same reference). Moreover, since $(\partial^{2}f/\partial\omega_{i}\partial\omega_{j})(\boldsymbol{\omega})=-(2\pi)^{-d}\int_{\mathbb{R}^{d}}x_{i}x_{j}\gamma_{2,\text{red}}(\boldsymbol{x})\exp(-i\boldsymbol{\omega}^{\top}\boldsymbol{x})d\boldsymbol{x}$ , Assumption 4.1(ii) holds if $|\boldsymbol{x}|^{2}\gamma_{2,\text{red}}(\boldsymbol{x})\in L^{1}(\mathbb{R}^{d})$ .

Assumption 4.2.

Suppose that $f_{4}(\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2},\boldsymbol{\omega}_{3})$ is well-defined for all $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2},\boldsymbol{\omega}_{3}\in\mathbb{R}^{d}$ . Moreover, $f_{4}$ satisfies the following: (i) $f_{4}-(2\pi)^{-3d}\lambda\in L^{1}(\mathbb{R}^{3d})$ and (ii) $f_{4}(\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2},\boldsymbol{\omega}_{3})$ is twice partially differentiable with respect to $\boldsymbol{\omega}_{2}$ and the second-order partial derivative is bounded above.

By using the same arguments as above, sufficient conditions for Assumption 4.2 to hold in terms of the differentiability and integrability of $\gamma_{n,\text{red}}$ ( $n\in\{2,3,4\}$ ) can also be easily derived.

Now, we are ready to state our main theorem addressing the asymptotic normality of $\widehat{A}_{h,n}(\phi)$ .

Theorem 4.1 (Asymptotic distribution of the integrated periodogram).

Let $X$ be a fourth-order stationary point process on $\mathbb{R}^{d}$ , that is, (2.3) is satisfied for $k=4$ . Then, the following three assertions hold.

(i)

Suppose that Assumptions 3.1, 3.2 (for $\ell=2$ ), 3.4(ii) (for $m=1$ ), and 4.1(ii) hold. Then,

$\mathbb{E}[\widehat{A}_{h,n}(\phi)]=A(\phi)+O(|D_{n}|^{-2/d}),\quad n\rightarrow\infty.$

(ii)

Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), 4.1(i), and 4.2 hold. Furthermore, the data taper $h$ is constant on $[-1/2,1/2]^{d}$ or satisfies Assumption 3.4(ii) for $m=d+1$ . Then,

\lim_{n\rightarrow\infty}|D_{n}|\mathrm{var}(\widehat{A}_{h,n}(\phi))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2}),

where

	$\displaystyle\Omega_{1}$	$\displaystyle=\int_{D}\phi(\boldsymbol{\omega})\left(\phi(\boldsymbol{\omega})+\phi(-\boldsymbol{\omega})\right)f(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}$		(4.5)
	$\displaystyle\text{and}\quad\Omega_{2}$	$\displaystyle=\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})f_{4}(\boldsymbol{\omega}_{1},-\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$		(4.5)

(iii)

Now, let $d\in\{1,2,3\}$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=8$ ), 3.3(ii), 3.4(ii) (for $m=d+1$ ), 4.1, and 4.2 hold. Then,

|D_{n}|^{1/2}(\widehat{A}_{h,n}(\phi)-A(\phi))\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(0,(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2})\right),\quad n\rightarrow\infty.

Proof.

See Appendix A. ∎

Remark 4.1 (Estimation of the asymptotic variance).

Since $\Omega_{1}$ and $\Omega_{2}$ above are unknown functions of the spectral density and fourth-order spectral density function, the asymptotic variance of $\widehat{A}_{h,n}(\phi)$ needs to be estimated. In Appendix G, we provide details on the estimation of $(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2})$ using the subsampling method.

As a direct application of the above theorem, we derive the asymptotic distribution of the kernel spectral density estimator. Recall (3.8).

Theorem 4.2.

Let $X$ be a fourth-order stationary point process on $\mathbb{R}^{d}$ , that is, (2.3) is satisfied for $k=4$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=8$ ), 3.3(ii), 3.4(ii) (for $m=d+1$ ), 4.1, and 4.2 hold. Moreover, the bandwidth $b=b(n)$ is such that

\lim_{n\rightarrow\infty}b(n)+|D_{n}|^{-1}b(n)^{-d}=0\text{~{}~{}and~{}~{}}\lim_{n\rightarrow\infty}|D_{n}|^{1/2}b(n)^{d/2}(|D_{n}|^{-2/d}+b(n)^{2})=0.

Let $W_{2}=\int_{\mathbb{R}^{d}}W(\boldsymbol{x})^{2}d\boldsymbol{x}$ . Then, for $\boldsymbol{\omega}\in\mathbb{R}^{d}\backslash\{\textbf{0}\}$ ,

\sqrt{|D_{n}|b^{d}}\big{(}\widehat{f}_{n,b}(\boldsymbol{\omega})-f(\boldsymbol{\omega})\big{)}\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(0,(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})W_{2}f(\boldsymbol{\omega})^{2}\right),\quad n\rightarrow\infty

and for $\boldsymbol{\omega}=\textbf{0}$ ,

\sqrt{|D_{n}|b^{d}}\big{(}\widehat{f}_{n,b}(\boldsymbol{\omega})-f(\boldsymbol{\omega})\big{)}\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(0,2(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})W_{2}f(\boldsymbol{\omega})^{2}\right),\quad n\rightarrow\infty.

Proof.

See Appendix B.4. ∎
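The factor of two at the origin can be traced back to $\Omega_{1}$ in (4.5). The following heuristic calculation is only a sketch of that bookkeeping, not the argument of Appendix B.4. It takes $\phi=\phi_{b}=W_{b}(\boldsymbol{\omega}-\cdot)$ in Theorem 4.1 and assumes that $D$ contains a neighborhood of $\boldsymbol{\omega}$ , that $f$ is continuous at $\boldsymbol{\omega}$ , and that $f_{4}$ is bounded on $D^{2}$ :

```latex
\begin{align*}
\Omega_{1}(\phi_{b})
 &= \int_{D} W_{b}(\boldsymbol{\omega}-\boldsymbol{x})
    \left\{W_{b}(\boldsymbol{\omega}-\boldsymbol{x})+W_{b}(\boldsymbol{\omega}+\boldsymbol{x})\right\}
    f(\boldsymbol{x})^{2}\,d\boldsymbol{x}, \\
b^{d}\int_{D} W_{b}(\boldsymbol{\omega}-\boldsymbol{x})^{2}f(\boldsymbol{x})^{2}\,d\boldsymbol{x}
 &= \int_{\mathbb{R}^{d}} W(\boldsymbol{u})^{2}f(\boldsymbol{\omega}-b\boldsymbol{u})^{2}\,d\boldsymbol{u}
 \rightarrow W_{2}f(\boldsymbol{\omega})^{2}, \\
b^{d}\,|\Omega_{2}(\phi_{b})|
 &\leq b^{d}\sup_{D^{2}}|f_{4}|\left(\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{x})\,d\boldsymbol{x}\right)^{2}
 = b^{d}\sup_{D^{2}}|f_{4}| \rightarrow 0.
\end{align*}
```

For fixed $\boldsymbol{\omega}\neq\textbf{0}$ , the cross term $W_{b}(\boldsymbol{\omega}-\boldsymbol{x})W_{b}(\boldsymbol{\omega}+\boldsymbol{x})$ vanishes once $b<2\|\boldsymbol{\omega}\|_{\infty}$ , because the two factors are supported on disjoint cubes centered at $\boldsymbol{\omega}$ and $-\boldsymbol{\omega}$ , so that $b^{d}\Omega_{1}(\phi_{b})\rightarrow W_{2}f(\boldsymbol{\omega})^{2}$ . At $\boldsymbol{\omega}=\textbf{0}$ , the symmetry of $W$ makes the cross term equal to the squared term, which gives $b^{d}\Omega_{1}(\phi_{b})\rightarrow 2W_{2}f(\textbf{0})^{2}$ .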

5 Examples of spatial point process models

In this section, we provide examples of four widely used stationary spatial point process models and specify the conditions under which each model satisfies Assumptions 3.2, 3.3, and 4.1, which are required to establish the asymptotic results in Sections 3 and 4.

5.1 Example I: Hawkes processes

The Hawkes process is a doubly stochastic process on $\mathbb{R}$ characterized by self-exciting and clustering properties (Hawkes (1971a, b)). A stationary Hawkes process is described by a conditional intensity function of the form $\lambda(t)=\nu+\int_{-\infty}^{t}\eta(t-u)N_{X}(du)$ , $t\in\mathbb{R}$ . Here, $\nu>0$ is the immigration intensity and $\eta:[0,\infty)\rightarrow[0,\infty)$ is a measurable function satisfying $\int_{\mathbb{R}}\eta(u)du<1$ , referred to as the reproduction function. The spectral density function of the Hawkes process, as stated in Daley and Vere-Jones (2003), Example 8.2(e), is given by $f(\omega)=(2\pi)^{-1}\lambda\left|1-\mathcal{F}(\eta)(\omega)\right|^{-2}$ , $\omega\in\mathbb{R}$ . Here, $\lambda=\mathbb{E}[\lambda(t)]=\nu/\{1-\int_{\mathbb{R}}\eta(u)du\}\in(0,\infty)$ denotes the first-order intensity of the corresponding Hawkes process.

To verify Assumption 3.2 for Hawkes processes, Jovanović et al. (2015) provide explicit expressions of the cumulant intensity functions, thus one can check (3.2) for the specified cumulant intensity functions. To assess the $\alpha$ -mixing conditions, Cheysson and Lang (2022), Theorem 1, states that if $\int_{0}^{\infty}u^{1+\delta}\eta(u)du<\infty$ , for some $\delta>0$ , then $\sup_{p,q\in(0,\infty)}\alpha_{p,q}(k)=O(k^{-\delta})$ , as $k\rightarrow\infty$ . Therefore, if $\eta(\cdot)$ has a $(2+\delta)$ th-moment (resp. $(4+\delta)$ th-moment) for some $\delta>0$ , then the corresponding Hawkes process satisfies Assumption 3.3(i) (resp. 3.3(ii)). Lastly, to check for Assumption 4.1, one can employ the expression of $f(\omega)$ , provided the Fourier transform of $\eta(\cdot)$ has a closed form expression. For example, if the reproduction function has the form $\eta(\cdot)=\alpha\exp(-\beta\cdot)$ for some $0<\alpha<\beta$ , then $f(\omega)-(2\pi)^{-1}\lambda=(2\pi)^{-1}\lambda\alpha(2\beta-\alpha)/\{(\beta-\alpha)^{2}+\omega^{2}\}$ , $\omega\in\mathbb{R}$ . Therefore, the defined $f$ satisfies Assumption 4.1.
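The exponential example can be checked numerically. The sketch below evaluates $f(\omega)=(2\pi)^{-1}\lambda|1-\mathcal{F}(\eta)(\omega)|^{-2}$ for $\eta(u)=\alpha\exp(-\beta u)$ and confirms that the excess over the Poisson level $(2\pi)^{-1}\lambda$ is $(2\pi)^{-1}\lambda$ times the Lorentzian term $\alpha(2\beta-\alpha)/\{(\beta-\alpha)^{2}+\omega^{2}\}$ . The parameter values and the sign convention of the Fourier transform are assumptions made for illustration; the modulus, and hence $f$ , does not depend on that sign.

```python
import numpy as np

def hawkes_spectral_density(omega, nu, alpha, beta):
    """f(omega) = lambda / (2 pi) * |1 - F(eta)(omega)|^(-2), eta(u) = alpha * exp(-beta * u)."""
    lam = nu / (1.0 - alpha / beta)          # first-order intensity
    eta_hat = alpha / (beta - 1j * omega)    # Fourier transform of eta (sign convention assumed)
    return lam / (2.0 * np.pi) / np.abs(1.0 - eta_hat) ** 2

nu, alpha, beta = 0.5, 1.0, 2.0              # hypothetical values with 0 < alpha < beta
omega = np.linspace(-20.0, 20.0, 401)
lam = nu / (1.0 - alpha / beta)

excess = hawkes_spectral_density(omega, nu, alpha, beta) - lam / (2.0 * np.pi)
lorentzian = alpha * (2.0 * beta - alpha) / ((beta - alpha) ** 2 + omega ** 2)

# The excess over the Poisson level is lambda / (2 pi) times the Lorentzian term.
matches = np.allclose(excess, lam / (2.0 * np.pi) * lorentzian)
print(matches)
```

Since the Lorentzian is integrable and has bounded second derivative, this is the form from which Assumption 4.1 can be read off directly.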

5.2 Example II: Neyman-Scott point processes

A Neyman-Scott (N-S) process (Neyman and Scott (1958)) is a special class of the Cox process, where the random latent intensity field is given by $\Lambda(\boldsymbol{u})=\sum_{\boldsymbol{x}\in\Phi}\alpha k(\boldsymbol{u}-\boldsymbol{x})$ , $\boldsymbol{u}\in\mathbb{R}^{d}$ . Here, $\Phi$ is a homogeneous Poisson process with intensity $\kappa>0$ and $k(\cdot)$ is a probability density (kernel) function on $\mathbb{R}^{d}$ . Common choices for $k$ include $k_{1}(\cdot)=(2\pi\sigma^{2})^{-d/2}\exp(-\|\cdot\|^{2}/(2\sigma^{2}))$ and $k_{2}(\cdot)=(r^{d}s_{d})^{-1}I(\|\cdot\|\leq r)$ , where $s_{d}$ is the volume of the unit ball on $\mathbb{R}^{d}$ . The N-S processes corresponding to the kernel functions $k_{1}$ and $k_{2}$ are known as the Thomas cluster process and the Matérn cluster process, respectively. Using, for example, Chandler (1997), Equation (37) (see also Daley and Vere-Jones (2003), Exercise 8.2.9(c)), the spectral density function for the N-S process is given by $f(\boldsymbol{\omega})=(2\pi)^{-d}\lambda\left(1+\alpha|\mathcal{F}(k)(\boldsymbol{\omega})|^{2}\right)$ , $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , where $\lambda=\kappa\alpha$ denotes the first-order intensity of the corresponding N-S process. In particular, the spectral density function of the Thomas cluster process is given by

f^{(TCP)}(\boldsymbol{\omega})=(2\pi)^{-d}\lambda\left\{1+\alpha\exp(-\sigma^{2}\|\boldsymbol{\omega}\|^{2})\right\},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(5.1)

Prokešová and Jensen (2013) (page 398) showed that the N-S process satisfies Assumption 3.2 for all $\ell\in\mathbb{N}$ . Moreover, according to Lemma 1 of the same reference, if $k(\boldsymbol{x})=O(\|\boldsymbol{x}\|^{-2d-\varepsilon})$ as $\|\boldsymbol{x}\|\rightarrow\infty$ for some $\varepsilon>0$ (resp. $\varepsilon>2d$ ), then the corresponding N-S process satisfies Assumption 3.3(i) (resp. 3.3(ii)). Assumption 4.1 can be verified for specific kernel functions with known forms of their Fourier transform. In particular, spectral density functions associated with Thomas cluster processes and Matérn cluster processes satisfy Assumption 4.1.
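For concreteness, the Thomas cluster spectral density in (5.1) can be coded as follows. The sketch also checks numerically, in $d=1$ , that the Fourier transform of the Gaussian kernel $k_{1}$ is $\exp(-\sigma^{2}\omega^{2}/2)$ , which is the step that turns the general N-S formula into (5.1). Function names and numerical values are illustrative assumptions.

```python
import numpy as np

def thomas_spectral_density(omega, kappa, alpha, sigma2):
    """Spectral density (5.1) of a Thomas cluster process; omega has shape (m, d)."""
    omega = np.atleast_2d(np.asarray(omega, dtype=float))
    d = omega.shape[-1]
    lam = kappa * alpha                                   # first-order intensity
    sq_norm = np.sum(omega ** 2, axis=-1)
    return lam * (1.0 + alpha * np.exp(-sigma2 * sq_norm)) / (2.0 * np.pi) ** d

# Check in d = 1 that F(k_1)(w) = exp(-sigma^2 w^2 / 2) for the Gaussian kernel k_1.
sigma2, w = 0.25, 1.3
x = np.linspace(-10.0, 10.0, 20001)
k1 = np.exp(-x ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
ft_numeric = np.sum(k1 * np.cos(w * x)) * (x[1] - x[0])   # imaginary part vanishes by symmetry
ft_closed = np.exp(-sigma2 * w ** 2 / 2.0)
print(ft_numeric, ft_closed)

# Evaluate (5.1) at two frequencies in d = 2.
vals = thomas_spectral_density([[0.0, 0.0], [1.0, 2.0]], kappa=0.2, alpha=10.0, sigma2=0.25)
print(vals)
```

The density decays from $(2\pi)^{-d}\lambda(1+\alpha)$ at the origin to the Poisson level $(2\pi)^{-d}\lambda$ at high frequencies, reflecting clustering.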

5.3 Example III: Log-Gaussian Cox Processes

The log-Gaussian Cox process (LGCP; Møller et al. (1998)) is a special class of the Cox process where the logarithm of the intensity field is a Gaussian random field. Let $X$ be a stationary LGCP driven by the intensity field $\Lambda(\cdot)$ and let $R(\boldsymbol{x})=\mathrm{cov}(\log\Lambda(\boldsymbol{x}),\log\Lambda(\textbf{0}))$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ , be the autocovariance function of the log intensity field. Then, by using Møller et al. (1998), Equation (4), the second-order reduced cumulant is given by $\gamma_{2,\text{red}}(\boldsymbol{x})=\exp\{R(\boldsymbol{x})\}-1$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ .

Assumption 3.2 holds for arbitrary $\ell\in\mathbb{N}$ , provided $R(\cdot)\in L^{1}(\mathbb{R}^{d})$ . See Zhu et al. (2023), Lemma D.1. Concerning Assumption 3.3, Doukhan (1994), page 59, stated that for a stationary random field on $\mathbb{Z}^{d}$ , if $R(\boldsymbol{x})=O(\|\boldsymbol{x}\|^{-2d-\varepsilon})$ as $\|\boldsymbol{x}\|\rightarrow\infty$ for some $\varepsilon>0$ (resp. $\varepsilon>2d$ ), then the corresponding LGCP on $\mathbb{Z}^{d}$ satisfies Assumption 3.3(i) (resp. 3.3(ii)). However, a general $\alpha$ -mixing condition for stationary Gaussian random fields on $\mathbb{R}^{d}$ is not readily available. Lastly, since the second-order reduced cumulant function has a closed-form expression in terms of $R(\cdot)$ , Assumption 4.1 can be easily verified when $R(\cdot)$ is specified (see the remarks after Assumption 4.1).

5.4 Example IV: Determinantal point processes

The determinantal point process (DPP), first introduced by Macchi (1975), has the intensity function characterized by the determinant of some function. To be specific, a stationary determinantal point process induced by a kernel function $K:\mathbb{R}^{d}\rightarrow\mathbb{R}$ , denoted as DPP( $K$ ), has the reduced intensity function $\lambda_{n,\text{red}}(\boldsymbol{x}_{1}-\boldsymbol{x}_{n},\dots,\boldsymbol{x}_{n-1}-\boldsymbol{x}_{n})=\det(K(\boldsymbol{x}_{i}-\boldsymbol{x}_{j})_{1\leq i,j\leq n})$ , $n\in\mathbb{N}$ , $\boldsymbol{x}_{1},\dots,\boldsymbol{x}_{n}\in\mathbb{R}^{d}$ . Here, we assume that the kernel function $K$ is symmetric, continuous, and belongs to $L^{2}(\mathbb{R}^{d})$ satisfying $\mathcal{F}(K)(\boldsymbol{\omega})\in[0,1]$ , $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Then, the second-order reduced cumulant function and the spectral density function of DPP( $K$ ) are respectively given by $\gamma_{2,\text{red}}(\boldsymbol{x})=-|K(\boldsymbol{x})|^{2}$ and $f(\boldsymbol{\omega})=(2\pi)^{-d}K(0)-\mathcal{F}^{-1}(|K|^{2})(\boldsymbol{\omega})$ . Therefore, the DPPs exhibit a repulsive behavior. For example, choosing the Gaussian kernel $K^{(G)}(\boldsymbol{x})=\lambda\exp(-\|\boldsymbol{x}\|^{2}/\rho^{2})$ with parameter restriction $0<\rho\leq 1/(\sqrt{\pi}\lambda^{1/d})$ , the spectral density function corresponding to $DPP(K^{(G)})$ is given by

f^{(GDPP)}(\boldsymbol{\omega})=(2\pi)^{-d}\left\{\lambda-\lambda^{2}(\pi\rho^{2}/2)^{d/2}\exp(-\rho^{2}\|\boldsymbol{\omega}\|^{2}/8)\right\},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(5.2)

Biscio and Lavancier (2016) showed that DPP( $K$ ) satisfies Assumption 3.2 for any $\ell\in\mathbb{N}$ . Moreover, Poinas et al. (2019) showed $\alpha_{p,p}(k)/p\leq C\int_{k}^{\infty}t^{d-1}\sup_{\|\boldsymbol{x}\|=t}|\gamma_{2,\text{red}}(\boldsymbol{x})|dt$ , $p\in(0,\infty)$ . Therefore, if there exists $\varepsilon>0$ (resp. $\varepsilon>2d$ ) such that $\sup_{\|\boldsymbol{x}\|=t}|K(\boldsymbol{x})|=O(t^{-d-(\varepsilon/2)})$ as $t\rightarrow\infty$ , then DPP( $K$ ) satisfies Assumption 3.3(i) (resp. 3.3(ii)). Therefore, DPP with the Gaussian kernel $K^{(G)}$ satisfies Assumption 3.3(ii). Lastly, Assumption 4.1 can be verified provided $|K|^{2}$ has a known Fourier transform, including the case of $f^{(GDPP)}(\cdot)$ in (5.2).
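The Gaussian DPP spectral density (5.2) and its parameter restriction can be sketched in code as follows. The function name is hypothetical, and the check on $\rho$ simply encodes the restriction $0<\rho\leq 1/(\sqrt{\pi}\lambda^{1/d})$ stated above.

```python
import numpy as np

def gdpp_spectral_density(omega, lam, rho2):
    """Spectral density (5.2) of the Gaussian-kernel DPP; omega has shape (m, d)."""
    omega = np.atleast_2d(np.asarray(omega, dtype=float))
    d = omega.shape[-1]
    rho2_max = 1.0 / (np.pi * lam ** (2.0 / d))           # square of the upper bound on rho
    if not 0.0 < rho2 <= rho2_max:
        raise ValueError("rho violates 0 < rho <= 1 / (sqrt(pi) * lam^(1/d))")
    sq_norm = np.sum(omega ** 2, axis=-1)
    repulsion = lam ** 2 * (np.pi * rho2 / 2.0) ** (d / 2.0) * np.exp(-rho2 * sq_norm / 8.0)
    return (lam - repulsion) / (2.0 * np.pi) ** d

# Frequencies with Euclidean norms 0, 1, 5, and 20 in d = 2.
omega = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0], [12.0, 16.0]]
vals = gdpp_spectral_density(omega, lam=1.0, rho2=0.55 ** 2)
poisson_level = 1.0 / (2.0 * np.pi) ** 2
print(vals, poisson_level)
```

In contrast to the clustered case, the density stays below the Poisson level $(2\pi)^{-d}\lambda$ and increases towards it as $\|\boldsymbol{\omega}\|$ grows, which is the frequency domain signature of repulsion. Under the restriction on $\rho$ , the value at the origin is at least $(2\pi)^{-d}\lambda(1-2^{-d/2})>0$ .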

6 Frequency domain parameter estimation under possible model misspecification

As an application of the CLT results in Section 4, we turn our attention to inferences for spatial point processes in the frequency domain through the Whittle likelihood. Let $\{X_{\boldsymbol{\theta}}\}$ be a family of second-order stationary spatial point processes with parameter $\boldsymbol{\theta}\in\Theta\subset\mathbb{R}^{p}$ . The associated spectral density function of $X_{\boldsymbol{\theta}}$ is denoted as $f_{\boldsymbol{\theta}}$ , $\boldsymbol{\theta}\in\Theta$ . Then, we fit the model with spectral density $f_{\boldsymbol{\theta}}$ using the pseudo-likelihood given by

L_{n}(\boldsymbol{\theta})=\int_{D}\left(\frac{\widehat{I}_{h,n}(\boldsymbol{\omega})}{f_{\boldsymbol{\theta}}(\boldsymbol{\omega})}+\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega})\right)d\boldsymbol{\omega},\quad n\in\mathbb{N},\quad\boldsymbol{\theta}\in\Theta\subset\mathbb{R}^{p}.

(6.1)

Here, $D$ is a prespecified compact and symmetric region on $\mathbb{R}^{d}$ . Let

\widehat{\boldsymbol{\theta}}_{n}=\arg\min_{\boldsymbol{\theta}\in\Theta}L_{n}(\boldsymbol{\theta}),\qquad n\in\mathbb{N},

(6.2)

be our proposed model parameter estimator. Here, we do not necessarily assume the existence of $\boldsymbol{\theta}_{0}\in\Theta$ such that the true spectral density function satisfies $f=f_{\boldsymbol{\theta}_{0}}$ . Since the periodogram is an asymptotically unbiased estimator of the “true” spectral density, we take the best fitting parameter to be

\boldsymbol{\theta}_{0}=\arg\min_{\boldsymbol{\theta}\in\Theta}\mathcal{L}(\boldsymbol{\theta}),\quad\text{where}\quad\mathcal{L}(\boldsymbol{\theta})=\int_{D}\left(\frac{f(\boldsymbol{\omega})}{f_{\boldsymbol{\theta}}(\boldsymbol{\omega})}+\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega})\right)d\boldsymbol{\omega}.

(6.3)

The best fitting parameter $\boldsymbol{\theta}_{0}$ has a clear interpretation in terms of the spectral divergence criterion, as $\mathcal{L}(\boldsymbol{\theta})$ above computes the spectral (information) divergence between the true and conjectured spectral densities. Investigations into the properties of the Whittle estimator under model misspecification for time series and random fields can be found in Dahlhaus and Wefelmeyer (1996); Dahlhaus and Sahm (2000); Subba Rao and Yang (2021). We mention that, in the case where the mapping $\boldsymbol{\theta}\rightarrow f_{\boldsymbol{\theta}}$ is injective and the model is correctly specified, it can be easily seen that $\boldsymbol{\theta}_{0}$ is uniquely determined and satisfies $f=f_{\boldsymbol{\theta}_{0}}$ .

Now, we assume the following for the parameter space.

Assumption 6.1.

The parameter space $\Theta$ is a compact subset of $\mathbb{R}^{p}$ , $p\in\mathbb{N}$ . The parametric family of spectral density functions $\{f_{\boldsymbol{\theta}}\}$ is uniformly bounded above and bounded below from zero. $f_{\boldsymbol{\theta}}(\boldsymbol{\omega})$ is twice differentiable with respect to $\boldsymbol{\theta}$ and its first and second derivatives are continuous on $\Theta\times D$ . $\boldsymbol{\theta}_{0}$ in (6.3) is uniquely determined and lies in the interior of $\Theta$ . Lastly, $\widehat{\boldsymbol{\theta}}_{n}$ in (6.2) exists for all $n\in\mathbb{N}$ and lies in the interior of $\Theta$ .

To obtain the asymptotic variance of $\widehat{\boldsymbol{\theta}}_{n}$ , for $\boldsymbol{\theta}\in\Theta$ , let

$\displaystyle\Gamma(\boldsymbol{\theta})$	$\displaystyle=\frac{1}{2(2\pi)^{d}}\int_{D}\bigg{[}\left(f(\boldsymbol{\omega})-f_{\boldsymbol{\theta}}(\boldsymbol{\omega})\right)\nabla^{2}f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega})+(\nabla\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega}))(\nabla\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega}))^{\top}\bigg{]}d\boldsymbol{\omega},$	(6.4)
$\displaystyle S_{1}(\boldsymbol{\theta})$	$\displaystyle=\frac{1}{2(2\pi)^{d}}\int_{D}f(\boldsymbol{\omega})^{2}(\nabla f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega}))(\nabla f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega}))^{\top}d\boldsymbol{\omega},\quad\text{and}$
$\displaystyle S_{2}(\boldsymbol{\theta})$	$\displaystyle=\frac{1}{4(2\pi)^{d}}\int_{D^{2}}f_{4}(\boldsymbol{\omega}_{1},-\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})(\nabla f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega}_{1}))(\nabla f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega}_{2}))^{\top}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$

Here, $\nabla f_{\boldsymbol{\theta}}$ and $\nabla^{2}f_{\boldsymbol{\theta}}$ are the first- and second-order derivatives of $f_{\boldsymbol{\theta}}$ with respect to $\boldsymbol{\theta}$ , respectively, and $f_{4}$ denotes the (true) fourth-order spectral density of $X$ . In the scenario where the model is correctly specified, we have $\Gamma(\boldsymbol{\theta}_{0})=S_{1}(\boldsymbol{\theta}_{0})$ .
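The identity $\Gamma(\boldsymbol{\theta}_{0})=S_{1}(\boldsymbol{\theta}_{0})$ follows from a one-line calculation, spelled out here for convenience. With $f=f_{\boldsymbol{\theta}_{0}}$ , the first term in the integrand of (6.4) vanishes, and since $\nabla f_{\boldsymbol{\theta}}^{-1}=-f_{\boldsymbol{\theta}}^{-2}\nabla f_{\boldsymbol{\theta}}$ ,

```latex
f_{\boldsymbol{\theta}_{0}}^{2}
 (\nabla f_{\boldsymbol{\theta}_{0}}^{-1})(\nabla f_{\boldsymbol{\theta}_{0}}^{-1})^{\top}
 = f_{\boldsymbol{\theta}_{0}}^{-2}
 (\nabla f_{\boldsymbol{\theta}_{0}})(\nabla f_{\boldsymbol{\theta}_{0}})^{\top}
 = (\nabla\log f_{\boldsymbol{\theta}_{0}})(\nabla\log f_{\boldsymbol{\theta}_{0}})^{\top},
```

so the integrands of $S_{1}(\boldsymbol{\theta}_{0})$ and $\Gamma(\boldsymbol{\theta}_{0})$ coincide.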

The following theorem addresses the asymptotic behavior of our proposed estimator under possible model misspecification.

Theorem 6.1.

Let $X$ be a fourth-order stationary point process on $\mathbb{R}^{d}$ , that is, (2.3) is satisfied for $k=4$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), 3.4(ii) (for $m=1$ ), 4.1, and 6.1 hold. Then,

\widehat{\boldsymbol{\theta}}_{n}\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}\boldsymbol{\theta}_{0},\quad n\rightarrow\infty.

(6.5)

Now, let $d\in\{1,2,3\}$ . Suppose $\Gamma(\boldsymbol{\theta}_{0})$ in (6.4) is invertible. Then, under Assumptions 3.1, 3.2 (for $\ell=8$ ), 3.3(ii), 3.4(ii) (for $m=d+1$ ), 4.1, 4.2, and 6.1, we have

|D_{n}|^{1/2}(\widehat{\boldsymbol{\theta}}_{n}-\boldsymbol{\theta}_{0})\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(\textbf{0},(H_{h,4}/H_{h,2}^{2})\Gamma(\boldsymbol{\theta}_{0})^{-1}\left(S_{1}(\boldsymbol{\theta}_{0})+S_{2}(\boldsymbol{\theta}_{0})\right)\Gamma(\boldsymbol{\theta}_{0})^{-1}\right),~{}~{}~{}n\rightarrow\infty.

(6.6)

Proof.

See Appendix B.5. ∎

Remark 6.1.

The condition on $d$ being less than four in the asymptotic normality of $\widehat{\boldsymbol{\theta}}_{n}$ is required to ensure that the bias of $|D_{n}|^{1/2}\widehat{\boldsymbol{\theta}}_{n}$ converges to zero. This restriction is also imposed in the random fields literature (e.g., Dahlhaus and Künsch (1987); Matsuda and Yajima (2009)). By using the debiasing technique considered in Guillaumin et al. (2022), one can establish the asymptotic normality of the “debiased” Whittle estimator for all $d\in\mathbb{N}$ . The details will be reported in a future study.

Remark 6.2.

We provide a summary of the procedure for estimating the asymptotic variance of $\widehat{\boldsymbol{\theta}}_{n}$ . Recall (6.6). $\Gamma(\boldsymbol{\theta}_{0})$ can be easily estimated by replacing $f(\boldsymbol{\omega})$ and $\boldsymbol{\theta}_{0}$ with $\widehat{f}_{n,b}(\boldsymbol{\omega})$ and $\widehat{\boldsymbol{\theta}}_{n}$ , respectively, in (6.4). To estimate $S_{1}(\boldsymbol{\theta}_{0})+S_{2}(\boldsymbol{\theta}_{0})$ , one can employ the subsampling variance estimation method for $\widehat{A}_{h,n}(\nabla f_{\widehat{\boldsymbol{\theta}}_{n}}^{-1})$ , as described in Appendix G (see also Remark 4.1). The theoretical properties of this estimated variance will not be investigated in this article.

7 Simulation studies

To corroborate our theoretical results, we conduct some simulations on the model parameter estimation. Additional simulation results can be found in Appendices H.2–H.4. Due to space constraints, we only consider the following two point process models on $\mathbb{R}^{2}$ :

•

Stationary Thomas cluster processes (TCPs) with parameter $\boldsymbol{\theta}=(\kappa,\alpha,\sigma^{2})^{\top}$ as in Section 5.2. The spectral density function of TCP, denoted as $f^{(TCP)}_{\boldsymbol{\theta}}(\cdot)$ , is given in (5.1) with the first-order intensity $\lambda=\kappa\alpha$ . TCP exhibits clustering behavior.
•

Stationary determinantal point processes with Gaussian kernel (GDPPs) with parameter $\boldsymbol{\theta}=(\lambda,\rho^{2})^{\top}$ as in Section 5.4. The spectral density function of GDPP, denoted as $f^{(GDPP)}_{\boldsymbol{\theta}}(\cdot)$ , is given in (5.2). GDPP exhibits repulsive behavior.

For each model, we generate spatial point patterns within the observation domain (window) $D_{n}=[-A/2,A/2]^{2}$ for varying side lengths $A\in\{10,20,40\}$ . To assess the performance of the different parameter estimation methods, we compare our estimator as in (6.2) with two existing methods in the spatial domain: the maximum likelihood-based method (ML) and the minimum contrast method (MC).

Specifically, for the ML method, given the intractable nature of the likelihood functions for TCPs and GDPPs, we maximize the log-Palm likelihood (Tanaka et al. (2008)) for the TCPs and use the asymptotic approximation of the likelihood (Poinas and Lavancier (2023)) for GDPPs. For the MC method, we minimize the contrast function of the form $K(\boldsymbol{\theta})=\int_{r_{\text{min}}}^{r_{\text{max}}}|g(t;\boldsymbol{\theta})^{c}-\widehat{g}(t)^{c}|^{2}dt$ , where $g(\cdot;\boldsymbol{\theta})$ denotes the parametric pair correlation function (PCF) for the isotropic process and $\widehat{g}(\cdot)$ is an estimator of the PCF. Since the PCF of the TCP model does not include the parameter $\alpha$ , we do not include the estimation of $\alpha$ in the MC method for TCPs. Similarly, as the PCF of GDPP is solely a function of $\rho^{2}$ , we do not include the estimation of $\lambda$ in the MC method for GDPPs. Finally, following the guidelines from Biscio and Lavancier (2017) (see also Diggle (2013)), for MC methods, we choose the tuning parameters $r_{\text{min}}=0.01A$ and $r_{\text{max}}=0.25A$ , where $A\in\{10,20,40\}$ is the side length of the window, and $c=0.25$ for TCPs and $c=0.5$ for GDPPs.

Lastly, all simulations are conducted over 500 independent replications of spatial point patterns and, for each replication, we compute the three previously mentioned model parameter estimators.
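The minimum contrast criterion can be sketched as follows for the planar Thomas cluster process. The PCF used here, $g(r)=1+\exp\{-r^{2}/(4\sigma^{2})\}/(4\pi\kappa\sigma^{2})$ , is the standard closed form for $d=2$ and is not stated in this section. The estimated PCF is replaced by a noise-free stand-in, and the candidate grids are hypothetical, so the example only illustrates the mechanics of the criterion.

```python
import numpy as np

def thomas_pcf(r, kappa, sigma2):
    """Pair correlation function of a planar (d = 2) Thomas cluster process."""
    return 1.0 + np.exp(-r ** 2 / (4.0 * sigma2)) / (4.0 * np.pi * kappa * sigma2)

def contrast(kappa, sigma2, r, g_hat, c):
    """Riemann-sum approximation of the contrast integral on an equally spaced grid r."""
    diff = thomas_pcf(r, kappa, sigma2) ** c - g_hat ** c
    return np.sum(diff ** 2) * (r[1] - r[0])

A, c = 10.0, 0.25
r = np.linspace(0.01 * A, 0.25 * A, 200)          # from r_min = 0.01 A to r_max = 0.25 A
g_hat = thomas_pcf(r, kappa=0.2, sigma2=0.25)     # noise-free stand-in for an estimated PCF

# Hypothetical candidate grids; a numerical optimizer would be used in practice.
kappas = [0.1, 0.2, 0.3, 0.4]
sigma2s = [0.15, 0.25, 0.35]
best = min(((k, s) for k in kappas for s in sigma2s),
           key=lambda p: contrast(p[0], p[1], r, g_hat, c))
print(best)
```

Because the PCF depends only on $(\kappa,\sigma^{2})$ , the parameter $\alpha$ does not enter the criterion, in line with the remark above.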

7.1 Practical guidelines for the frequency domain method

We now discuss three practical issues arising during the evaluation of our estimator.

Choice of the data taper. We use the data taper $h(\boldsymbol{x})=\prod_{j=1}^{d}h_{0.025}(x_{j})$ , where for $a\in(0,1/2)$ ,

h_{a}(x)=\begin{cases}(x+0.5)/a-\frac{1}{2\pi}\sin(2\pi(x+0.5)/a),&-1/2\leq x\leq(-1/2)+a,\\ 1,&(-1/2)+a<x<(1/2)-a,\\ h_{a}(-x),&(1/2)-a\leq x\leq 1/2.\end{cases}

(7.1)

Then, it is easily seen that $h$ satisfies (3.4) and, in turn, meets the condition on $h$ in Theorem 6.1. In our simulations, the choice of $a\in(0,1/2)$ does not seem to notably impact the performance of our estimator.

Choice of $D$ . In practice, we select the prespecified domain $D\subset\mathbb{R}^{d}$ for the Whittle likelihood in (6.1) as $D=\{\boldsymbol{\omega}\in\mathbb{R}^{2}:d_{0}\leq\|\boldsymbol{\omega}\|_{\infty}\leq d_{1}\}$ for some $0\leq d_{0}<d_{1}<\infty$ . Inspecting Theorem 3.1 (also Theorem B.1 in the Appendix), we exclude the frequencies near the origin (corresponding to frequencies such that $\|\boldsymbol{\omega}\|_{\infty}<d_{0}$ ) due to the large bias of the periodogram at frequencies close to the origin. The upper bound $d_{1}\in(0,\infty)$ can be chosen such that $|f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda|\approx 0$ for $\|\boldsymbol{\omega}\|_{\infty}>d_{1}$ , ensuring that information outside $D$ contributes little to the form of the spectral density function. In the case where no information on the true spectral density function is available, $f$ can be replaced with the kernel smoothed periodogram $\widehat{f}_{n,b}$ in the selection criterion of $d_{1}$ .
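A direct implementation of the taper in (7.1) is sketched below. The last lines compute the factor $H_{h,4}/H_{h,2}^{2}$ appearing in the asymptotic variances, under the assumption (not restated in this section) that $H_{h,k}=\int_{[-1/2,1/2]^{d}}h(\boldsymbol{x})^{k}d\boldsymbol{x}$ . Function names are hypothetical.

```python
import numpy as np

def taper_1d(x, a=0.025):
    """One-dimensional taper h_a in (7.1) on [-1/2, 1/2]; zero outside that interval."""
    x = np.asarray(x, dtype=float)
    u = (0.5 - np.abs(x)) / a              # scaled distance to the nearest endpoint
    ramp = u - np.sin(2.0 * np.pi * u) / (2.0 * np.pi)
    h = np.where(u >= 1.0, 1.0, ramp)      # flat part away from the edges
    return np.where(np.abs(x) <= 0.5, h, 0.0)

def taper(points, a=0.025):
    """Product taper h(x) = prod_j h_a(x_j) for points of shape (n, d)."""
    return np.prod(taper_1d(np.asarray(points, dtype=float), a), axis=-1)

# Assumed definition: H_{h,k} is the integral of h^k over [-1/2, 1/2]^d,
# which factorizes over coordinates for a product taper.
x = np.linspace(-0.5, 0.5, 200001)
h = taper_1d(x)
step = x[1] - x[0]
H2_1d = np.sum(h ** 2) * step
H4_1d = np.sum(h ** 4) * step
d = 2
ratio = (H4_1d / H2_1d ** 2) ** d
print(ratio)
```

The symmetric formulation through the distance to the nearest endpoint reproduces all three cases of (7.1) at once. With $a=0.025$ only a thin boundary layer is tapered, so the ratio stays close to one.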

Discretization. Since the Whittle likelihood $L_{n}(\boldsymbol{\theta})$ in (6.1) is defined as an integral, we approximate $L_{n}(\boldsymbol{\theta})$ with its Riemann sum

L_{n}^{(R)}(\boldsymbol{\theta})=\sum_{\boldsymbol{\omega}_{\boldsymbol{k},\Omega}\in D}\left(\frac{\widehat{I}_{h,n}(\boldsymbol{\omega}_{\boldsymbol{k},\Omega})}{f_{\boldsymbol{\theta}}(\boldsymbol{\omega}_{\boldsymbol{k},\Omega})}+\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega}_{\boldsymbol{k},\Omega})\right),\quad n\in\mathbb{N},\quad\boldsymbol{\theta}\in\Theta\subset\mathbb{R}^{p},

(7.2)

where for $\Omega>0$ and $\boldsymbol{k}=(k_{1},k_{2})^{\top}\in\mathbb{Z}^{2}$ , $\boldsymbol{\omega}_{\boldsymbol{k},\Omega}=(2\pi k_{1}/\Omega,2\pi k_{2}/\Omega)^{\top}$ . The feasible criterion of $\widehat{\boldsymbol{\theta}}_{n}$ in (6.2) is $\widehat{\boldsymbol{\theta}}_{n}^{(R)}=\arg\min_{\boldsymbol{\theta}\in\Theta}L_{n}^{(R)}(\boldsymbol{\theta})$ . An efficient way to compute the periodograms on a grid is discussed in Appendix H.2.

In simulations, we set $\Omega=A$ , where $A>0$ is the side length of the window. As a theoretical justification, Subba Rao (2018) proved the asymptotic normality of the averaged periodogram under an irregularly spaced spatial data framework. She also showed that setting $\Omega\propto A$ is “optimal” in the sense that a finer grid ( $\Omega\gg A$ ) does not improve the variance of the averaged periodogram. However, we do not yet have theoretical results for the asymptotics of $\widehat{\boldsymbol{\theta}}_{n}^{(R)}$ in the spatial point process framework. These will be investigated in future research.
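The discretization can be sketched as follows: build the grid $\{\boldsymbol{\omega}_{\boldsymbol{k},\Omega}\}\cap D$ and evaluate the Riemann sum in (7.2). This is a minimal illustration, not the implementation used in the simulations. The function names are ours, the periodogram is a placeholder (the model spectral density itself), and the TCP spectral density $f_{\boldsymbol{\theta}}(\boldsymbol{\omega})=(2\pi)^{-2}\kappa\alpha\{1+\alpha\exp(-\sigma^{2}\|\boldsymbol{\omega}\|^{2})\}$ is an assumption obtained from the standard Thomas process PCF, since the formula is not restated in this section.

```python
import math

def frequency_grid(big_omega, d0, d1):
    """Frequencies (2*pi*k1/Omega, 2*pi*k2/Omega) whose max-norm lies in [d0, d1]."""
    step = 2.0 * math.pi / big_omega
    tol = 1e-9  # guards the boundary comparisons against rounding error
    k_max = int(math.floor(d1 / step + tol))
    grid = []
    for k1 in range(-k_max, k_max + 1):
        for k2 in range(-k_max, k_max + 1):
            norm = step * max(abs(k1), abs(k2))
            if d0 - tol <= norm <= d1 + tol:
                grid.append((step * k1, step * k2))
    return grid

def whittle_riemann(periodogram, spectral_density, grid):
    """Riemann-sum Whittle criterion: sum over the grid of I_hat / f_theta + log f_theta."""
    total = 0.0
    for w, i_hat in zip(grid, periodogram):
        f = spectral_density(w)
        total += i_hat / f + math.log(f)
    return total

def tcp_spectral_density(w, kappa, alpha, sigma2):
    """Assumed Thomas-process spectral density for d = 2 (see the lead-in)."""
    sq_norm = w[0] ** 2 + w[1] ** 2
    return kappa * alpha * (1.0 + alpha * math.exp(-sigma2 * sq_norm)) / (2.0 * math.pi) ** 2

# Window side A = 10, Omega = A, and D = {pi/10 <= max-norm <= 2*pi}.
grid = frequency_grid(big_omega=10.0, d0=math.pi / 10.0, d1=2.0 * math.pi)

# Placeholder periodogram: the spectral density at (kappa, alpha, sigma^2) = (0.2, 10, 0.25).
periodogram = [tcp_spectral_density(w, 0.2, 10.0, 0.25) for w in grid]

value_true = whittle_riemann(periodogram, lambda w: tcp_spectral_density(w, 0.2, 10.0, 0.25), grid)
value_off = whittle_riemann(periodogram, lambda w: tcp_spectral_density(w, 0.4, 10.0, 0.25), grid)
```

The periodogram values enter as a fixed list, so they are computed once and reused for every candidate $\boldsymbol{\theta}$ during optimization.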

7.2 Results under correctly specified models

In this section, we simulate the spatial point patterns from the TCP model with parameter $(\kappa_{0},\alpha_{0},\sigma_{0}^{2})=(0.2,10,0.5^{2})$ and the GDPP model with parameter $(\lambda_{0},\rho_{0}^{2})=(1,0.55^{2})$ . For spatial patterns generated by the TCPs (resp. GDPPs), we fit the parametric TCP models (resp. GDPP models) using three different estimation methods. Following the guideline in Section 7.1, we set the prespecified domain $D_{2\pi}=\{\boldsymbol{\omega}\in\mathbb{R}^{2}:\frac{1}{10}\pi\leq\|\boldsymbol{\omega}\|_{\infty}\leq 2\pi\}$ in our estimator for both TCP and GDPP. This choice captures the shape of the spectral densities without adding unnecessary computation.
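A quick check of the choice $d_{1}=2\pi$ for the TCP can be sketched as below. It again assumes the Thomas-process spectral density $f_{\boldsymbol{\theta}}(\boldsymbol{\omega})=(2\pi)^{-2}\kappa\alpha\{1+\alpha\exp(-\sigma^{2}\|\boldsymbol{\omega}\|^{2})\}$ , which is not restated in this section. Under that form, the relative distance to the asymptote $(2\pi)^{-2}\lambda$ is $\alpha\exp(-\sigma^{2}\|\boldsymbol{\omega}\|^{2})$ , and since $\|\boldsymbol{\omega}\|\geq\|\boldsymbol{\omega}\|_{\infty}$ it is bounded by $\alpha\exp(-\sigma^{2}d_{1}^{2})$ outside $D$ . The function name is illustrative.

```python
import math

def tcp_relative_excess(alpha, sigma2, d1):
    """Bound on |f(w) - (2*pi)^-2 * lambda| / ((2*pi)^-2 * lambda) for max-norm(w) > d1,
    under the assumed Thomas-process spectral density."""
    return alpha * math.exp(-sigma2 * d1 ** 2)

# True TCP parameters of this section: alpha_0 = 10, sigma_0^2 = 0.5^2.
excess_2pi = tcp_relative_excess(alpha=10.0, sigma2=0.25, d1=2.0 * math.pi)
```

Under these assumptions the bound is well below $10^{-3}$ , so frequencies outside $D_{2\pi}$ carry very little information about the shape of the TCP spectral density.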

The bias and standard errors of the three methods are presented in Table 1. See also Figures H.2 and H.3 in the Appendix for the empirical distributions of the estimators.

Model	Window	Parameter	Ours	ML	MC
TCP	$[-5,5]^{2}$	$\kappa$	-0.04(0.11)	-0.07(0.70)	-0.04(0.10)
		$\alpha$	0.72(3.52)	-0.62(9.51)	—
		$\sigma^{2}$	0.02(0.07)	-0.06(0.35)	0.01(0.10)
		Time(sec)	0.74	0.38	0.07
	$[-10,10]^{2}$	$\kappa$	-0.02(0.05)	-0.02(0.05)	-0.01(0.05)
		$\alpha$	0.60(1.77)	0.24(3.37)	—
		$\sigma^{2}$	0.01(0.04)	-0.02(0.20)	0.00(0.06)
		Time(sec)	2.38	5.67	0.23
	$[-20,20]^{2}$	$\kappa$	-0.01(0.04)	0.00(0.03)	0.00(0.03)
		$\alpha$	0.25(1.03)	0.15(1.23)	—
		$\sigma^{2}$	0.01(0.02)	0.00(0.03)	0.00(0.03)
		Time(sec)	9.15	173.66	1.95
GDPP	$[-5,5]^{2}$	$\lambda$	0.00(0.10)	0.03(0.07)	—
		$\rho^{2}$	0.01(0.09)	-0.02(0.03)	0.05(0.07)
		Time(sec)	0.14	1.84	0.05
	$[-10,10]^{2}$	$\lambda$	0.00(0.06)	0.01(0.04)	—
		$\rho^{2}$	0.01(0.04)	-0.01(0.01)	0.02(0.03)
		Time(sec)	0.39	30.70	0.08
	$[-20,20]^{2}$	$\lambda$	0.00(0.03)	0.01(0.02)	—
		$\rho^{2}$	0.00(0.02)	0.00(0.01)	0.00(0.02)
		Time(sec)	1.62	590.02	0.67

Table 1: The bias and the standard errors (in parentheses) of the estimated parameters based on three different approaches for the TCP and the GDPP. The true parameters are $(\kappa_{0},\alpha_{0},\sigma_{0}^{2})=(0.2,10,0.5^{2})$ for TCP and $(\lambda_{0},\rho_{0}^{2})=(1,0.55^{2})$ for GDPP. The time is the average computational time of each method per simulation over 500 independent replications (using parallel computing in R on a desktop computer with an Intel i7-10700 CPU). We use bold to denote the smallest RMSE.

First, we examine the accuracy of the estimators. Our estimator exhibits the smallest root-mean-squared error (RMSE) for $\alpha$ and $\sigma^{2}$ in the TCP model across all windows; has the smallest RMSE for $\kappa$ when $D_{n}=[-5,5]^{2}$ ; and gives reliable estimation results for the GDPP model. The MC estimator performs well for both the TCP and GDPP models, achieving the smallest RMSE for $\kappa$ in the TCP model for $D_{n}=[-10,10]^{2}$ and $[-20,20]^{2}$ . The ML estimator consistently performs the best for the GDPP model. For the TCP model, the ML estimators of $\kappa$ and $\sigma^{2}$ yield satisfactory finite sample results, but that of $\alpha$ exhibits a relatively large standard error. Overall, the biases and standard errors of all three estimators tend to decrease to zero as the window size increases.

Moving on, we consider the computation time of each method. Firstly, the MC method has the fastest computation time for both models and all windows. Secondly, the ML method exhibits a reasonable computation speed for both models when $D_{n}=[-5,5]^{2}$ , for which the expected number of observations is 200 (for TCP) or 100 (for GDPP). However, when the number of observations is on the order of a few thousand (corresponding to $D_{n}=[-20,20]^{2}$ ), the ML method incurs the longest computational time. Lastly, our method takes less than 10 seconds to compute the parameter estimates for the TCP model and less than 2 seconds for the GDPP model, both under $D_{n}=[-20,20]^{2}$ . Specifically, the computation for our method involves two steps: (a) evaluation of $\{\widehat{I}_{h,n}(\boldsymbol{\omega}_{\boldsymbol{k},A}):\boldsymbol{\omega}_{\boldsymbol{k},A}\in D_{2\pi}\}$ and (b) optimization of $L_{n}^{(R)}(\boldsymbol{\theta})$ in (7.2). Once step (a) is done, there is no need to update the set of periodograms in the optimization step. In the simulations, it takes, on average, less than 0.5 seconds to evaluate $\{\widehat{I}_{h,n}(\boldsymbol{\omega}_{\boldsymbol{k},40}):\boldsymbol{\omega}_{\boldsymbol{k},40}\in D_{2\pi}\}$ for both TCP and GDPP. This indicates that most of the computational burden of our method stems from the optimization of the Whittle likelihood. By employing a coarser grid in (7.2), i.e., using $\Omega=A/2$ instead of $\Omega=A$ , the computational time can be reduced substantially.

As a final note, bear in mind that the MC method estimates fewer parameters than our method or the ML method: two for the TCP ( $\kappa$ and $\sigma^{2}$ ) and one for the GDPP ( $\rho^{2}$ ). Additionally, the contrast function of the MC method for both TCP and GDPP is specifically designed for isotropic processes. Therefore, for the models we consider in these simulations, the MC method clearly holds a computational advantage over both our method and the ML method.

7.3 Results under misspecified models

Now, we consider the case when the models fail to identify the true point patterns. For the data-generating process, we simulate from the LGCP model driven by the latent intensity field $\Lambda(\boldsymbol{x})$ , $\boldsymbol{x}\in\mathbb{R}^{2}$ , where the first-order intensity is $\lambda^{(true)}=\exp(0.5)\approx 1.65$ and the covariance function is $R(\boldsymbol{x})=\mathrm{cov}(\log\Lambda(\boldsymbol{x}),\log\Lambda(\textbf{0}))=2\exp(-\|\boldsymbol{x}\|)$ , $\boldsymbol{x}\in\mathbb{R}^{2}$ . Following the arguments in Section 5.3, the true spectral density function $f(\boldsymbol{\omega})$ , $\boldsymbol{\omega}\in\mathbb{R}^{2}$ , is given by

f(\boldsymbol{\omega})=(2\pi)^{-d}\left[\lambda^{(true)}+(\lambda^{(true)})^{2}\int_{\mathbb{R}^{2}}\left(\exp\{2\exp(-\|\boldsymbol{x}\|)\}-1\right)e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}}d\boldsymbol{x}\right].

(7.3)
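For numerical evaluation of (7.3), it may help to note that the integrand depends on $\boldsymbol{x}$ only through $\|\boldsymbol{x}\|$ , so the two-dimensional Fourier integral reduces to a one-dimensional Hankel transform. The following is a sketch of this standard reduction, where $r=\|\boldsymbol{x}\|$ and $\mathrm{J}_{0}$ denotes the Bessel function of the first kind of order zero (notation introduced here, not to be confused with the DFT $J_{h,n}$ ):

```latex
\int_{\mathbb{R}^{2}}\left(\exp\{2\exp(-\|\boldsymbol{x}\|)\}-1\right)e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}}d\boldsymbol{x}
=2\pi\int_{0}^{\infty}\left(\exp\{2\exp(-r)\}-1\right)\mathrm{J}_{0}(r\|\boldsymbol{\omega}\|)\,r\,dr,
\quad\boldsymbol{\omega}\in\mathbb{R}^{2}.
```

In particular, $f(\boldsymbol{\omega})$ depends on $\boldsymbol{\omega}$ only through $\|\boldsymbol{\omega}\|$ , and the right-hand side can be approximated by one-dimensional quadrature.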

In each simulation, we fit the TCP model with parameter $\boldsymbol{\theta}=(\kappa,\alpha,\sigma^{2})^{\top}$ . To examine the effect of the selection of the prespecified domain, we use $D_{2\pi}=\{\boldsymbol{\omega}\in\mathbb{R}^{2}:\frac{1}{10}\pi\leq\|\boldsymbol{\omega}\|_{\infty}\leq 2\pi\}$ and $D_{5\pi}=\{\boldsymbol{\omega}\in\mathbb{R}^{2}:\frac{1}{10}\pi\leq\|\boldsymbol{\omega}\|_{\infty}\leq 5\pi\}$ when evaluating our estimator.

Next, we consider the best fitting TCP model. The best fitting TCP parameter is given by $\boldsymbol{\theta}_{0}(D,A)=\arg\min_{\boldsymbol{\theta}\in\Theta}\mathcal{L}^{(R)}(\boldsymbol{\theta})$ , where

\mathcal{L}^{(R)}(\boldsymbol{\theta})=\sum_{\boldsymbol{\omega}_{\boldsymbol{k},A}\in D}\left(\frac{f(\boldsymbol{\omega}_{\boldsymbol{k},A})}{f_{\boldsymbol{\theta}}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})}+\log f_{\boldsymbol{\theta}}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})\right),\quad\boldsymbol{\theta}\in\Theta,

(7.4)

is the Riemann sum analogue of $\mathcal{L}(\boldsymbol{\theta})$ in (6.3). Figure 2 illustrates, on the log scale, the true spectral density ( $f$ ; solid line) and the best fitting TCP spectral density functions $f_{\boldsymbol{\theta}}^{(TCP)}$ for $\boldsymbol{\theta}=\boldsymbol{\theta}_{0}(D_{2\pi},10)$ (dashed line) and $\boldsymbol{\theta}=\boldsymbol{\theta}_{0}(D_{5\pi},10)$ (dotted line). We note that the best fitting TCP spectral densities evaluated on the two domains ( $D_{2\pi}$ and $D_{5\pi}$ ) have distinct characteristics. In detail, $f_{\boldsymbol{\theta}}^{(TCP)}$ for $\boldsymbol{\theta}=\boldsymbol{\theta}_{0}(D_{2\pi},10)$ captures the peak and the curvature of the true spectral density more accurately but fails to identify the true asymptote (horizontal line with amplitude $(2\pi)^{-2}\lambda^{(true)}$ ). On the other hand, $f_{\boldsymbol{\theta}}^{(TCP)}$ for $\boldsymbol{\theta}=\boldsymbol{\theta}_{0}(D_{5\pi},10)$ successfully captures the asymptote of the true density but underestimates the power near the origin.
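The sense in which $\boldsymbol{\theta}_{0}(D,A)$ is "best fitting" can be made explicit by rewriting (7.4). The following elementary identity is a sketch of this step, using only $x-\log x-1\geq 0$ for $x>0$ with equality if and only if $x=1$ :

```latex
\mathcal{L}^{(R)}(\boldsymbol{\theta})
=\sum_{\boldsymbol{\omega}_{\boldsymbol{k},A}\in D}\left(\frac{f(\boldsymbol{\omega}_{\boldsymbol{k},A})}{f_{\boldsymbol{\theta}}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})}-\log\frac{f(\boldsymbol{\omega}_{\boldsymbol{k},A})}{f_{\boldsymbol{\theta}}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})}-1\right)
+\sum_{\boldsymbol{\omega}_{\boldsymbol{k},A}\in D}\left(\log f(\boldsymbol{\omega}_{\boldsymbol{k},A})+1\right).
```

The second sum does not depend on $\boldsymbol{\theta}$ , and each term of the first sum is nonnegative and vanishes only where $f_{\boldsymbol{\theta}}^{(TCP)}=f$ . Hence $\boldsymbol{\theta}_{0}(D,A)$ minimizes a nonnegative discrepancy between $f$ and $f_{\boldsymbol{\theta}}^{(TCP)}$ over the grid frequencies in $D$ only, which is why the best fitting parameter depends on the choice of $D$ .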

Table 2 below summarizes the parameter estimation results. The empirical distribution of each estimator can be found in Figure H.4 in the Appendix. We also report the estimated first-order intensity $\lambda=\kappa\alpha$ .

Window	Par.	Best Par. ( $D_{2\pi}$ )	Best Par. ( $D_{5\pi}$ )	Ours ( $D_{2\pi}$ )	Ours ( $D_{5\pi}$ )	ML	MC
$[-5,5]^{2}$	$\kappa$	0.32	0.25	0.38(0.23)	0.38(0.23)	0.43(1.73)	0.25(0.28)
	$\alpha$	7.46	7.08	7.49(7.75)	6.41(5.19)	14.96(19.59)	—
	$\sigma^{2}$	0.17	0.10	0.32(0.80)	0.16(0.28)	0.34(0.34)	0.24(0.17)
	$\lambda=\kappa\alpha$	2.38	1.79	2.16(1.34)	1.76(0.75)	1.49(0.53)	—
	Time(sec)	—	—	0.61	3.43	0.21	0.07
$[-10,10]^{2}$	$\kappa$	0.31	0.24	0.36(0.13)	0.30(0.10)	0.12(0.07)	0.13(0.06)
	$\alpha$	7.74	7.37	7.33(3.79)	6.88(3.69)	20.02(18.46)	—
	$\sigma^{2}$	0.18	0.10	0.20(0.08)	0.11(0.05)	0.48(0.53)	0.36(0.25)
	$\lambda=\kappa\alpha$	2.43	1.80	2.36(0.92)	1.79(0.43)	1.60(0.31)	—
	Time(sec)	—	—	1.68	11.13	2.99	0.16
$[-20,20]^{2}$	$\kappa$	0.32	0.25	0.34(0.08)	0.27(0.06)	0.09(0.03)	0.09(0.03)
	$\alpha$	7.63	7.13	7.53(2.21)	7.16(2.36)	20.44(10.52)	—
	$\sigma^{2}$	0.18	0.10	0.19(0.04)	0.11(0.03)	0.41(0.20)	0.46(0.19)
	$\lambda=\kappa\alpha$	2.44	1.80	2.44(0.59)	1.81(0.24)	1.64(0.17)	—
	Time(sec)	—	—	5.17	40.45	71.16	1.36

Table 2: The average and the standard errors (in parentheses) of the estimated parameters for the misspecified LGCP fitting with the TCP model. The best fitting parameters are calculated by minimizing $\mathcal{L}^{(R)}(\boldsymbol{\theta})$ in (7.4). When evaluating our estimator, we use two different prespecified domains, $D_{2\pi}$ and $D_{5\pi}$ .

We first discuss the best fitting parameters. As already shown in Figure 2 above, the best fitting parameters differ substantially between $D_{2\pi}$ and $D_{5\pi}$ . Surprisingly, the first-order intensity of the best fitting TCP model on $D_{2\pi}$ is $2.38$ , which deviates substantially from the true first-order intensity ( $\approx 1.65$ ). The reason for this discrepancy is that the true spectral density function $f(\boldsymbol{\omega})$ for $\boldsymbol{\omega}\in D_{2\pi}$ is still far from its asymptotic value $(2\pi)^{-2}\lambda^{(true)}$ . Therefore, the information contained in $\{\widehat{I}_{h,n}(\boldsymbol{\omega}):\boldsymbol{\omega}\in D_{2\pi}\}$ is insufficient for estimating the true first-order intensity. As a remedy for the discrepancy between the fitted and the true first-order intensity, one may fit the “reduced” TCP model. Specifically, we can fit the TCP model with the parameter constraint $\alpha=\widehat{\lambda}_{n}/\kappa$ , where $\widehat{\lambda}_{n}=N_{X}(D_{n})/|D_{n}|$ is the unbiased estimator of $\lambda^{(\text{true})}$ . One advantage of this constraint is that the fitted first-order intensity of the reduced TCP model equals $\widehat{\lambda}_{n}$ and therefore estimates the true first-order intensity without bias. Please refer to Appendix H.4 for details on the construction of the reduced TCP model and its parameter fitting results.
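The parameter constraint of the reduced TCP model amounts to a one-line computation. The sketch below uses a hypothetical point count purely for illustration; the function name is ours, and the details of the reduced model are those of Appendix H.4, not of this snippet.

```python
def reduced_tcp_alpha(n_points, window_side, kappa):
    """alpha implied by the constraint alpha = lambda_hat / kappa on a square window."""
    lambda_hat = n_points / window_side ** 2  # N_X(D_n) / |D_n|
    return lambda_hat / kappa

# Hypothetical example: 165 points observed on [-5, 5]^2 and a candidate kappa = 0.25.
alpha_reduced = reduced_tcp_alpha(n_points=165, window_side=10.0, kappa=0.25)
```

With this constraint, only $\kappa$ and $\sigma^{2}$ remain free in the optimization, and the product $\kappa\alpha$ always equals $\widehat{\lambda}_{n}$ .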

Next, we examine the estimation outcomes from the different methods. As the window size increases, our estimators evaluated on $D_{2\pi}$ and $D_{5\pi}$ converge toward the corresponding best fitting parameters. These results are consistent with the asymptotic behavior of our estimator in Theorem 6.1. For the ML estimator, the standard errors of the estimated $\kappa$ and $\sigma^{2}$ decrease as the window size increases. However, the standard error of the estimated $\alpha$ remains large even when the window size is large. It is observed that $\sigma^{2}$ tends to converge to a fixed value (approximately 1.6), but the estimated $\alpha$ value (resp. estimated $\kappa$ value) tends to increase (resp. decrease) as $D_{n}$ increases. It is intriguing that the ML estimator estimates the true first-order intensity of the process even when the model is misspecified. However, to the best of our knowledge, there are no theoretical results available for the ML estimator under model misspecification. Lastly, the standard errors of the MC estimators decrease as the window increases. However, based on the results in Table 2, it is not clear whether the MC estimator under model misspecification converges to a fixed parameter that neither shrinks to zero nor diverges.

Lastly, our estimator based on the prespecified domain $D_{2\pi}$ is reasonably fast even for the largest window $D_{n}=[-20,20]^{2}$ . However, when the prespecified domain is $D_{5\pi}$ , the computation can take up to 40 seconds per simulation. This is because, when computing $L_{n}^{(R)}(\boldsymbol{\theta})$ , the number of grid points in $D_{5\pi}$ is about $(5/2)^{2}=6.25$ times that in $D_{2\pi}$ . To reduce the computation time, one may use a coarser grid on $D_{5\pi}$ , such as $\Omega=A/2$ .
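The grid sizes behind this comparison can be counted directly. The following sketch (function name ours) counts the frequencies $\boldsymbol{\omega}_{\boldsymbol{k},\Omega}$ with $d_{0}\leq\|\boldsymbol{\omega}_{\boldsymbol{k},\Omega}\|_{\infty}\leq d_{1}$ by working with the integer index $\boldsymbol{k}$ , assuming the boundaries of $D$ are included as in the definitions of $D_{2\pi}$ and $D_{5\pi}$ .

```python
import math

def count_grid_points(big_omega, d0, d1):
    """Number of k in Z^2 with d0 <= (2*pi/Omega) * max(|k1|, |k2|) <= d1."""
    step = 2.0 * math.pi / big_omega
    tol = 1e-9  # guards the boundary cases against rounding error
    k_hi = int(math.floor(d1 / step + tol))  # largest max|k_i| inside the outer square
    k_lo = int(math.ceil(d0 / step - tol))   # smallest max|k_i| that is not excluded
    outer = (2 * k_hi + 1) ** 2
    inner = (2 * (k_lo - 1) + 1) ** 2 if k_lo >= 1 else 0  # excluded square near the origin
    return outer - inner

# Largest window, A = 40, with Omega = A.
n_2pi = count_grid_points(40.0, math.pi / 10.0, 2.0 * math.pi)
n_5pi = count_grid_points(40.0, math.pi / 10.0, 5.0 * math.pi)

# Coarser grid Omega = A / 2 on the larger domain.
n_5pi_coarse = count_grid_points(20.0, math.pi / 10.0, 5.0 * math.pi)
```

Halving $\Omega$ doubles the grid spacing in each coordinate, so the number of terms in the Riemann sum drops by a factor of roughly four.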

8 Concluding remarks and possible extensions

In this article, we study frequency domain estimation and inferential methods for spatial point processes. We show that the DFTs for spatial point processes still satisfy asymptotic joint Gaussianity, which is a classical result for DFTs in time series and spatial statistics. Our approach accommodates irregularly scattered point patterns, so the fast Fourier transform algorithm is not applicable for evaluating the DFTs on a grid. Nevertheless, our simulations indicate that the DFT-based model parameter estimation method remains computationally attractive with satisfactory finite sample performance. The advantage of our method becomes more pronounced when fitting a misspecified model. We prove that our proposed model parameter estimator is asymptotically Gaussian, estimating the “best” fitting parameter that minimizes the spectral divergence between the true and conjectured spectra. According to our simulation results, our method appears to be the only one of the three that exhibits satisfactory large sample properties under model misspecification, distinguishing itself from the other two spatial domain methods: the likelihood-based method and the least squares method.

We anticipate that our frequency domain approaches can be extended to multivariate point processes. In the multivariate case, one needs to consider the “joint” higher-order intensity and cumulant intensity functions, as introduced in Zhu et al. (2023), Section 2. Additionally, the sets of assumptions presented in this paper need appropriate reformulation (see Section 4.1 of the same reference). In Appendix I, we show that our DFT-based approaches can also be extended to the class of inhomogeneous processes, but a significant portion of the theoretical development of frequency domain methods for inhomogeneous processes remains open. This will be a good avenue for future research.

Acknowledgments

JY acknowledges the support of Taiwan’s National Science and Technology Council (grants 110-2118-M-001-014-MY3 and 113-2118-M-001-012). The authors thank Hsin-Cheng Huang and Suhasini Subba Rao for fruitful comments and suggestions, and Qi-Wen Ding for assistance with simulations. The authors also wish to thank the two anonymous referees and editors for their valuable comments and corrections, which have greatly improved the article in all aspects.

References

Baddeley et al. (2000) A. J. Baddeley, J. Møller, and R. Waagepetersen. Non-and semi-parametric estimation of interaction in inhomogeneous point patterns. Stat. Neerl., 54(3):329–350, 2000. doi: 10.1111/1467-9574.00144.
Bandyopadhyay and Lahiri (2009) S. Bandyopadhyay and S. N. Lahiri. Asymptotic properties of discrete Fourier transforms for spatial data. Sankhya A, 71(2):221–259, 2009.
Bartlett (1964) M. S. Bartlett. The spectral analysis of two-dimensional point processes. Biometrika, 51(3/4):299–311, 1964. doi: 10.2307/2334136.
Biscio and Lavancier (2016) C. A. N. Biscio and F. Lavancier. Brillinger mixing of determinantal point processes and statistical applications. Electron. J. Stat., 10(1):582–607, 2016. doi: 10.1214/16-EJS1116.
Biscio and Lavancier (2017) C. A. N. Biscio and F. Lavancier. Contrast estimation for parametric stationary determinantal point processes. Scand. J. Stat., 44(1):204–229, 2017. doi: 10.1111/sjos.12249.
Biscio and Waagepetersen (2019) C. A. N. Biscio and R. Waagepetersen. A general central limit theorem and a subsampling variance estimator for $\alpha$ -mixing point processes. Scand. J. Stat., 46(4):1168–1190, 2019. doi: 10.1111/sjos.12389.
Brillinger (1981) D. R. Brillinger. Time series: Data Analysis and Theory. Holden-Day, INC., San Francisco, CA, 1981. Expanded edition.
Chandler (1997) R. E. Chandler. A spectral method for estimating parameters in rainfall models. Bernoulli, 3(3):301–322, 1997. doi: 10.2307/3318594.
Cheysson and Lang (2022) F. Cheysson and G. Lang. Spectral estimation of Hawkes processes from count data. Ann. Statist., 50(3):1722–1746, 2022. doi: 10.1214/22-AOS2173.
Choiruddin et al. (2021) A. Choiruddin, J.-F. Coeurjolly, and R. Waagepetersen. Information criteria for inhomogeneous spatial point processes. Aust. N. Z. J. Stat., 63(1):119–143, 2021. doi: 10.1111/anzs.12327.
Dahlhaus (1997) R. Dahlhaus. Fitting time series models to nonstationary processes. Ann. Statist., 25(1):1–37, 1997. doi: 10.1214/aos/1034276620.
Dahlhaus and Künsch (1987) R. Dahlhaus and H. Künsch. Edge effects and efficient parameter estimation for stationary random fields. Biometrika, 74(4):877–882, 1987. doi: 10.1093/biomet/74.4.877.
Dahlhaus and Sahm (2000) R. Dahlhaus and M. Sahm. Likelihood methods for nonstationary time series and random fields. Resenhas IME-USP (São Paulo J. Math. Sci.), 4(4):457–477, 2000.
Dahlhaus and Wefelmeyer (1996) R. Dahlhaus and W. Wefelmeyer. Asymptotically optimal estimation in misspecified time series models. Ann. Statist., 24(3):952–974, 1996. doi: 10.1214/aos/1032526951.
Daley and Vere-Jones (2003) D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods. Springer, New York City, NY., 2003. doi: 10.1007/b97277. second edition.
D’Angelo et al. (2022) N. D’Angelo, G. Adelfio, A. Abbruzzo, and J. Mateu. Inhomogeneous spatio-temporal point processes on linear networks for visitors’ stops data. Ann. Appl. Stat., 16(2):791–815, 2022. doi: 10.1214/21-AOAS1519.
Diggle (2013) P. J. Diggle. Statistical Analysis of Spatial and Spatio-Temporal Point Patterns. CRC press, New York, NY, 2013. doi: 10.1201/b15326. 3rd edition.
Ding et al. (2024) Q.-W. Ding, J. Shin, and J. Yang. Pseudo-spectra of multivariate inhomogeneous spatial point processes. Technical report, 2024.
Doukhan (1994) P. Doukhan. Mixing: Properties and Examples. Springer, New York City, NY., 1994. doi: 10.1007/978-1-4612-2642-0.
Folland (1999) G. B. Folland. Real Analysis: Modern Techniques and Their Applications, volume 40. John Wiley & Sons, New York, NY, 1999. 2nd edition.
Gabriel and Diggle (2009) E. Gabriel and P. J. Diggle. Second-order analysis of inhomogeneous spatio-temporal point process data. Stat. Neerl., 63(1):43–51, 2009. doi: 10.1111/j.1467-9574.2008.00407.x.
Guan and Sherman (2007) Y. Guan and M. Sherman. On least squares fitting for stationary spatial point processes. J. R. Stat. Soc. Ser. B. Stat. Methodol., 69(1):31–49, 2007. doi: 10.1111/j.1467-9868.2007.00575.x.
Guillaumin et al. (2022) A. P. Guillaumin, A. M. Sykulski, S. C. Olhede, and F. J. Simons. The debiased spatial Whittle likelihood. J. R. Stat. Soc. Ser. B. Stat. Methodol., 84(4):1526–1557, 2022. doi: 10.1111/rssb.12539.
Guyon (1982) X. Guyon. Parameter estimation for a stationary process on a $d$ -dimensional lattice. Biometrika, 69(1):95–105, 1982. doi: 10.2307/2335857.
Hawkes (1971a) A. G. Hawkes. Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1):83–90, 1971a. doi: 10.2307/2334319.
Hawkes (1971b) A. G. Hawkes. Point spectra of some mutually exciting point processes. J. R. Stat. Soc. Ser. B. Stat. Methodol., 33(3):438–443, 1971b. doi: 10.1111/j.2517-6161.1971.tb01530.x.
Illian et al. (2008) J. Illian, A. Penttinen, H. Stoyan, and D. Stoyan. Statistical Analysis and Modelling of Spatial Point Patterns. John Wiley & Sons, Hoboken, NJ., 2008. doi: 10.1002/9780470725160.
Jovanović et al. (2015) S. Jovanović, J. Hertz, and S. Rotter. Cumulants of Hawkes point processes. Phys. Rev. E, 91(4):042802, 2015. doi: 10.1103/PhysRevE.91.042802.
Macchi (1975) O. Macchi. The coincidence approach to stochastic point processes. Adv. in Appl. Probab., 7(1):83–122, 1975. doi: 10.1017/s0001867800040313.
Matsuda and Yajima (2009) Y. Matsuda and Y. Yajima. Fourier analysis of irregularly spaced data on $R^{d}$ . J. R. Stat. Soc. Ser. B. Stat. Methodol., 71(1):191–217, 2009. doi: 10.1111/j.1467-9868.2008.00685.x.
Møller and Waagepetersen (2004) J. Møller and R. Waagepetersen. Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton, FL, 2004. doi: 10.1201/9780203496930.
Møller et al. (1998) J. Møller, A. R. Syversveen, and R. P. Waagepetersen. Log Gaussian Cox processes. Scand. J. Stat., 25(3):451–482, 1998. doi: 10.1111/1467-9469.00115.
Mugglestone and Renshaw (1996) M. A. Mugglestone and E. Renshaw. A practical guide to the spectral analysis of spatial point processes. Comput. Statist. Data Anal., 21(1):43–65, 1996. doi: 10.1016/0167-9473(95)00007-0.
Newey (1991) W. K. Newey. Uniform convergence in probability and stochastic equicontinuity. Econometrica, 59(4):1161–1167, 1991. doi: 10.2307/2938179.
Neyman and Scott (1958) J. Neyman and E. L. Scott. Statistical approach to problems of cosmology. J. R. Stat. Soc. Ser. B. Stat. Methodol., 20(1):1–29, 1958. doi: 10.1111/j.2517-6161.1958.tb00272.x.
Parzen (1957) E. Parzen. On consistent estimates of the spectrum of a stationary time series. Ann. Math. Stat., 28(2):329–348, 1957. doi: 10.1214/aoms/1177706962.
Pawlas (2009) Z. Pawlas. Empirical distributions in marked point processes. Stochastic Process. Appl., 119(12):4194–4209, 2009. doi: 10.1016/j.spa.2009.10.002.
Poinas and Lavancier (2023) A. Poinas and F. Lavancier. Asymptotic approximation of the likelihood of stationary determinantal point processes. Scand. J. Stat., 50(2):842–874, 2023. doi: 10.1111/sjos.12613.
Poinas et al. (2019) A. Poinas, B. Delyon, and F. Lavancier. Mixing properties and central limit theorem for associated point processes. Bernoulli, 25(3):1724–1754, 2019. doi: 10.3150/18-BEJ1033.
Prokešová and Jensen (2013) M. Prokešová and E. V. B. Jensen. Asymptotic Palm likelihood theory for stationary point processes. Ann. Inst. Statist. Math., 65(2):387–412, 2013. doi: 10.1007/s10463-012-0376-7.
Rajala et al. (2023) T. A. Rajala, S. C. Olhede, J. P. Grainger, and D. J. Murrell. What is the Fourier transform of a spatial point process? IEEE Trans. Inform. Theory, 69(8):5219–5252, 2023. doi: 10.1109/TIT.2023.3269514.
Rosenblatt (1956) M. Rosenblatt. A central limit theorem and a strong mixing condition. Proc. Natl. Acad. Sci. USA, 42(1):43–47, 1956. doi: 10.1073/pnas.42.1.43.
Shlomovich et al. (2022) L. Shlomovich, E. A. K. Cohen, N. Adams, and L. Patel. Parameter estimation of binned Hawkes processes. J. Comput. Graph. Statist., 31(4):990–1000, 2022. doi: 10.1080/10618600.2022.2050247.
Subba Rao (2018) S. Subba Rao. Statistical inference for spatial statistics defined in the Fourier domain. Ann. Statist., 46(2):469–499, 2018. doi: 10.1214/17-AOS1556.
Subba Rao and Yang (2021) S. Subba Rao and J. Yang. Reconciling the Gaussian and Whittle likelihood with an application to estimation in the frequency domain. Ann. Statist., 49(5):2774–2802, 2021. doi: 10.1214/21-AOS2055.
Tanaka et al. (2008) U. Tanaka, Y. Ogata, and D. Stoyan. Parameter estimation and model selection for Neyman-Scott point processes. Biom. J., 50(1):43–57, 2008. doi: 10.1002/bimj.200610339.
Tukey (1967) J. W. Tukey. An introduction to the calculations of numerical spectrum analysis. In Spectral Analysis of Time Series (Proc. Advanced Sem., Madison, Wis., 1966), pages 25–46. John Wiley, New York, 1967.
Waagepetersen and Guan (2009) R. Waagepetersen and Y. Guan. Two-step estimation for inhomogeneous spatial point processes. J. R. Stat. Soc. Ser. B. Stat. Methodol., 71(3):685–702, 2009. doi: 10.1111/j.1467-9868.2008.00702.x.
Warton and Shepherd (2010) D. I. Warton and L. C. Shepherd. Poisson point process models solve the “pseudo-absence problem” for presence-only data in ecology. Ann. Appl. Stat., 4(3):1383–1402, 2010. doi: 10.1214/10-aoas331.
Whittle (1953) P. Whittle. The analysis of multiple stationary time series. J. R. Stat. Soc. Ser. B. Stat. Methodol., 15:125–139, 1953. doi: 10.1111/j.2517-6161.1953.tb00131.x.
Yang and Guan (2024) J. Yang and Y. Guan. Supplement to “Fourier analysis of spatial point processes”. Technical report, 2024.
Zhu et al. (2023) L. Zhu, J. Yang, M. Jun, and S. Cook. On minimum contrast method for multivariate spatial point processes. arXiv preprint arXiv:2208.07044, 2023.
Zhuang et al. (2004) J. Zhuang, Y. Ogata, and D. Vere-Jones. Analyzing earthquake clustering features by using stochastic reconstruction. J. Geophys. Res. Solid Earth, 109(B05301), 2004. doi: 10.1029/2003JB002879.

Appendix A Proof of Theorem 4.1

A.1 Equivalence between the feasible and infeasible criteria

Let

\widetilde{A}_{h,n}(\phi)=\int_{D}\phi(\boldsymbol{\omega})I_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega},\quad n\in\mathbb{N},

(A.1)

be the infeasible (theoretical) counterpart of the integrated periodogram $\widehat{A}_{h,n}(\phi)$ in (4.2), in which the feasible periodogram $\widehat{I}_{h,n}$ is replaced with $I_{h,n}$ . In the theorem below, we show that $|D_{n}|^{1/2}(\widehat{A}_{h,n}(\phi)-A(\phi))$ and $|D_{n}|^{1/2}(\widetilde{A}_{h,n}(\phi)-A(\phi))$ are asymptotically equivalent. Therefore, both statistics share the same asymptotic distribution.

Theorem A.1.

Let $X$ be a second-order stationary point process on $\mathbb{R}^{d}$ . Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), and 3.4(i) hold. Then,

|D_{n}|^{1/2}(\widehat{A}_{h,n}(\phi)-A(\phi))-|D_{n}|^{1/2}(\widetilde{A}_{h,n}(\phi)-A(\phi))\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}0,\quad n\rightarrow\infty,

where $\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}$ denotes convergence in $L_{2}$ .

Proof. Let $K_{h,n}(\boldsymbol{\omega})=c_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega})+c_{h,n}(-\boldsymbol{\omega})J_{h,n}(\boldsymbol{\omega})$ , $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , and let

R_{1}(\boldsymbol{\omega})=-(\widehat{\lambda}_{h,n}-\lambda)K_{h,n}(\boldsymbol{\omega})~{}~{}\text{and}~{}~{}R_{2}(\boldsymbol{\omega})=|c_{h,n}(\boldsymbol{\omega})|^{2}(\widehat{\lambda}_{h,n}-\lambda)^{2},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

Then, we have $\widehat{I}_{h,n}(\boldsymbol{\omega})-I_{h,n}(\boldsymbol{\omega})=R_{1}(\boldsymbol{\omega})+R_{2}(\boldsymbol{\omega})$ , $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Therefore, the difference between the feasible integrated periodogram and its theoretical counterpart can be expressed as

\begin{aligned}|D_{n}|^{1/2}(\widehat{A}_{h,n}(\phi)-\widetilde{A}_{h,n}(\phi))&=|D_{n}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(\widehat{I}_{h,n}(\boldsymbol{\omega})-I_{h,n}(\boldsymbol{\omega})\right)d\boldsymbol{\omega}\\&=S_{1}+S_{2},\end{aligned}

where $S_{i}=|D_{n}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})R_{i}(\boldsymbol{\omega})d\boldsymbol{\omega}$ , $i\in\{1,2\}$ . By Theorem E.2 below, both $S_{1}$ and $S_{2}$ converge to zero in $L_{2}$ as $n\rightarrow\infty$ . Thus, we get the desired result. $\Box$
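For completeness, here is a sketch of where the decomposition $\widehat{I}_{h,n}-I_{h,n}=R_{1}+R_{2}$ comes from. It assumes, as the forms of $R_{1}$ and $R_{2}$ suggest, that the feasible DFT satisfies $\widehat{J}_{h,n}(\boldsymbol{\omega})=J_{h,n}(\boldsymbol{\omega})-(\widehat{\lambda}_{h,n}-\lambda)c_{h,n}(\boldsymbol{\omega})$ and that $\widehat{I}_{h,n}(\boldsymbol{\omega})=\widehat{J}_{h,n}(\boldsymbol{\omega})\widehat{J}_{h,n}(-\boldsymbol{\omega})$ ; these definitions belong to the main text and are not restated in this appendix.

```latex
\begin{aligned}
\widehat{I}_{h,n}(\boldsymbol{\omega})
&=\left(J_{h,n}(\boldsymbol{\omega})-(\widehat{\lambda}_{h,n}-\lambda)c_{h,n}(\boldsymbol{\omega})\right)\left(J_{h,n}(-\boldsymbol{\omega})-(\widehat{\lambda}_{h,n}-\lambda)c_{h,n}(-\boldsymbol{\omega})\right)\\
&=I_{h,n}(\boldsymbol{\omega})-(\widehat{\lambda}_{h,n}-\lambda)\left(c_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega})+c_{h,n}(-\boldsymbol{\omega})J_{h,n}(\boldsymbol{\omega})\right)+(\widehat{\lambda}_{h,n}-\lambda)^{2}c_{h,n}(\boldsymbol{\omega})c_{h,n}(-\boldsymbol{\omega})\\
&=I_{h,n}(\boldsymbol{\omega})+R_{1}(\boldsymbol{\omega})+R_{2}(\boldsymbol{\omega}),
\end{aligned}
```

where the last line uses $c_{h,n}(\boldsymbol{\omega})c_{h,n}(-\boldsymbol{\omega})=|c_{h,n}(\boldsymbol{\omega})|^{2}$ , which holds when $c_{h,n}(-\boldsymbol{\omega})$ is the complex conjugate of $c_{h,n}(\boldsymbol{\omega})$ .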

Thanks to the above theorem, it is enough to prove Theorem 4.1 with $\widetilde{A}_{h,n}(\phi)$ in place of $\widehat{A}_{h,n}(\phi)$ in the statement.

A.2 Proof of the asymptotic bias

By Theorem D.1 below, the expectation of the (theoretical) periodogram can be expressed as

\mathbb{E}[I_{h,n}(\boldsymbol{\omega})]=\int_{\mathbb{R}^{d}}f(\boldsymbol{x})F_{h,n}(\boldsymbol{\omega}-\boldsymbol{x})d\boldsymbol{x},\quad n\in\mathbb{N},\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

where $F_{h,n}$ is the Fejér kernel defined in (C.15). Therefore, applying Lemma C.3(b) to the above expression, we have

\mathbb{E}[I_{h,n}(\boldsymbol{\omega})]-f(\boldsymbol{\omega})=\int_{\mathbb{R}^{d}}f(\boldsymbol{x})F_{h,n}(\boldsymbol{\omega}-\boldsymbol{x})d\boldsymbol{x}-f(\boldsymbol{\omega})=O(|D_{n}|^{-2/d}),\quad n\rightarrow\infty

(A.2)

uniformly in $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Therefore, the bias of $\widetilde{A}_{h,n}(\phi)$ is bounded by

|\mathbb{E}[\widetilde{A}_{h,n}(\phi)]-A(\phi)|\leq\int_{D}|\phi(\boldsymbol{\omega})||\mathbb{E}[I_{h,n}(\boldsymbol{\omega})]-f(\boldsymbol{\omega})|d\boldsymbol{\omega}\leq C|D_{n}|^{-2/d}\int_{D}|\phi(\boldsymbol{\omega})|d\boldsymbol{\omega}=O(|D_{n}|^{-2/d})

as $n\rightarrow\infty$ . Thus, combining the above with Theorem A.1, we show (i). $\Box$

A.3 Proof of the asymptotic variance

To derive the asymptotic variance, we first fix some notation. We view $\phi$ as a function on $\mathbb{R}^{d}$ by letting $\phi(\boldsymbol{\omega})=0$ when $\boldsymbol{\omega}\notin D$ . Since $\phi\in L^{1}(\mathbb{R}^{d})$ , let

\widehat{\phi}(\boldsymbol{\lambda})=\mathcal{F}(\phi)(\boldsymbol{\lambda})=\int_{\mathbb{R}^{d}}\phi(\boldsymbol{\omega})\exp(i\boldsymbol{\lambda}^{\top}\boldsymbol{\omega})d\boldsymbol{\omega},\quad\boldsymbol{\lambda}\in\mathbb{R}^{d}

(A.3)

be the Fourier transform of $\phi$ . Next, for a finite region $B_{M}=\prod_{i=1}^{d}[-M_{i},M_{i}]\subset\mathbb{R}^{d}$ , let

\phi_{M}(\boldsymbol{\lambda})=\mathcal{F}^{-1}(\widehat{\phi}(\boldsymbol{\omega})I_{B_{M}}(\boldsymbol{\omega}))(\boldsymbol{\lambda}),\quad\boldsymbol{\lambda}\in\mathbb{R}^{d},

(A.4)

where $I_{B_{M}}(\boldsymbol{x})=1$ if $\boldsymbol{x}\in B_{M}$ and zero otherwise. Therefore, the Fourier transform of $\phi_{M}(\boldsymbol{\lambda})$ , denoted $\widehat{\phi}_{M}(\boldsymbol{\omega})$ , is equal to $\widehat{\phi}(\boldsymbol{\omega})I_{B_{M}}(\boldsymbol{\omega})$ , which vanishes outside $B_{M}$ . By using similar arguments as in Matsuda and Yajima (2009), Lemmas 4 and 5, we have $\phi_{M}\rightarrow\phi$ as $B_{M}\rightarrow\mathbb{R}^{d}$ in the $L^{2}$ sense and, for large enough $B_{M}$ , $|D_{n}|\mathrm{var}(\widetilde{A}_{h,n}(\phi))$ is closely approximated by $|D_{n}|\mathrm{var}(\widetilde{A}_{h,n}(\phi_{M}))$ uniformly for all $n\in\mathbb{N}$ .

Now, we make an expansion of $|D_{n}|\mathrm{var}(\widetilde{A}_{h,n}(\phi_{M}))$ . By using that $I_{h,n}(\boldsymbol{\omega})=|J_{h,n}(\boldsymbol{\omega})|^{2}=J_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega})$ , for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ , we have

$\displaystyle\mathrm{cov}(I_{h,n}(\boldsymbol{\omega}_{1}),I_{h,n}(\boldsymbol{\omega}_{2}))$	$\displaystyle=\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1})J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2})J_{h,n}(-\boldsymbol{\omega}_{2}))$	(A.5)
	$\displaystyle=\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))$
	$\displaystyle+\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$
	$\displaystyle+\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}),J_{h,n}(-\boldsymbol{\omega}_{2})).$

Therefore, we have $|D_{n}|\mathrm{var}(\widetilde{A}_{h,n}(\phi_{M}))=A_{1}+A_{2}+A_{3}$ , where

$\displaystyle A_{1}$	$\displaystyle=$	$\displaystyle|D_{n}|\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\omega}_{1})\phi_{M}(\boldsymbol{\omega}_{2})\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},$
$\displaystyle A_{2}$	$\displaystyle=$	$\displaystyle|D_{n}|\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\omega}_{1})\phi_{M}(\boldsymbol{\omega}_{2})\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},$
$\displaystyle\text{and}\quad A_{3}$	$\displaystyle=$	$\displaystyle|D_{n}|\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\omega}_{1})\phi_{M}(\boldsymbol{\omega}_{2})\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}),J_{h,n}(-\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$

By using Theorem D.3 below, we have

\lim_{n\rightarrow\infty}A_{1}=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}f(\boldsymbol{\omega})^{2}\phi_{M}(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}.

(A.6)

Therefore, for sufficiently large $B_{M}$ , $\lim_{n\rightarrow\infty}A_{1}$ is arbitrarily close to $(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{D}f^{2}\phi^{2}$ .

Similarly, the limit of $A_{2}$ is

	$\displaystyle\lim_{n\rightarrow\infty}A_{2}$	$\displaystyle=$	$\displaystyle(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}\phi_{M}(\boldsymbol{\omega})\phi_{M}(-\boldsymbol{\omega})f(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}$		(A.7)
		$\displaystyle\approx$	$\displaystyle(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{D}\phi(\boldsymbol{\omega})\phi(-\boldsymbol{\omega})f(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}.$		(A.7)

Lastly, by using Theorem D.4 below, we have

	$\displaystyle\lim_{n\rightarrow\infty}A_{3}$	$\displaystyle=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})f_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}$		(A.8)
		$\displaystyle\approx(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{D^{2}}\phi(\boldsymbol{\lambda}_{1})\phi(\boldsymbol{\lambda}_{3})f_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}.$		(A.8)

Combining (A.6)–(A.8), we have

\lim_{n\rightarrow\infty}|D_{n}|\mathrm{var}(\widetilde{A}_{h,n}(\phi))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2}),

where $\Omega_{1}$ and $\Omega_{2}$ are defined as in (4.5). Thus, combining the above, we show (ii). $\Box$

A.4 Proof of the asymptotic normality

Because of Theorem A.1, it is enough to show the asymptotic normality of the feasible $G_{h,n}(\phi)=|D_{n}|^{1/2}(\widetilde{A}_{h,n}(\phi)-A(\phi))$ . Let $d\in\{1,2,3\}$ and let

\widetilde{G}_{h,n}(\phi)=|D_{n}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left\{I_{h,n}(\boldsymbol{\omega})-\mathbb{E}[I_{h,n}(\boldsymbol{\omega})]\right\}d\boldsymbol{\omega}=|D_{n}|^{1/2}\left(\widetilde{A}_{h,n}(\phi)-\mathbb{E}[\widetilde{A}_{h,n}(\phi)]\right).

(A.9)

Then, $G_{h,n}(\phi)-\widetilde{G}_{h,n}(\phi)$ is nonstochastic and is bounded by $O(|D_{n}|^{1/2-(2/d)})=o(1)$ as $n\rightarrow\infty$ due to Theorem 4.1(i), where the exponent $1/2-2/d$ is negative because $d\leq 3$ . Therefore, $G_{h,n}(\phi)$ and $\widetilde{G}_{h,n}(\phi)$ are asymptotically equivalent.

Now, we focus on the asymptotic distribution of $\widetilde{G}_{h,n}(\phi)$ . Since $I_{h,n}(\cdot)$ cannot be written as a sum of the periodograms of the sub-blocks, one cannot directly apply the standard central limit theorem techniques that are reviewed in Biscio and Waagepetersen (2019), Section 1. Instead, we will “linearize” the periodogram and show that the associated linear term dominates.

Without loss of generality, we assume that there exists $C\in(1,\infty)$ such that $C^{-1}n^{d}\leq|D_{n}|\leq Cn^{d}$ , $n\in\mathbb{N}$ . Therefore, the side lengths $A_{1},\cdots,A_{d}$ grow proportionally to $n$ . Next, let $\beta,\gamma\in(0,1)$ be chosen such that $2d/\varepsilon<\beta<\gamma<1$ , where $\varepsilon>2d$ is from Assumption 3.3(ii). Let

A_{n}=\{\boldsymbol{k}:\boldsymbol{k}\in n^{\gamma}\mathbb{Z}^{d}~{}~{}\text{and}~{}~{}D_{n}^{(\boldsymbol{k})}=\boldsymbol{k}+[-(n^{\gamma}-n^{\beta})/2,(n^{\gamma}-n^{\beta})/2]^{d}\subset D_{n}\},~{}~{}n\in\mathbb{N}.

Therefore, $\bigcup_{\boldsymbol{k}\in A_{n}}D_{n}^{(\boldsymbol{k})}$ is a disjoint union that is included in $D_{n}$ . Let $J_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})=\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})-\mathbb{E}[\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})]$ , where

\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}^{(\boldsymbol{k})}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}^{(\boldsymbol{k})}}h(\boldsymbol{x}/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{k}\in A_{n},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

Therefore, $\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\cdot)$ is the DFT evaluated within the sub-block $D_{n}^{(\boldsymbol{k})}$ of $D_{n}$ . Let

\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)=|D_{n}^{(\boldsymbol{k})}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(|J_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})|^{2}-\mathbb{E}[|J_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})|^{2}]\right)d\boldsymbol{\omega},\quad n\in\mathbb{N},\quad\boldsymbol{k}\in A_{n}.

Let $k_{n}=|A_{n}|$ and let

V_{h,n}(\phi)=k_{n}^{-1/2}\sum_{\boldsymbol{k}\in A_{n}}\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi),\quad n\in\mathbb{N}.

(A.10)

In Theorem F.3 below, we show that $\widetilde{G}_{h,n}(\phi)$ and $V_{h,n}(\phi)$ are asymptotically equivalent. An advantage of using $V_{h,n}(\phi)$ over $\widetilde{G}_{h,n}(\phi)$ is that $V_{h,n}(\phi)$ is written as a sum of $\{\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)\}_{\boldsymbol{k}\in A_{n}}$ , which are statistics computed on the non-overlapping sub-blocks of $D_{n}$ . Therefore, one can show the $\alpha$ -mixing CLT for $V_{h,n}(\phi)$ using the standard independent-block and telescoping sum techniques (cf. Guan and Sherman (2007)). Details can be found in Appendix F.2 (Theorem F.5). Combining this with Theorem 4.1(i) and (ii), we get the desired results. $\Box$
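The blocking scheme behind $A_{n}$ and $D_{n}^{(\boldsymbol{k})}$ can be sketched in a few lines. This is only an illustration under assumptions not taken from the paper: the window is the cube $D_{n}=[-n/2,n/2]^{d}$ , and the values of $n$ , $\gamma$ and $\beta$ are arbitrary. The helper `block_centers` is a hypothetical name.

```python
import itertools
import math

def block_centers(n, gamma, beta, d=2):
    """Centers k in n^gamma * Z^d whose blocks k + [-half, half]^d lie inside [-n/2, n/2]^d."""
    side = n ** gamma                    # lattice spacing n^gamma
    half = (side - n ** beta) / 2.0      # half-width (n^gamma - n^beta) / 2 of each block
    # |k_i| + half <= n/2 must hold in every coordinate.
    kmax = int(math.floor((n / 2.0 - half) / side))
    axis = [j * side for j in range(-kmax, kmax + 1)]
    return list(itertools.product(axis, repeat=d)), half, side

n, gamma, beta = 1000, 0.8, 0.6
centers, half, side = block_centers(n, gamma, beta)
gap = side - 2.0 * half                  # corridor between neighbouring blocks, equal to n^beta
print(len(centers), round(gap, 2))
```

Neighbouring blocks are separated by a corridor of width $n^{\beta}$ , which is what allows the $\alpha$ -mixing condition to treat the block statistics as nearly independent.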

Appendix B Additional proofs of the main results

B.1 Proof of Theorem 3.1

Below we show that the feasible DFT $\widehat{J}_{h,n}(\boldsymbol{\omega})$ is asymptotically equivalent to its theoretical counterpart $J_{h,n}(\boldsymbol{\omega})$ as in (2.11).

Theorem B.1.

Let $X$ be a second-order stationary point process on $\mathbb{R}^{d}$ and let $h$ be the data taper such that $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}h(\boldsymbol{\omega})<\infty$ . Suppose that Assumptions 3.1 and 3.2 (for $\ell=2$ ) hold. Let $\{\boldsymbol{\omega}_{n}\}$ be a sequence in $\mathbb{R}^{d}$ that is asymptotically distant from $\{\textbf{0}\}$ . Then,

\widehat{J}_{h,n}(\boldsymbol{\omega}_{n})-J_{h,n}(\boldsymbol{\omega}_{n})\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}0,\quad n\rightarrow\infty.

Proof. By definition, $\widehat{J}_{h,n}(\boldsymbol{\omega}_{n})-J_{h,n}(\boldsymbol{\omega}_{n})=-(\widehat{\lambda}_{h,n}-\lambda)c_{h,n}(\boldsymbol{\omega}_{n})$ . By using Lemma E.1(b) below, we have $\mathbb{E}[|\widehat{\lambda}_{h,n}-\lambda|^{2}]=\mathrm{var}(\widehat{\lambda}_{h,n})\leq C|D_{n}|^{-1}$ for some $C\in(0,\infty)$ . Therefore, $\mathbb{E}[|\widehat{J}_{h,n}(\boldsymbol{\omega}_{n})-J_{h,n}(\boldsymbol{\omega}_{n})|^{2}]\leq C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega}_{n})|^{2}$ . Next, by using (2.10) and Lemma C.2 below, we have
$\lim_{n\rightarrow\infty}|D_{n}|^{-1/2}|c_{h,n}(\boldsymbol{\omega}_{n})|=0$ . Thus, we get the desired result. $\Box$

Now we are ready to prove Theorem 3.1. Thanks to the above theorem, it is enough to prove the theorem for $J_{h,n}(\boldsymbol{\omega}_{n})$ replacing $\widehat{J}_{h,n}(\boldsymbol{\omega}_{n})$ in the statement. First, we will show (3.5). Recall $H_{h,k}^{(n)}$ in (2.7). By using Theorem D.2 below, the leading term of $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1,n}),J_{h,n}(\boldsymbol{\omega}_{2,n}))$ is $|D_{n}|^{-1}H_{h,2}^{-1}f(\boldsymbol{\omega}_{1,n})H_{h,2}^{(n)}(\boldsymbol{\omega}_{1,n}-\boldsymbol{\omega}_{2,n})$ . Therefore, by using Lemma C.2 below, we have

	$\displaystyle\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1,n}),J_{h,n}(\boldsymbol{\omega}_{2,n}))$	$\displaystyle=$	$\displaystyle|D_{n}|^{-1}H_{h,2}^{-1}f(\boldsymbol{\omega}_{1,n})H_{h,2}^{(n)}(\boldsymbol{\omega}_{1,n}-\boldsymbol{\omega}_{2,n})+o(1)$
		$\displaystyle\leq$	$\displaystyle H_{h,2}^{-1}\big{(}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}f(\boldsymbol{\omega})\big{)}o(1)+o(1),\quad n\rightarrow\infty.$

Since Assumption 3.2 (for $\ell=2$ ) implies $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}f(\boldsymbol{\omega})<\infty$ , by taking a limit on each side above, we show (3.5).

Next, we will show (3.6). Using Theorem D.2 again together with $H_{h,2}^{(n)}(\textbf{0})=|D_{n}|H_{h,2}$ , we have

\displaystyle\mathrm{var}(J_{h,n}(\boldsymbol{\omega}_{n}))=\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{n}),J_{h,n}(\boldsymbol{\omega}_{n}))=f(\boldsymbol{\omega}_{n})+o(1),\quad n\rightarrow\infty,

(B.1)

where the $o(1)$ error above is uniform in $\boldsymbol{\omega}_{n}\in\mathbb{R}^{d}$ . Since $f$ is continuous under Assumption 3.2 for $\ell=2$ , the right hand side above converges to $f(\boldsymbol{\omega})$ as $n\rightarrow\infty$ . Thus, we show (3.6).

Lastly, to show (3.7), by using an expansion (A.5) together with (3.5) and (3.6), we have

$\displaystyle\mathrm{cov}(I_{h,n}(\boldsymbol{\omega}_{1,n}),I_{h,n}(\boldsymbol{\omega}_{2,n}))$	$\displaystyle=$	$\displaystyle|\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1,n}),J_{h,n}(\boldsymbol{\omega}_{2,n}))|^{2}+|\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1,n}),J_{h,n}(-\boldsymbol{\omega}_{2,n}))|^{2}$
		$\displaystyle~{}~{}+\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1,n}),J_{h,n}(-\boldsymbol{\omega}_{1,n}),J_{h,n}(\boldsymbol{\omega}_{2,n}),J_{h,n}(-\boldsymbol{\omega}_{2,n}))$
	$\displaystyle=$	$\displaystyle f(\boldsymbol{\omega})^{2}I(\boldsymbol{\omega}_{1,n}=\boldsymbol{\omega}_{2,n})+o(1)$
		$\displaystyle~{}+\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1,n}),J_{h,n}(-\boldsymbol{\omega}_{1,n}),J_{h,n}(\boldsymbol{\omega}_{2,n}),J_{h,n}(-\boldsymbol{\omega}_{2,n}))$

as $n\rightarrow\infty$ . The last term above is $O(|D_{n}|^{-1})$ as $n\rightarrow\infty$ due to Lemma D.5 below. Therefore, we have $\lim_{n\rightarrow\infty}\mathrm{cov}(I_{h,n}(\boldsymbol{\omega}_{1,n}),I_{h,n}(\boldsymbol{\omega}_{2,n}))=0$ and $\lim_{n\rightarrow\infty}\mathrm{var}(I_{h,n}(\boldsymbol{\omega}_{n}))=f(\boldsymbol{\omega})^{2}$ . This proves (3.7). All together, we prove the theorem. $\Box$

B.2 Proof of Theorem 3.2

Let $\Re J_{h,n}(\boldsymbol{\omega})$ and $\Im J_{h,n}(\boldsymbol{\omega})$ be the real and imaginary parts of $J_{h,n}(\boldsymbol{\omega})$ , respectively. Then, by using Theorem B.1 above, it is enough to show

\left(\frac{\Re J_{h,n}(\boldsymbol{\omega}_{1,n})}{(f(\boldsymbol{\omega}_{1})/2)^{1/2}},\frac{\Im J_{h,n}(\boldsymbol{\omega}_{1,n})}{(f(\boldsymbol{\omega}_{1})/2)^{1/2}},\dots,\frac{\Im J_{h,n}(\boldsymbol{\omega}_{r,n})}{(f(\boldsymbol{\omega}_{r})/2)^{1/2}}\right)^{\top}\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{SN}_{2r},\quad n\rightarrow\infty,

(B.2)

where $\mathcal{SN}_{2r}$ is the $2r$ -dimensional standard normal random variable.

To show (B.2), we will first show that the asymptotic variance of the left hand side of (B.2) is an identity matrix. Note that

\Re J_{h,n}(\boldsymbol{\omega})=\frac{1}{2}(J_{h,n}(\boldsymbol{\omega})+J_{h,n}(-\boldsymbol{\omega}))\quad\text{and}\quad\Im J_{h,n}(\boldsymbol{\omega})=\frac{1}{2i}(J_{h,n}(\boldsymbol{\omega})-J_{h,n}(-\boldsymbol{\omega})),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.
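The two identities above rest on $J_{h,n}(-\boldsymbol{\omega})$ being the complex conjugate of $J_{h,n}(\boldsymbol{\omega})$ when the taper is real-valued. A small numerical check follows, under assumptions not taken from the paper: $d=1$ , a cosine taper, uniformly scattered points, and an unnormalized, uncentered DFT (normalization and centering do not affect the identities).

```python
import cmath
import math
import random

def taper(u):
    """Cosine taper supported on [-1/2, 1/2]; an illustrative choice only."""
    return math.cos(math.pi * u) if abs(u) <= 0.5 else 0.0

def dft(points, omega, A):
    """Unnormalized, uncentered tapered DFT: sum over x of h(x / A) exp(-i x omega)."""
    return sum(taper(x / A) * cmath.exp(-1j * x * omega) for x in points)

random.seed(0)
A = 20.0
points = [random.uniform(-A / 2, A / 2) for _ in range(50)]
omega = 1.3

J_pos = dft(points, omega, A)
J_neg = dft(points, -omega, A)
re_part = (J_pos + J_neg) / 2      # should equal the real part of J(omega)
im_part = (J_pos - J_neg) / (2j)   # should equal the imaginary part of J(omega)
print(abs(re_part - J_pos.real), abs(im_part - J_pos.imag))
```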

Therefore, for $i,j\in\{1,\dots,r\}$ , we have

	$\displaystyle\mathrm{cov}(\Re J_{h,n}(\boldsymbol{\omega}_{i,n}),\Re J_{h,n}(\boldsymbol{\omega}_{j,n}))$	$\displaystyle=$	$\displaystyle\frac{1}{4}\big{(}\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{i,n}),J_{h,n}(\boldsymbol{\omega}_{j,n}))+\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{i,n}),J_{h,n}(-\boldsymbol{\omega}_{j,n}))$
			$\displaystyle~{}~{}+\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{i,n}),J_{h,n}(\boldsymbol{\omega}_{j,n}))+\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{i,n}),J_{h,n}(-\boldsymbol{\omega}_{j,n}))\big{)}.$

Therefore, by using Theorem 3.1, one can show

\lim_{n\rightarrow\infty}\mathrm{cov}(\Re J_{h,n}(\boldsymbol{\omega}_{i,n}),\Re J_{h,n}(\boldsymbol{\omega}_{j,n}))=\frac{1}{2}f(\boldsymbol{\omega}_{i})I(i=j).

Similarly, for $i,j\in\{1,\dots,r\}$ ,

$\displaystyle\lim_{n\rightarrow\infty}\mathrm{cov}(\Re J_{h,n}(\boldsymbol{\omega}_{i,n}),\Im J_{h,n}(\boldsymbol{\omega}_{j,n}))$	$\displaystyle=$	$\displaystyle 0,$
$\displaystyle\lim_{n\rightarrow\infty}\mathrm{cov}(\Im J_{h,n}(\boldsymbol{\omega}_{i,n}),\Re J_{h,n}(\boldsymbol{\omega}_{j,n}))$	$\displaystyle=$	$\displaystyle 0,$
$\displaystyle\text{and}\quad\lim_{n\rightarrow\infty}\mathrm{cov}(\Im J_{h,n}(\boldsymbol{\omega}_{i,n}),\Im J_{h,n}(\boldsymbol{\omega}_{j,n}))$	$\displaystyle=$	$\displaystyle\frac{1}{2}f(\boldsymbol{\omega}_{i})I(i=j).$

All together, we show that the limiting variance of the left hand side of (B.2) is an identity matrix.

Next, let $\{a_{j}\}_{j=1}^{r},\{b_{j}\}_{j=1}^{r}\in\mathbb{R}$ . We define $\mathcal{Z}_{n}=|D_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}}g_{n}(\boldsymbol{x})$ , where

g_{n}(\boldsymbol{x})=\sum_{j=1}^{r}a_{j}h(\boldsymbol{x}/\boldsymbol{A})\frac{1}{2}(e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}_{j,n}}+e^{i\boldsymbol{x}^{\top}\boldsymbol{\omega}_{j,n}})+\sum_{j=1}^{r}b_{j}h(\boldsymbol{x}/\boldsymbol{A})\frac{1}{2i}(e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}_{j,n}}-e^{i\boldsymbol{x}^{\top}\boldsymbol{\omega}_{j,n}}).

(B.3)

Then, it is easily seen that any linear combination of the entries on the left hand side of (B.2) can be expressed as $\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}]$ for an appropriate $g_{n}$ . Therefore, by the Cramér-Wold device, to show the asymptotic normality in (B.2), it is enough to show that $\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}]$ is asymptotically normal. A CLT for $\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}]$ under the $\alpha$ -mixing condition follows from standard techniques (cf. Guan and Sherman (2007)). What remains to be verified is that there exists $\delta>0$ such that

\sup_{n\in\mathbb{N}}\mathbb{E}\left|\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}]\right|^{2+\delta}<\infty.

(B.4)

We will show the above holds for $\delta=2$ , provided Assumption 3.2(i) for $\ell=4$ . We note that

\mathbb{E}\left|\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}]\right|^{4}=\mathbb{E}(\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}])^{4}=\kappa_{4}(\mathcal{Z}_{n})+3\kappa_{2}(\mathcal{Z}_{n})^{2},

(B.5)

where $\kappa_{2}(X)=\mathrm{cum}(X,X)$ and $\kappa_{4}(X)=\mathrm{cum}(X,X,X,X)$ . Therefore, since $g_{n}(\boldsymbol{x})$ in (B.3) is bounded above uniformly in $n\in\mathbb{N}$ and $\boldsymbol{x}\in\mathbb{R}^{d}$ , we can apply Lemma D.5 and get

\sup_{n\in\mathbb{N}}\mathbb{E}\left|\mathcal{Z}_{n}-\mathbb{E}[\mathcal{Z}_{n}]\right|^{4}\leq\sup_{n\in\mathbb{N}}|\kappa_{4}(\mathcal{Z}_{n})|+3\sup_{n\in\mathbb{N}}\kappa_{2}(\mathcal{Z}_{n})^{2}=O(1).

Thus, we show (B.4) for $\delta=2$ . The remaining parts for proving CLT are routine (which may be obtained from the authors upon request). $\Box$
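The moment identity (B.5), $\mathbb{E}(Z-\mathbb{E}[Z])^{4}=\kappa_{4}(Z)+3\kappa_{2}(Z)^{2}$ , can be checked on a simple case. The sketch below uses an illustrative assumption that is not from the paper: $Z$ is a Poisson count (for example, the number of points of a homogeneous Poisson process in a window), for which every cumulant equals the mean. The function name is hypothetical.

```python
import math

def poisson_central_moment(mean, order, tol=1e-20):
    """Central moment of a Poisson(mean) variable via a truncated sum over the pmf."""
    total, k = 0.0, 0
    pmf = math.exp(-mean)            # P(Z = 0)
    while k <= mean or pmf > tol:    # run past the mode until the pmf is negligible
        total += (k - mean) ** order * pmf
        k += 1
        pmf *= mean / k              # recursion P(Z = k) = P(Z = k - 1) * mean / k
    return total

mean = 7.5
kappa2 = mean   # second cumulant (variance) of a Poisson variable
kappa4 = mean   # fourth cumulant of a Poisson variable
mu4 = poisson_central_moment(mean, 4)
print(mu4, kappa4 + 3 * kappa2 ** 2)
```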

B.3 Proof of Theorem 3.3

Let

\widetilde{f}_{n,b}(\boldsymbol{\omega})=\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})I_{h,n}(\boldsymbol{x})d\boldsymbol{x},\quad n\in\mathbb{N},\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

(B.6)

be the theoretical counterpart of the kernel spectral density estimator. Then, in Corollary E.1 below, we show that $\widetilde{f}_{n,b}(\boldsymbol{\omega})$ and $\widehat{f}_{n,b}(\boldsymbol{\omega})$ are asymptotically equivalent. Therefore, it is enough to show that $\widetilde{f}_{n,b}(\boldsymbol{\omega})$ consistently estimates the spectral density $f(\boldsymbol{\omega})$ for all $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . By using (B.1) together with $\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})d\boldsymbol{x}=1$ , we have

\left|\mathbb{E}[\widetilde{f}_{n,b}(\boldsymbol{\omega})]-\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})f(\boldsymbol{x})d\boldsymbol{x}\right|\leq\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})o(1)d\boldsymbol{x}=o(1),\quad n\rightarrow\infty.

Moreover, by using the classical kernel method (cf. Ding et al. (2024), proof of Theorem 5.1), it can be easily seen that $|\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})f(\boldsymbol{x})d\boldsymbol{x}-f(\boldsymbol{\omega})|=o(1)$ as $b+|D_{n}|^{-1}b^{-d}\rightarrow 0$ . Therefore, by the triangle inequality, we have

\lim_{n\rightarrow\infty}\big{|}\mathbb{E}[\widetilde{f}_{n,b}(\boldsymbol{\omega})]-f(\boldsymbol{\omega})\big{|}=0.

(B.7)

Next, by using a similar argument as in Appendix A.3 (see also, the proof of Ding et al. (2024), Theorem 5.1), it can be seen that the variance of $\widetilde{f}_{n,b}(\boldsymbol{\omega})$ is bounded by

\mathrm{var}(\widetilde{f}_{n,b}(\boldsymbol{\omega}))\leq C|D_{n}|^{-1}\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})^{2}d\boldsymbol{x}=O(|D_{n}|^{-1}b^{-d})=o(1),\quad n\rightarrow\infty.

(B.8)

We mention that, unlike the case of Appendix A.3, we do not require Assumption 4.2 to prove (B.8). This is because in the expansion of $\mathrm{var}(\widetilde{f}_{n,b}(\boldsymbol{\omega}))$ using (A.5), the fourth-order cumulant term is bounded by $|D_{n}|^{-1}\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{x})^{2}d\boldsymbol{x}$ due to Lemma D.5 below. Therefore, combining (B.7) and (B.8), we have $\widetilde{f}_{n,b}(\boldsymbol{\omega})\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}f(\boldsymbol{\omega})$ as $n\rightarrow\infty$ . This proves the theorem. $\Box$

B.4 Proof of Theorem 4.2

By using Corollary E.1 below, $\sqrt{|D_{n}|b^{d}}(\widetilde{f}_{n,b}(\boldsymbol{\omega})-f(\boldsymbol{\omega}))$ and $\sqrt{|D_{n}|b^{d}}(\widehat{f}_{n,b}(\boldsymbol{\omega})-f(\boldsymbol{\omega}))$ share the same asymptotic distribution. Therefore, it is enough to show the asymptotic normality of $\sqrt{|D_{n}|b^{d}}(\widetilde{f}_{n,b}(\boldsymbol{\omega})-f(\boldsymbol{\omega}))$ . Since the scaled kernel function $W_{b}$ is supported on $[-b/2,b/2]^{d}$ , $\widetilde{f}_{n,b}(\boldsymbol{\omega})$ can be written as a (theoretical) integrated periodogram by setting $\phi_{b}(\boldsymbol{x})=W_{b}(\boldsymbol{\omega}-\boldsymbol{x})$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ . Therefore, the proof of the asymptotic normality is almost identical to the proof of Theorem 4.1, and we will only sketch it.

The bias. By using Lemma C.3(b), we have

\bigg{|}\mathbb{E}[\widetilde{f}_{n,b}(\boldsymbol{\omega})]-\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})f(\boldsymbol{x})d\boldsymbol{x}\bigg{|}=\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})O(|D_{n}|^{-2/d})d\boldsymbol{x}=O(|D_{n}|^{-2/d}),\quad n\rightarrow\infty.

Moreover, by using classical nonparametric kernel estimation results (cf. Ding et al. (2024), Theorem 5.2), we have $|\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})f(\boldsymbol{x})d\boldsymbol{x}-f(\boldsymbol{\omega})|=O(b^{2})$ , $n\rightarrow\infty$ . Therefore,
$\lim_{n\rightarrow\infty}\sqrt{|D_{n}|b^{d}}\big{|}\mathbb{E}[\widetilde{f}_{n,b}(\boldsymbol{\omega})]-f(\boldsymbol{\omega})\big{|}=0$ , provided $\lim_{n\rightarrow\infty}|D_{n}|^{1/2}b^{d/2}\{|D_{n}|^{-2/d}+b^{2}\}=0$ .
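To make the bandwidth condition concrete, suppose (as an assumption for illustration only) that the bandwidth is a power of the window volume, $b=|D_{n}|^{-a}$ with $a>0$ . Then $|D_{n}|^{1/2}b^{d/2}|D_{n}|^{-2/d}\rightarrow 0$ requires $a>(d-4)/d^{2}$ , $|D_{n}|^{1/2}b^{d/2}b^{2}\rightarrow 0$ requires $a>1/(d+4)$ , and the variance condition $|D_{n}|b^{d}\rightarrow\infty$ used in (B.8) requires $a<1/d$ . The hypothetical helper below returns the resulting open interval of admissible exponents.

```python
from fractions import Fraction

def admissible_exponents(d):
    """Open interval (lo, hi) of exponents a for which b = |D_n|^{-a} meets the rate conditions."""
    d = Fraction(d)
    lo_taper = (d - 4) / d ** 2   # from |D_n|^{1/2} b^{d/2} |D_n|^{-2/d} -> 0
    lo_kernel = 1 / (d + 4)       # from |D_n|^{1/2} b^{d/2} b^2 -> 0
    hi = 1 / d                    # from |D_n| b^d -> infinity
    return max(lo_taper, lo_kernel), hi

for dim in (1, 2, 3):
    print(dim, admissible_exponents(dim))
```

For $d\in\{1,2,3\}$ the first lower bound is negative, so within this power-law family only the kernel bias term $b^{2}$ restricts the exponent from below.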

The variance. By an argument similar to the proof of Theorem 4.1(ii), one can get

|D_{n}|b^{d}\mathrm{var}(\widetilde{f}_{n,b}(\boldsymbol{\omega}))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})b^{d}(\Omega_{b,1}+\Omega_{b,2})+o(1),\quad n\rightarrow\infty,

where

	$\displaystyle\Omega_{b,1}$	$\displaystyle=$	$\displaystyle\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})\{W_{b}(\boldsymbol{\omega}-\boldsymbol{x})+W_{b}(\boldsymbol{\omega}+\boldsymbol{x})\}f(\boldsymbol{x})^{2}d\boldsymbol{x}$
	$\displaystyle\text{and}\quad\Omega_{b,2}$	$\displaystyle=$	$\displaystyle\int_{\mathbb{R}^{2d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{1})W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{2})f_{4}(\boldsymbol{x}_{1},-\boldsymbol{x}_{1},\boldsymbol{x}_{2})d\boldsymbol{x}_{1}d\boldsymbol{x}_{2}.$

Now, we calculate the limit of $\Omega_{b,i}$ for $i\in\{1,2\}$ . We note that $b^{d}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})^{2}=b^{-d}(W^{2})(b^{-1}(\boldsymbol{\omega}-\boldsymbol{x}))$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ . Therefore, by treating $W^{2}$ as a new kernel function in the convolution equation and using that $\int_{\mathbb{R}^{d}}b^{-d}(W^{2})(b^{-1}\boldsymbol{x})d\boldsymbol{x}=\int_{\mathbb{R}^{d}}W(\boldsymbol{x})^{2}d\boldsymbol{x}=W_{2}$ , it is easily seen that

\lim_{n\rightarrow\infty}b^{d}\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})^{2}f(\boldsymbol{x})^{2}d\boldsymbol{x}=W_{2}f(\boldsymbol{\omega})^{2}.

Similarly, one can show that

\lim_{n\rightarrow\infty}b^{d}\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x})W_{b}(\boldsymbol{\omega}+\boldsymbol{x})f(\boldsymbol{x})^{2}d\boldsymbol{x}=\begin{cases}W_{2}f(\boldsymbol{\omega})^{2},&\boldsymbol{\omega}=\textbf{0}.\\ 0,&\boldsymbol{\omega}\in\mathbb{R}^{d}\backslash\{\textbf{0}\}.\end{cases}
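Both limits can be checked numerically. The sketch below makes several assumptions that are not from the paper: $d=1$ , a triangular kernel $W(u)=2(1-2|u|)$ on $[-1/2,1/2]$ (so that $W_{2}=4/3$ ), the scaling $W_{b}(x)=b^{-1}W(x/b)$ , and a stand-in smooth function $f(x)=1+e^{-x^{2}}$ in place of the spectral density.

```python
import math

def W(u):
    """Triangular kernel on [-1/2, 1/2] with unit integral (illustrative choice)."""
    return 2.0 * (1.0 - 2.0 * abs(u)) if abs(u) <= 0.5 else 0.0

def W_b(x, b):
    """Scaled kernel W_b(x) = W(x / b) / b, supported on [-b/2, b/2]."""
    return W(x / b) / b

def f(x):
    """Stand-in for a smooth spectral density (assumption)."""
    return 1.0 + math.exp(-x * x)

def scaled_integral(omega, b, mirrored, m=4000):
    """b * int W_b(omega - x) W_b(omega -/+ x) f(x)^2 dx by the midpoint rule."""
    step = b / m
    total = 0.0
    for j in range(m):
        x = omega - b / 2 + (j + 0.5) * step   # support of the first kernel factor
        second = W_b(omega + x, b) if mirrored else W_b(omega - x, b)
        total += W_b(omega - x, b) * second * f(x) ** 2 * step
    return b * total

W2 = 4.0 / 3.0   # integral of W^2 for the triangular kernel
b = 0.01
same = scaled_integral(0.5, b, mirrored=False)        # close to W2 * f(0.5)^2
cross_away = scaled_integral(0.5, b, mirrored=True)   # supports are disjoint: zero
cross_zero = scaled_integral(0.0, b, mirrored=True)   # close to W2 * f(0)^2
print(same, cross_away, cross_zero)
```

The cross term vanishes at $\boldsymbol{\omega}\neq\textbf{0}$ as soon as $b<2|\boldsymbol{\omega}|$ , because $W_{b}(\boldsymbol{\omega}-\cdot)$ and $W_{b}(\boldsymbol{\omega}+\cdot)$ then have disjoint supports.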

To calculate the limit of $\Omega_{b,2}$ , we note that

	$\displaystyle b^{d}\int_{\mathbb{R}^{2d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{1})W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{2})f_{4}(\boldsymbol{x}_{1},-\boldsymbol{x}_{1},\boldsymbol{x}_{2})d\boldsymbol{x}_{1}d\boldsymbol{x}_{2}$
	$\displaystyle=b^{d}\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{2})\left(\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{1})f_{4}(\boldsymbol{x}_{1},-\boldsymbol{x}_{1},\boldsymbol{x}_{2})d\boldsymbol{x}_{1}\right)d\boldsymbol{x}_{2}$
	$\displaystyle=b^{d}\int_{\mathbb{R}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{x}_{2})\left(f_{4}(\boldsymbol{\omega},-\boldsymbol{\omega},\boldsymbol{x}_{2})+o(1)\right)d\boldsymbol{x}_{2}=O(b^{d}),\quad n\rightarrow\infty.$

Therefore, $\lim_{n\rightarrow\infty}b^{d}\Omega_{b,2}=0$ . All together, we have

\lim_{n\rightarrow\infty}|D_{n}|b^{d}\mathrm{var}(\widetilde{f}_{n,b}(\boldsymbol{\omega}))=\begin{cases}2(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})W_{2}f(\boldsymbol{\omega})^{2},&\boldsymbol{\omega}=\textbf{0}.\\ (2\pi)^{d}(H_{h,4}/H_{h,2}^{2})W_{2}f(\boldsymbol{\omega})^{2},&\boldsymbol{\omega}\in\mathbb{R}^{d}\backslash\{\textbf{0}\}.\end{cases}

The asymptotic normality. The proof of the asymptotic normality of $\sqrt{|D_{n}|b^{d}}(\widetilde{f}_{n,b}(\boldsymbol{\omega})-f(\boldsymbol{\omega}))$ is almost identical to the proof of the asymptotic normality of $\widetilde{G}_{h,n}(\phi)$ in (A.9) (we omit the details). All together, we prove the theorem. $\Box$

B.5 Proof of Theorem 6.1

First, we will show the consistency of $\widehat{\boldsymbol{\theta}}_{n}$ . Recall $\mathcal{L}(\boldsymbol{\theta})$ in (6.3). We will first show the uniform convergence of $L_{n}(\cdot)$ , that is,

\sup_{\boldsymbol{\theta}\in\Theta}|L_{n}(\boldsymbol{\theta})-\mathcal{L}(\boldsymbol{\theta})|\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}0.

(B.9)

By using Theorem 4.1(i) together with uniform boundedness of $f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega})$ , we have
$\sup_{\boldsymbol{\theta}\in\Theta}|\mathbb{E}[L_{n}(\boldsymbol{\theta})]-\mathcal{L}(\boldsymbol{\theta})|=o(1)$ , $n\rightarrow\infty$ . Next, since $\mathrm{var}(L_{n}(\boldsymbol{\theta}))=O(|D_{n}|^{-1})$ as $n\rightarrow\infty$ by the argument in Appendix A.3, we have $L_{n}(\boldsymbol{\theta})-\mathbb{E}[L_{n}(\boldsymbol{\theta})]\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}0$ for each $\boldsymbol{\theta}\in\Theta$ . Therefore, to show (B.9), it is enough to show that $\{L_{n}(\boldsymbol{\theta}):\boldsymbol{\theta}\in\Theta\}$ is stochastically equicontinuous (Newey (1991), Theorem 2.1).

Let $\delta>0$ and choose $\boldsymbol{\theta}_{1},\boldsymbol{\theta}_{2}\in\Theta$ such that $\|\boldsymbol{\theta}_{1}-\boldsymbol{\theta}_{2}\|\leq\delta$ . Then, since $f_{\boldsymbol{\theta}}^{-1}(\boldsymbol{\omega})$ and $\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega})$ have first-order derivatives with respect to $\boldsymbol{\theta}$ that are continuous on the compact domain $\Theta\times D$ , there exist $C_{1},C_{2}\in(0,\infty)$ such that

|f_{\boldsymbol{\theta}_{1}}^{-1}(\boldsymbol{\omega})-f_{\boldsymbol{\theta}_{2}}^{-1}(\boldsymbol{\omega})|\leq C_{1}\delta\quad\text{and}\quad|\log f_{\boldsymbol{\theta}_{1}}(\boldsymbol{\omega})-\log f_{\boldsymbol{\theta}_{2}}(\boldsymbol{\omega})|\leq C_{2}\delta,~{}~{}\boldsymbol{\omega}\in D.

Therefore, for arbitrary $\boldsymbol{\theta}_{1},\boldsymbol{\theta}_{2}\in\Theta$ with $\|\boldsymbol{\theta}_{1}-\boldsymbol{\theta}_{2}\|\leq\delta$

|L_{n}(\boldsymbol{\theta}_{1})-L_{n}(\boldsymbol{\theta}_{2})|\leq\delta\left(C_{1}\int_{D}\widehat{I}_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega}+C_{2}|D|\right),~{}~{}n\rightarrow\infty.

By Theorem 4.1(i) and (ii), the term in the bracket above is $O_{p}(1)$ as $n\rightarrow\infty$ uniformly over $\boldsymbol{\theta}_{1},\boldsymbol{\theta}_{2}\in\Theta$ . Therefore, $\{L_{n}(\boldsymbol{\theta}):\boldsymbol{\theta}\in\Theta\}$ is stochastically equicontinuous and, in turn, we show the uniform convergence (B.9).

Next, recall $\widehat{\boldsymbol{\theta}}_{n}$ and $\boldsymbol{\theta}_{0}$ are minimizers of $L_{n}$ and $\mathcal{L}$ , respectively. Then, by using (B.9),

	$\displaystyle 0\leq\mathcal{L}(\widehat{\boldsymbol{\theta}}_{n})-\mathcal{L}(\boldsymbol{\theta}_{0})$	$\displaystyle\leq$	$\displaystyle(\mathcal{L}(\widehat{\boldsymbol{\theta}}_{n})-L_{n}(\widehat{\boldsymbol{\theta}}_{n}))+(L_{n}(\widehat{\boldsymbol{\theta}}_{n})-L_{n}(\boldsymbol{\theta}_{0}))+(L_{n}(\boldsymbol{\theta}_{0})-\mathcal{L}(\boldsymbol{\theta}_{0}))$
		$\displaystyle\leq$	$\displaystyle 2\sup_{\boldsymbol{\theta}\in\Theta}|\mathcal{L}(\boldsymbol{\theta})-L_{n}(\boldsymbol{\theta})|.$

Therefore, we have $\mathcal{L}(\widehat{\boldsymbol{\theta}}_{n})-\mathcal{L}(\boldsymbol{\theta}_{0})\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}0$ and since, by assumption, $\boldsymbol{\theta}_{0}$ is the unique minimizer of $\mathcal{L}$ , we have $\widehat{\boldsymbol{\theta}}_{n}\stackrel{{\scriptstyle\mathcal{P}}}{{\rightarrow}}\boldsymbol{\theta}_{0}$ . This proves (6.5).
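The minimization that defines $\widehat{\boldsymbol{\theta}}_{n}$ can be illustrated on a toy problem. The sketch below rests on several assumptions that are not from the paper: the contrast is taken to have the Whittle-type form $L_{n}(\theta)=\int_{D}\{\widehat{I}_{h,n}(\omega)/f_{\theta}(\omega)+\log f_{\theta}(\omega)\}d\omega$ (consistent with the gradient used later in this proof), $d=1$ , $D=[-2,2]$ , the one-parameter scale model $f_{\theta}(\omega)=\theta/(1+\omega^{2})$ , and a stand-in periodogram generated as $f_{\theta}$ times independent unit exponential noise. All names are hypothetical.

```python
import math
import random

random.seed(1)
theta_true = 2.0
step = 0.01
omegas = [-2.0 + (j + 0.5) * step for j in range(400)]   # midpoint grid on D = [-2, 2]

def f_model(theta, w):
    """Toy one-parameter spectral density model (assumption)."""
    return theta / (1.0 + w * w)

# Stand-in periodogram: model density times independent Exp(1) noise (assumption).
periodogram = [f_model(theta_true, w) * random.expovariate(1.0) for w in omegas]

def whittle(theta):
    """Riemann-sum version of the Whittle-type contrast L_n(theta)."""
    return step * sum(I / f_model(theta, w) + math.log(f_model(theta, w))
                      for I, w in zip(periodogram, omegas))

grid = [0.5 + 0.005 * j for j in range(801)]   # candidate values of theta in [0.5, 4.5]
theta_hat = min(grid, key=whittle)

# For a pure scale model the minimizer has the closed form mean(I / f_1).
theta_closed = sum(I * (1.0 + w * w) for I, w in zip(periodogram, omegas)) / len(omegas)
print(theta_hat, theta_closed)
```

A grid search is used only to keep the sketch dependency-free; in practice a numerical optimizer would replace it.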

Next, we show the asymptotic normality of $\widehat{\boldsymbol{\theta}}_{n}$ . By using a Taylor expansion and using that $(\partial L_{n}/\partial\boldsymbol{\theta})(\widehat{\boldsymbol{\theta}}_{n})=0$ , there exists $\widetilde{\boldsymbol{\theta}}_{n}$ , a convex combination of $\widehat{\boldsymbol{\theta}}_{n}$ and $\boldsymbol{\theta}_{0}$ , such that

|D_{n}|^{1/2}(\widehat{\boldsymbol{\theta}}_{n}-\boldsymbol{\theta}_{0})=\left(\nabla^{2}L_{n}(\widetilde{\boldsymbol{\theta}}_{n})\right)^{-1}\left(-|D_{n}|^{1/2}\nabla L_{n}(\boldsymbol{\theta}_{0})\right)=P_{n}(\widetilde{\boldsymbol{\theta}}_{n})^{-1}Q_{n}(\boldsymbol{\theta}_{0}).

We first focus on $P_{n}(\widetilde{\boldsymbol{\theta}}_{n})$ . By simple algebra, we have

	$\displaystyle P_{n}(\boldsymbol{\theta})=\nabla^{2}L_{n}(\boldsymbol{\theta})$	$\displaystyle=$	$\displaystyle 2\int_{D}(\nabla\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega}))(\nabla\log f_{\boldsymbol{\theta}}(\boldsymbol{\omega}))^{\top}\frac{1}{f_{\boldsymbol{\theta}}(\boldsymbol{\omega})}\left(\widehat{I}_{h,n}(\boldsymbol{\omega})-f(\boldsymbol{\omega})\right)d\boldsymbol{\omega}$
			$\displaystyle~{}-\int_{D}\frac{1}{f_{\boldsymbol{\theta}}(\boldsymbol{\omega})^{2}}\nabla^{2}f_{\boldsymbol{\theta}}(\boldsymbol{\omega})\left(\widehat{I}_{h,n}(\boldsymbol{\omega})-f(\boldsymbol{\omega})\right)d\boldsymbol{\omega}+2(2\pi)^{d}\Gamma(\boldsymbol{\theta}),~{}~{}\boldsymbol{\theta}\in\Theta.$

By using the (uniform) continuity of $f_{\boldsymbol{\theta}}^{-1}$ , $\nabla\log f_{\boldsymbol{\theta}}$ , $\nabla^{2}f^{-1}_{\boldsymbol{\theta}}$ , and $\nabla^{2}f_{\boldsymbol{\theta}}$ , and since $\widetilde{\boldsymbol{\theta}}_{n}$ is also a consistent estimator of $\boldsymbol{\theta}_{0}$ , we have $P_{n}(\widetilde{\boldsymbol{\theta}}_{n})-P_{n}(\boldsymbol{\theta}_{0})=o_{p}(1)$ as $n\rightarrow\infty$ . Next, by using Theorem 4.1, the first two terms in $P_{n}(\boldsymbol{\theta}_{0})$ are $o_{p}(1)$ as $n\rightarrow\infty$ . Therefore, we have

P_{n}(\widetilde{\boldsymbol{\theta}}_{n})=2(2\pi)^{d}\Gamma(\boldsymbol{\theta}_{0})+o_{p}(1),\quad n\rightarrow\infty.

(B.10)

Next, we focus on $Q_{n}(\boldsymbol{\theta}_{0})$ . By simple algebra, we have

	$\displaystyle Q_{n}(\boldsymbol{\theta}_{0})$
	$\displaystyle=-|D_{n}|^{1/2}\int_{D}\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega})\left(\widehat{I}_{h,n}(\boldsymbol{\omega})-f_{\boldsymbol{\theta}_{0}}(\boldsymbol{\omega})\right)d\boldsymbol{\omega}$
	$\displaystyle=-|D_{n}|^{1/2}\int_{D}\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega})\left(f(\boldsymbol{\omega})-f_{\boldsymbol{\theta}_{0}}(\boldsymbol{\omega})\right)d\boldsymbol{\omega}-|D_{n}|^{1/2}\int_{D}\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega})\left[\widehat{I}_{h,n}(\boldsymbol{\omega})-f(\boldsymbol{\omega})\right]d\boldsymbol{\omega}.$

The first term above is zero since $\nabla\mathcal{L}(\boldsymbol{\theta}_{0})=0$ , and, by using Theorem 4.1 with the help of the Cramér-Wold device, the second term is asymptotically centered normal with variance
$(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}(\boldsymbol{\theta}_{0})+\Omega_{2}(\boldsymbol{\theta}_{0}))$ where

	$\displaystyle\Omega_{1}(\boldsymbol{\theta}_{0})$	$\displaystyle=$	$\displaystyle 2\int_{D}(\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega}))(\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega}))^{\top}f(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}=4(2\pi)^{d}S_{1}(\boldsymbol{\theta}_{0})\quad\text{and}$
	$\displaystyle\Omega_{2}(\boldsymbol{\theta}_{0})$	$\displaystyle=$	$\displaystyle\int_{D^{2}}(\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega}_{1}))(\nabla f_{\boldsymbol{\theta}_{0}}^{-1}(\boldsymbol{\omega}_{2}))^{\top}f_{4}(\boldsymbol{\omega}_{1},-\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}=4(2\pi)^{d}S_{2}(\boldsymbol{\theta}_{0}).$

Therefore, we conclude,

Q_{n}(\boldsymbol{\theta}_{0})\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(\textbf{0},4(2\pi)^{2d}(H_{h,4}/H_{h,2}^{2})(S_{1}(\boldsymbol{\theta}_{0})+S_{2}(\boldsymbol{\theta}_{0}))\right).

(B.11)

Combining (B.10) and (B.11) and using the continuous mapping theorem, we have

\begin{aligned}
|D_{n}|^{1/2}(\widehat{\boldsymbol{\theta}}_{n}-\boldsymbol{\theta}_{0})&=P_{n}(\widetilde{\boldsymbol{\theta}}_{n})^{-1}Q_{n}(\boldsymbol{\theta}_{0})\\
&\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(\textbf{0},(H_{h,4}/H_{h,2}^{2})\Gamma(\boldsymbol{\theta}_{0})^{-1}(S_{1}(\boldsymbol{\theta}_{0})+S_{2}(\boldsymbol{\theta}_{0}))\Gamma(\boldsymbol{\theta}_{0})^{-1}\right),\quad n\rightarrow\infty.
\end{aligned}

Thus, we show (6.6). All together, we get the desired results. $\Box$

Appendix C Representations and approximations of the Fourier transform of the data taper

Let $D_{n}$ have the form in (2.6). Recall the definition of $H_{h,k}^{(n)}(\boldsymbol{\omega})$ in (2.7):

H_{h,k}^{(n)}(\boldsymbol{\omega})=\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{k}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x},\quad k,n\in\mathbb{N},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

For two data taper functions $h,g$ with support $[-1/2,1/2]^{d}$ , we define

R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})=\int_{\mathbb{R}^{d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})\left(g(\frac{\boldsymbol{x}+\boldsymbol{t}}{\boldsymbol{A}})-g(\frac{\boldsymbol{x}}{\boldsymbol{A}})\right)\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x},\quad n\in\mathbb{N},~{}~{}\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}.

(C.1)

The terms $H_{h,k}^{(n)}(\boldsymbol{\omega})$ and $R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ appear frequently throughout the proofs of the main results. For example, they are related to the expression of $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$ . See Lemma D.1 below.

In this section, we investigate the representations and approximations of $H_{h,k}^{(n)}(\boldsymbol{\omega})$ and $R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . We begin with $R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . Since the support of $h(\cdot/\boldsymbol{A})$ is $D_{n}$ , $R_{h,g}^{(n)}$ can be written as $R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})=R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})+R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ , where

\begin{aligned}
R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})&=\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})\left(g(\frac{\boldsymbol{x}+\boldsymbol{t}}{\boldsymbol{A}})-g(\frac{\boldsymbol{x}}{\boldsymbol{A}})\right)\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x}\\
\text{and}\quad R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})&=-\int_{D_{n}\backslash(D_{n}-\boldsymbol{t})}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})g(\frac{\boldsymbol{x}}{\boldsymbol{A}})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x},\quad\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}.
\end{aligned}

(C.2)

In the theorem below, we obtain a rough bound for $R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . Let

\rho(x)=\min(x,1),\quad x\in\mathbb{R}.

Throughout this section, we let $C\in(0,\infty)$ be a generic constant that may vary from line to line.

Theorem C.1.

Let $h$ and $g$ be data tapers with compact support $[-1/2,1/2]^{d}$ . Suppose $\lim_{n\rightarrow\infty}|D_{n}|=\infty$ , $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}h(\boldsymbol{\omega})<\infty$ , and $g$ satisfies Assumption 3.4(i). Then,

\sup_{n\in\mathbb{N}}\sup_{\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}}|D_{n}|^{-1}|R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|<\infty

(C.3)

and

|D_{n}|^{-1}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|=o(1),~{}~{}\boldsymbol{t}\in\mathbb{R}^{d},~{}~{}n\rightarrow\infty.

(C.4)

If we further assume that $g$ is Lipschitz continuous on $[-1/2,1/2]^{d}$ , then the left hand side of (C.4) is bounded by $C\rho(\|\boldsymbol{t}/\boldsymbol{A}\|)$ for some $C\in(0,\infty)$ that does not depend on $\boldsymbol{t}\in\mathbb{R}^{d}$ .

Proof. Recall (C.2). We bound $R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ and $R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ separately. First, $R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ is bounded by

|R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\leq\sup_{\boldsymbol{\omega}}h(\boldsymbol{\omega})\int_{D_{n}\backslash(D_{n}-\boldsymbol{t})}\left|g(\boldsymbol{x}/\boldsymbol{A})\right|d\boldsymbol{x}\leq C|D_{n}\backslash(D_{n}-\boldsymbol{t})|.

Since $D_{n}$ is a rectangle, it is easily seen that

|D_{n}\backslash(D_{n}-\boldsymbol{t})|\leq|D_{n}|\rho(\sum_{i=1}^{d}|t_{i}/A_{i}|)\leq C|D_{n}|\rho(\|\boldsymbol{t}/\boldsymbol{A}\|),\quad\boldsymbol{t}\in\mathbb{R}^{d}.

(C.5)
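The first inequality in (C.5) can be checked numerically. The sketch below assumes that $D_{n}$ is the centered rectangle $\prod_{i}[-A_{i}/2,A_{i}/2]$ (our reading of (2.6), which is not reproduced in this appendix), so that $|D_{n}\cap(D_{n}-\boldsymbol{t})|=\prod_{i}\max(A_{i}-|t_{i}|,0)$ . The function names are hypothetical and only serve this illustration.

```python
import numpy as np

def rho(x):
    """rho(x) = min(x, 1), as defined before Theorem C.1."""
    return np.minimum(x, 1.0)

def shifted_difference_volume(A, t):
    """Volume of D_n \\ (D_n - t), assuming D_n = prod_i [-A_i/2, A_i/2]."""
    A = np.asarray(A, dtype=float)
    t = np.asarray(t, dtype=float)
    # The overlap D_n ∩ (D_n - t) is a rectangle with sides max(A_i - |t_i|, 0).
    overlap = np.prod(np.clip(A - np.abs(t), 0.0, None))
    return np.prod(A) - overlap

# Random check of |D_n \ (D_n - t)| <= |D_n| * rho(sum_i |t_i / A_i|).
rng = np.random.default_rng(0)
max_violation = -np.inf
for _ in range(10000):
    d = int(rng.integers(1, 5))
    A = rng.uniform(0.5, 20.0, size=d)
    t = rng.uniform(-1.5, 1.5, size=d) * A
    lhs = shifted_difference_volume(A, t) / np.prod(A)
    rhs = rho(np.sum(np.abs(t / A)))
    max_violation = max(max_violation, lhs - rhs)

print("largest value of lhs - rhs:", max_violation)
```

Under this assumption the inequality reduces to $1-\prod_{i}(1-a_{i})_{+}\leq\min(\sum_{i}a_{i},1)$ with $a_{i}=|t_{i}/A_{i}|$ , which is why the largest observed difference should be nonpositive up to rounding.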

Therefore, $|D_{n}|^{-1}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|$ is bounded by

|D_{n}|^{-1}|R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\leq C\rho(\|\boldsymbol{t}/\boldsymbol{A}\|),\quad n\in\mathbb{N},~{}~{}\boldsymbol{t}\in\mathbb{R}^{d}.

(C.6)

Next, we bound $R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . We note that

\begin{aligned}
&|D_{n}|^{-1}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\\
&\quad\leq|D_{n}|^{-1}\sup_{\boldsymbol{\omega}}h(\boldsymbol{\omega})\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}\left|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right|d\boldsymbol{x}\\
&\quad\leq C\sup_{\boldsymbol{x}\in D_{n}\cap(D_{n}-\boldsymbol{t})}\left|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right|.
\end{aligned}

(C.7)

Since $g$ is continuous on a compact support, $g$ is bounded and uniformly continuous in $[-1/2,1/2]^{d}$ . Therefore,

\sup_{\boldsymbol{t}\in\mathbb{R}^{d}}\sup_{\boldsymbol{x}\in D_{n}\cap(D_{n}-\boldsymbol{t})}\left|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right|<\infty

and for fixed $\boldsymbol{t}\in\mathbb{R}^{d}$ ,

\lim_{n\rightarrow\infty}\sup_{\boldsymbol{x}\in D_{n}\cap(D_{n}-\boldsymbol{t})}\left|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right|=0.

Substituting the above two results into (C.7), we have $\sup_{n\in\mathbb{N}}\sup_{\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}}|D_{n}|^{-1}|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|<\infty$ and

\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|D_{n}|^{-1}|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|=o(1),\quad\boldsymbol{t}\in\mathbb{R}^{d},~{}~{}n\rightarrow\infty.

Combining these results with (C.6), we show (C.3) and (C.4).

If we further assume that $g$ is Lipschitz continuous on $[-1/2,1/2]^{d}$ , then

\sup_{\boldsymbol{x}\in D_{n}\cap(D_{n}-\boldsymbol{t})}\left|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right|\leq C\rho(\|\boldsymbol{t}/\boldsymbol{A}\|),\quad n\in\mathbb{N},~{}~{}\boldsymbol{t}\in\mathbb{R}^{d}.

Therefore, from (C.6) and (C.7), we have

|D_{n}|^{-1}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\leq C\rho(\|\boldsymbol{t}/\boldsymbol{A}\|),\quad n\in\mathbb{N},~{}~{}\boldsymbol{t}\in\mathbb{R}^{d}.

This proves the last assertion. All together, we get the desired results. $\Box$
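As a numerical illustration of the Lipschitz bound in Theorem C.1, the following sketch evaluates $|D_{n}|^{-1}|R_{h,g}^{(n)}(t,\omega)|$ in dimension $d=1$ for a fixed lag $t$ and growing side length $A$ . It assumes $D_{n}=[-A/2,A/2]$ and uses the taper $h(y)=g(y)=\cos^{2}(\pi y)$ , which is a hypothetical choice made here and not one prescribed by the paper. The supremum over $\omega$ is only approximated on a finite grid.

```python
import numpy as np

def hann(y):
    """Hypothetical Lipschitz taper: cos^2(pi y) on [-1/2, 1/2], zero elsewhere."""
    return np.where(np.abs(y) <= 0.5, np.cos(np.pi * y) ** 2, 0.0)

def normalized_R(A, t, n_grid=4001, n_freq=101):
    """Grid approximation of |D_n|^{-1} sup_omega |R_{h,g}^{(n)}(t, omega)|, d = 1."""
    x = np.linspace(-A / 2, A / 2, n_grid)
    dx = x[1] - x[0]
    # Integrand of (C.1) without the complex exponential.
    integrand = hann(x / A) * (hann((x + t) / A) - hann(x / A))
    # Frequencies scaled by 1/A so that A * omega covers [-20, 20].
    omegas = np.linspace(-20.0, 20.0, n_freq) / A
    values = np.exp(-1j * np.outer(omegas, x)) @ integrand * dx
    return np.max(np.abs(values)) / A

t = 1.0
sizes = [10.0, 100.0, 1000.0]
vals = [normalized_R(A, t) for A in sizes]
for A, v in zip(sizes, vals):
    print(f"A = {A:7.1f}   |D_n|^-1 sup|R| ~ {v:.3e}   t/A = {t / A:.3e}")
```

For this taper the Lipschitz constant is $\pi$ , so the printed values should stay below a constant multiple of $\rho(|t/A|)$ and shrink as $A$ grows, in line with (C.4).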

To show the limiting variance of the integrated periodogram in (4.2), we require a sharper bound for $R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . Below, we give a bound for $|D_{n}|^{-1/2}|R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|$ , provided $h$ and $g$ are either constant on $[-1/2,1/2]^{d}$ or satisfy Assumption 3.4(ii) for $m=d+1$ . To do so, let $\{h_{\boldsymbol{j}}\}_{\boldsymbol{j}\in\mathbb{Z}^{d}}$ be the Fourier coefficients of $h(\cdot)$ , which satisfy

h(\boldsymbol{x})=\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}h_{\boldsymbol{j}}\exp(2\pi i\boldsymbol{j}^{\top}\boldsymbol{x}),\quad\boldsymbol{x}\in[-1/2,1/2]^{d}.

(C.8)

The Fourier coefficients $\{g_{\boldsymbol{j}}\}_{\boldsymbol{j}\in\mathbb{Z}^{d}}$ of $g(\cdot)$ are defined in the same manner. For a centered rectangle $R\subset\mathbb{R}^{d}$ (which may be degenerate), we let

c_{R}(\boldsymbol{\omega})=\begin{cases}(2\pi)^{-d/2}|R|^{-1/2}\int_{R}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x},&|R|>0,\\ 0,&|R|=0,\end{cases}\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(C.9)
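For a centered rectangle with side lengths $a_{1},\dots,a_{d}$ , the integral in (C.9) factorizes as $\prod_{i}2\sin(a_{i}\omega_{i}/2)/\omega_{i}$ . The sketch below implements this closed form and compares it with a midpoint Riemann sum. The function names `c_rect` and `c_rect_numeric` are hypothetical and introduced only for this check.

```python
import numpy as np

def c_rect(sides, omega):
    """Closed form of c_R(omega) in (C.9) for a centered rectangle with given side lengths."""
    sides = np.asarray(sides, dtype=float)
    omega = np.asarray(omega, dtype=float)
    vol = np.prod(sides)
    if vol == 0.0:
        return 0.0  # degenerate rectangle, second case of (C.9)
    # int_{-a/2}^{a/2} exp(-i x w) dx = a * sinc(a w / (2 pi)),
    # where np.sinc(z) = sin(pi z) / (pi z).
    integral = np.prod(sides * np.sinc(sides * omega / (2 * np.pi)))
    return (2 * np.pi) ** (-len(sides) / 2) * vol ** -0.5 * integral

def c_rect_numeric(sides, omega, n=400):
    """Midpoint Riemann-sum approximation of the same quantity."""
    axes = [(np.arange(n) + 0.5) / n * a - a / 2 for a in sides]
    grids = np.meshgrid(*axes, indexing="ij")
    phase = sum(g * w for g, w in zip(grids, omega))
    cell = np.prod(np.asarray(sides, dtype=float) / n)
    integral = np.sum(np.exp(-1j * phase)) * cell
    return (2 * np.pi) ** (-len(sides) / 2) * np.prod(sides) ** -0.5 * integral

closed = c_rect([3.0, 5.0], [0.7, -1.2])
numeric = c_rect_numeric([3.0, 5.0], [0.7, -1.2])
print("closed form:", closed, " Riemann sum:", numeric)
```

Because the rectangle is centered, the closed form is real valued, and translating the rectangle only multiplies the integral by a unit-modulus phase.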

Theorem C.2.

Let $h$ and $g$ be data tapers with compact support $[-1/2,1/2]^{d}$ . Suppose $h$ and $g$ are either constant on $[-1/2,1/2]^{d}$ or satisfy Assumption 3.4(ii) for $m=d+1$ . Then, the following two assertions hold.

(i)

$R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ satisfies the following identity

\begin{aligned}
R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})&=\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}h_{\boldsymbol{j}}g_{\boldsymbol{k}}\left(\exp(2\pi i\boldsymbol{k}^{\top}(\boldsymbol{t}/\boldsymbol{A}))-1\right)\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})}d\boldsymbol{x}\\
&\quad-\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}h_{\boldsymbol{j}}g_{\boldsymbol{k}}\int_{D_{n}\backslash(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})}d\boldsymbol{x},\quad n\in\mathbb{N},~{}~{}\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}.
\end{aligned}

(ii)

Let $m_{d}=2^{d}-1$ . Then, there exist $C\in(0,\infty)$ and $m_{d}+1$ sequences of centered rectangles (which may include degenerate rectangles) $\{D_{n,i}(\boldsymbol{t})\}_{i=0}^{m_{d}}$ , where $D_{n,i}(\boldsymbol{t})$ depends only on $D_{n}$ and $\boldsymbol{t}\in\mathbb{R}^{d}$ , such that for $n\in\mathbb{N}$ and $\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}$ ,

\begin{aligned}
&|D_{n}|^{-1/2}|R_{h,g}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\\
&\quad\leq C\sum_{i=0}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}|\rho(\{\|\boldsymbol{k}\|+1\}\|\boldsymbol{t}/\boldsymbol{A}\|)^{1/2}|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|.
\end{aligned}

Proof. First, we will show (i). We will assume that $h$ and $g$ both satisfy Assumption 3.4(ii) for $m=d+1$ . The case when either $h$ or $g$ is constant on $[-1/2,1/2]^{d}$ is straightforward since the corresponding Fourier coefficients for $h(\boldsymbol{x})\equiv c$ in $[-1/2,1/2]^{d}$ are $h_{\boldsymbol{j}}=c$ if $\boldsymbol{j}=\textbf{0}$ and zero otherwise. By using Folland (1999), Theorem 8.22(e) (see also an argument on page 257 of the same reference), we have $|h_{\boldsymbol{j}}|,|g_{\boldsymbol{j}}|\leq C(1+\|\boldsymbol{j}\|)^{-d-1}$ and thus both $\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}|$ and $\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|g_{\boldsymbol{j}}|$ are finite. For $\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}$ ,

\begin{aligned}
&R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})\\
&\quad=\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})\left(g(\frac{\boldsymbol{x}+\boldsymbol{t}}{\boldsymbol{A}})-g(\frac{\boldsymbol{x}}{\boldsymbol{A}})\right)e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}}d\boldsymbol{x}\\
&\quad=\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}h_{\boldsymbol{j}}g_{\boldsymbol{k}}\exp(2\pi i\boldsymbol{j}^{\top}(\boldsymbol{x}/\boldsymbol{A}))\exp(2\pi i\boldsymbol{k}^{\top}(\boldsymbol{x}/\boldsymbol{A}))\\
&\qquad\times\left(\exp(2\pi i\boldsymbol{k}^{\top}(\boldsymbol{t}/\boldsymbol{A}))-1\right)\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x}\\
&\quad=\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}h_{\boldsymbol{j}}g_{\boldsymbol{k}}\left(\exp(2\pi i\boldsymbol{k}^{\top}(\boldsymbol{t}/\boldsymbol{A}))-1\right)\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}.
\end{aligned}

(C.10)

Here, we use Fubini’s theorem in the last identity, which is justified since $\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}|<\infty$ and $\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|g_{\boldsymbol{j}}|<\infty$ . Similarly, for $\boldsymbol{t},\boldsymbol{\omega}\in\mathbb{R}^{d}$ ,

R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})=-\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}h_{\boldsymbol{j}}g_{\boldsymbol{k}}\int_{D_{n}\backslash(D_{n}-\boldsymbol{t})}\exp(-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega}))d\boldsymbol{x}.

(C.11)

Combining the above two expressions, we show (i).

Next, we will show (ii). We first focus on $R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . From (C.10), we have

\begin{aligned}
|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|&\leq\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}||\exp(2\pi i\boldsymbol{k}^{\top}(\boldsymbol{t}/\boldsymbol{A}))-1|\left|\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}\right|\\
&\leq C\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}|\rho(\|\boldsymbol{k}\|\|\boldsymbol{t}/\boldsymbol{A}\|)^{1/2}\left|\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}\right|.
\end{aligned}

Here, we use

|e^{i\boldsymbol{x}^{\top}\boldsymbol{y}}-e^{0}|\leq 2\rho(|\boldsymbol{x}^{\top}\boldsymbol{y}|)\leq 2\rho(\|\boldsymbol{x}\|\|\boldsymbol{y}\|)\leq 2\rho(\|\boldsymbol{x}\|\|\boldsymbol{y}\|)^{1/2},\quad\boldsymbol{x},\boldsymbol{y}\in\mathbb{R}^{d}

in the second inequality. We note that $D_{n}\cap(D_{n}-\boldsymbol{t})$ is also a rectangle, and $|c_{R+\boldsymbol{x}}(\boldsymbol{\omega})|=|c_{R}(\boldsymbol{\omega})|$ for all $\boldsymbol{x},\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Therefore, for the centered version of the rectangle $D_{n}\cap(D_{n}-\boldsymbol{t})$ , denoted $D_{n,0}(\boldsymbol{t})$ , we have

\begin{aligned}
\left|\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}\right|&=(2\pi)^{d/2}|D_{n}\cap(D_{n}-\boldsymbol{t})|^{1/2}|c_{D_{n,0}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|\\
&\leq C|D_{n}|^{1/2}|c_{D_{n,0}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|.
\end{aligned}

Substituting this into the upper bound of $|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|$ and using that $\rho$ is nondecreasing, we have

\begin{aligned}
&|D_{n}|^{-1/2}|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\\
&\quad\leq C\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}|\rho(\{\|\boldsymbol{k}\|+1\}\|\boldsymbol{t}/\boldsymbol{A}\|)^{1/2}|c_{D_{n,0}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|.
\end{aligned}

(C.12)

Secondly, we focus on $R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})$ . We first note that $D_{n}\backslash(D_{n}-\boldsymbol{t})$ can be written as a disjoint union of a finite number of rectangles, where the number of rectangles is at most $m_{d}=2^{d}-1$ . See Figure C.1 below for an example with $d=2$ .

Let $D_{n}\backslash(D_{n}-\boldsymbol{t})=\cup_{i=1}^{m_{d}}\widetilde{D}_{n,i}(\boldsymbol{t})$ be this disjoint union of rectangles and let $D_{n,i}(\boldsymbol{t})$ be the centered version of $\widetilde{D}_{n,i}(\boldsymbol{t})$ . Then, by using (C.11), we have

\begin{aligned}
|R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|&\leq(2\pi)^{d/2}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}|\sum_{i=1}^{m_{d}}|D_{n,i}(\boldsymbol{t})|^{1/2}|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|\\
&\leq(2\pi)^{d/2}\sum_{i=1}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}||D_{n}\backslash(D_{n}-\boldsymbol{t})|^{1/2}|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|\\
&\leq C|D_{n}|^{1/2}\sum_{i=1}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}|\rho(\|\boldsymbol{t}/\boldsymbol{A}\|)^{1/2}|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|.
\end{aligned}

Here, we use (C.5) in the last inequality above. Therefore, we have

\begin{aligned}
&|D_{n}|^{-1/2}|R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})|\\
&\quad\leq C\sum_{i=1}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||g_{\boldsymbol{k}}|\rho\left((\|\boldsymbol{k}\|+1)\|\boldsymbol{t}/\boldsymbol{A}\|\right)^{1/2}|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|.
\end{aligned}

(C.13)

Combining (C.12) and (C.13), we get the desired result. All together, we prove the theorem. $\Box$

Next, we focus on $H_{h,k}^{(n)}(\cdot)$ in (2.7). The following lemma provides an approximation of $H_{h,k}^{(n)}(\cdot)$ .

Lemma C.1.

Let $h(\cdot)$ be a data taper that satisfies Assumption 3.4(i). Then, for $k\in\{2,3,\dots\}$ ,

\sup_{n\in\mathbb{N}}\sup_{\boldsymbol{\omega},\boldsymbol{u}_{1},\dots,\boldsymbol{u}_{k-1}\in\mathbb{R}^{d}}|D_{n}|^{-1}\bigg{|}H_{h,k}^{(n)}(\boldsymbol{\omega})-\int_{\mathbb{R}^{d}}\left(h(\boldsymbol{x}/\boldsymbol{A})\prod_{j=1}^{k-1}h((\boldsymbol{x}+\boldsymbol{u}_{j})/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})\right)d\boldsymbol{x}\bigg{|}<\infty.

Suppose $\lim_{n\rightarrow\infty}|D_{n}|=\infty$ . Then, for fixed $\boldsymbol{u}_{1},\dots,\boldsymbol{u}_{k-1}\in\mathbb{R}^{d}$ , as $n\rightarrow\infty$ ,

|D_{n}|^{-1}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}\bigg{|}H_{h,k}^{(n)}(\boldsymbol{\omega})-\int_{\mathbb{R}^{d}}\left(h(\boldsymbol{x}/\boldsymbol{A})\prod_{j=1}^{k-1}h((\boldsymbol{x}+\boldsymbol{u}_{j})/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})\right)d\boldsymbol{x}\bigg{|}=o(1).

If we further assume that $h$ is Lipschitz continuous on $[-1/2,1/2]^{d}$ , then the left hand side above is bounded by $C\sum_{j=1}^{k-1}\rho(\|\boldsymbol{u}_{j}/\boldsymbol{A}\|)$ uniformly in $\boldsymbol{\omega}\in\mathbb{R}^{d}$ .

Proof. We will show the lemma for $k=2$ . The case for $k\geq 3$ is treated similarly (cf. Brillinger (1981), page 402). For $k=2$ , the left hand side above is equal to $|D_{n}|^{-1}|R_{h,h}^{(n)}(\boldsymbol{u}_{1},\boldsymbol{\omega})|$ . Thus, by Theorem C.1, the above is true for $k=2$ . $\Box$

Next, we bound $H_{h,k}^{(n)}(\boldsymbol{\omega})$ for $\boldsymbol{\omega}\neq\textbf{0}$ . The lemma below, together with Theorem C.1, is used to prove the asymptotic orthogonality of the DFT in Theorem 3.1.

Lemma C.2.

Let $\{D_{n}\}$ satisfy Assumption 3.1. Let $h$ be a data taper such that $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}h(\boldsymbol{\omega})<\infty$ . Let $\{\boldsymbol{\omega}_{n}\}$ be a sequence on $\mathbb{R}^{d}$ that is asymptotically distant from $\{\textbf{0}\}$ . Then,

|D_{n}|^{-1}|H_{h,k}^{(n)}(\boldsymbol{\omega}_{n})|=o(1),\quad n\rightarrow\infty.

(C.14)

Proof. Since $h$ has a compact support and $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}h(\boldsymbol{\omega})<\infty$ , $h^{k}\in L^{1}(\mathbb{R}^{d})$ . Therefore,

\begin{aligned}
|D_{n}|^{-1}H_{h,k}^{(n)}(\boldsymbol{\omega}_{n})&=|D_{n}|^{-1}\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{k}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}_{n})d\boldsymbol{x}=\int_{[-1/2,1/2]^{d}}h(\boldsymbol{y})^{k}\exp(-i\boldsymbol{y}^{\top}(\boldsymbol{A}\cdot\boldsymbol{\omega}_{n}))d\boldsymbol{y}\\
&=(2\pi)^{d}\mathcal{F}^{-1}(h^{k})(\boldsymbol{A}\cdot\boldsymbol{\omega}_{n}).
\end{aligned}

Here, we use the change of variables $\boldsymbol{y}=\boldsymbol{x}/\boldsymbol{A}$ in the second identity. Note that $\|\boldsymbol{A}\cdot\boldsymbol{\omega}_{n}\|_{\infty}\geq C|D_{n}|^{1/d}\|\boldsymbol{\omega}_{n}\|_{\infty}\rightarrow\infty$ as $n\rightarrow\infty$ due to Assumption 3.1. Therefore, by the Riemann-Lebesgue lemma, we have $\lim_{n\rightarrow\infty}|\mathcal{F}^{-1}(h^{k})(\boldsymbol{A}\cdot\boldsymbol{\omega}_{n})|=0$ . Thus, we get the desired result. $\Box$
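The decay in (C.14) can be illustrated numerically in dimension $d=1$ . The sketch below assumes $D_{n}=[-A/2,A/2]$ and, purely for illustration, the taper $h(y)=\cos^{2}(\pi y)$ on $[-1/2,1/2]$ . After the change of variables used in the proof, $|D_{n}|^{-1}H_{h,k}^{(n)}(\omega)$ is an integral over $[-1/2,1/2]$ , which is approximated here by a midpoint rule.

```python
import numpy as np

def normalized_H(A, omega, k=1, n_grid=200001):
    """Midpoint approximation of |D_n|^{-1} H_{h,k}^{(n)}(omega) for d = 1.

    Uses the hypothetical taper h(y) = cos^2(pi y) on [-1/2, 1/2] and the
    identity |D_n|^{-1} H = int_{-1/2}^{1/2} h(y)^k exp(-i y A omega) dy.
    """
    y = (np.arange(n_grid) + 0.5) / n_grid - 0.5  # midpoints of [-1/2, 1/2]
    h = np.cos(np.pi * y) ** 2
    # The interval has length one, so the midpoint rule is a plain average.
    return np.mean(h ** k * np.exp(-1j * y * A * omega))

omega = 0.5  # a fixed nonzero frequency
sizes = [10.0, 100.0, 1000.0]
mags = [abs(normalized_H(A, omega)) for A in sizes]
for A, m in zip(sizes, mags):
    print(f"A = {A:7.1f}   | |D_n|^-1 H_(h,1)(omega) | ~ {m:.3e}")
```

At $\omega=0$ the same function returns approximately $\int h=1/2$ , while for the fixed nonzero frequency the magnitude shrinks as $A$ grows, as the Riemann-Lebesgue argument predicts.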

Finally, we introduce a generalization of the Fejér kernel associated with the data taper $h$ (cf. Matsuda and Yajima (2009), Equation (21)). For $h(\cdot)$ with support on $[-1/2,1/2]^{d}$ , let

F_{h,n}(\boldsymbol{\omega})=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}|H_{h,1}^{(n)}(\boldsymbol{\omega})|^{2}=|c_{h,n}(\boldsymbol{\omega})|^{2},\quad n\in\mathbb{N},~{}~{}\boldsymbol{\omega}\in\mathbb{R}^{d}.

(C.15)

where $c_{h,n}(\boldsymbol{\omega})$ is defined as in (2.10) and $H_{h,2}=\int_{[-1/2,1/2]^{d}}h(\boldsymbol{x})^{2}d\boldsymbol{x}$ .

Lemma C.3.

Let $c_{h,n}(\boldsymbol{\omega})$ and $F_{h,n}(\boldsymbol{\omega})$ be defined as in (2.10) and (C.15), respectively. Then, the following assertions hold:

(a)

$\int_{\mathbb{R}^{d}}F_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega}=1$ .

(b)

Suppose that Assumptions 3.1 and 3.4(ii) (for $m=1$ ) hold. Then, for $\phi(\boldsymbol{\omega})$ with bounded second derivatives, we have

\int_{\mathbb{R}^{d}}\phi(\boldsymbol{\omega})F_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega}=\phi(\textbf{0})+O(|D_{n}|^{-2/d}),\quad n\rightarrow\infty.

(c)

Suppose that Assumptions 3.1 and 3.4(ii) (for $m=d+1$ ) hold. Then, for a bounded function $\phi$ ,

\left|\int_{\mathbb{R}^{d}}\phi(\boldsymbol{\omega})c_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega}\right|=O(|D_{n}|^{-1/2}),\quad n\rightarrow\infty.

(d)

For a bounded function $\phi(\boldsymbol{\omega})$ , $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , there exists $C\in(0,\infty)$ which depends only on $\phi(\cdot)$ such that

\int_{\mathbb{R}^{d}}|\phi(\boldsymbol{\omega})c_{h_{1},n}(\boldsymbol{\omega})c_{h_{2},n}(\boldsymbol{\omega}+\boldsymbol{u})|d\boldsymbol{\omega}\leq C,\qquad n\in\mathbb{N}.

(e)

For a bounded and compactly supported function $\phi(\boldsymbol{\omega})$ , there exists $C\in(0,\infty)$ which depends only on $\phi(\cdot)$ such that

$\int_{\mathbb{R}^{d}}|\phi(\boldsymbol{\omega})c_{h,n}(\boldsymbol{\omega})|d\boldsymbol{\omega}\leq C,\quad n\in\mathbb{N}.$

Proof. (a)–(d) are due to Matsuda and Yajima (2009), Lemmas 1 and 2. The upper bound of (d) is $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|\phi(\boldsymbol{\omega})|$ which depends only on $\phi(\cdot)$ . To show (e), let $D\subset\mathbb{R}^{d}$ be the compact support of $\phi$ . Then, by using Cauchy-Schwarz inequality and point (a),

\int_{\mathbb{R}^{d}}|\phi(\boldsymbol{\omega})c_{h,n}(\boldsymbol{\omega})|d\boldsymbol{\omega}\leq\left(\int_{D}\phi(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}\right)^{1/2},\quad n\in\mathbb{N}.

Thus, we prove (e). All together, we get the desired results. $\Box$
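Part (a) of Lemma C.3 can be checked numerically in the simplest setting. The sketch below assumes $d=1$ , $D_{n}=[-A/2,A/2]$ , and the constant taper $h\equiv 1$ on $[-1/2,1/2]$ , so that $H_{h,2}=1$ and $H_{h,1}^{(n)}(\omega)=2\sin(A\omega/2)/\omega$ . The function names are hypothetical, and the integral over $\mathbb{R}$ is truncated to a finite window.

```python
import numpy as np

def fejer_untapered(omega, A):
    """F_{h,n}(omega) in (C.15) for d = 1, h = 1 on [-1/2, 1/2], D_n = [-A/2, A/2]."""
    # H_{h,1}^{(n)}(omega) = A * sinc(A omega / (2 pi)), with np.sinc(z) = sin(pi z)/(pi z).
    H1 = A * np.sinc(A * omega / (2 * np.pi))
    return H1 ** 2 / (2 * np.pi * A)

def kernel_mass(A, half_width, n_points):
    """Riemann-sum approximation of the integral of F_{h,n} over [-half_width, half_width]."""
    omega = np.linspace(-half_width, half_width, n_points)
    step = omega[1] - omega[0]
    return np.sum(fejer_untapered(omega, A)) * step

total = kernel_mass(10.0, 500.0, 1_000_001)   # wide window, close to the full integral
near10 = kernel_mass(10.0, 1.0, 20_001)       # mass within |omega| <= 1 for A = 10
near100 = kernel_mass(100.0, 1.0, 20_001)     # same window for A = 100
print("total mass:", total)
print("mass in [-1, 1]:", near10, "(A = 10),", near100, "(A = 100)")
```

The total mass should be close to one, and the share of mass near the origin should increase with $A$ , which is the concentration behind the approximation in part (b).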

Appendix D Bounds for the cumulants of the DFT

In this section, we study the expressions and bounds of terms that are written in terms of the product of cumulants of the DFT. These results are essential to prove the asymptotic orthogonality of the DFTs in Theorem 3.1 and to derive the limit of the variance of the integrated periodogram in Section 4. Throughout the section, we let $C\in(0,\infty)$ be a generic constant that may vary from line to line.

D.1 Expressions of the covariance of the DFTs

To begin with, we obtain the expressions for $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$ in terms of the second-order measures. These expressions were also verified in Rajala et al. (2023), Proposition IV.1 and Equation (47).

Theorem D.1.

Let $X$ be a second-order stationary point process on $\mathbb{R}^{d}$ and let $h$ be a data taper such that $\sup_{\boldsymbol{x}\in\mathbb{R}^{d}}h(\boldsymbol{x})<\infty$ . Suppose that Assumption 3.2 holds for $\ell=2$ . Then, for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ , we have

\begin{aligned}
&\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\\
&\quad=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\lambda\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{2}e^{-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})}d\boldsymbol{x}\\
&\quad~{}~{}+(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{D_{n}^{2}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2})}\gamma_{2,\text{red}}(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}
\end{aligned}

(D.1)

=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{D_{n}^{2}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2})}C(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}.

(D.2)

Suppose that Assumptions 4.1(i) and 3.2 (for $\ell=2$ ) hold. Then, for $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , we have

\mathrm{var}(J_{h,n}(\boldsymbol{\omega}))=\int_{\mathbb{R}^{d}}f(\boldsymbol{x})|c_{h,n}(\boldsymbol{\omega}-\boldsymbol{x})|^{2}d\boldsymbol{x}.

(D.3)

Proof. We will only show (D.1) and (D.2) for the non-tapered case. The general case can be treated similarly. Since $J_{n}(\boldsymbol{\omega})=\mathcal{J}_{n}(\boldsymbol{\omega})-\lambda c_{n}(\boldsymbol{\omega})$ is the centered DFT and $\lambda c_{n}(\boldsymbol{\omega})$ is deterministic, we have

\mathrm{cov}(J_{n}(\boldsymbol{\omega}_{1}),J_{n}(\boldsymbol{\omega}_{2}))=\mathbb{E}[\mathcal{J}_{n}(\boldsymbol{\omega}_{1})\mathcal{J}_{n}(-\boldsymbol{\omega}_{2})]-\lambda^{2}c_{n}(\boldsymbol{\omega}_{1})c_{n}(-\boldsymbol{\omega}_{2}).

By using (2.9),

\mathcal{J}_{n}(\boldsymbol{\omega}_{1})\mathcal{J}_{n}(-\boldsymbol{\omega}_{2})=(2\pi)^{-d}|D_{n}|^{-1}\bigg{(}\sum_{\boldsymbol{x}\in X\cap D_{n}}\exp(-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2}))+\sum_{\boldsymbol{x}\neq\boldsymbol{y}\in X\cap D_{n}}\exp(-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2}))\bigg{)}.

Thus, by using (2.1) for both terms above, we have

\begin{aligned}
\mathbb{E}[\mathcal{J}_{n}(\boldsymbol{\omega}_{1})\mathcal{J}_{n}(-\boldsymbol{\omega}_{2})]&=(2\pi)^{-d}|D_{n}|^{-1}\bigg{\{}\lambda\int_{D_{n}}e^{-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})}d\boldsymbol{x}\\
&\quad+\int_{D_{n}}\int_{D_{n}}\exp(-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2}))\lambda_{2,\text{red}}(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}\bigg{\}}.
\end{aligned}

Next, by using that $\int_{D_{n}^{2}}\exp(-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2}))d\boldsymbol{x}d\boldsymbol{y}=(2\pi)^{d}|D_{n}|c_{n}(\boldsymbol{\omega}_{1})c_{n}(-\boldsymbol{\omega}_{2})$ and $\gamma_{2,\text{red}}(\boldsymbol{x})=\lambda_{2,\text{red}}(\boldsymbol{x})-\lambda^{2}$ , we have

\begin{aligned}
&\mathbb{E}[\mathcal{J}_{n}(\boldsymbol{\omega}_{1})\mathcal{J}_{n}(-\boldsymbol{\omega}_{2})]-\lambda^{2}c_{n}(\boldsymbol{\omega}_{1})c_{n}(-\boldsymbol{\omega}_{2})=(2\pi)^{-d}|D_{n}|^{-1}\lambda\int_{D_{n}}e^{-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})}d\boldsymbol{x}\\
&\quad+(2\pi)^{-d}|D_{n}|^{-1}\int_{D_{n}}\int_{D_{n}}\exp(-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2}))\gamma_{2,\text{red}}(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}.
\end{aligned}

Therefore, we show (D.1) for $h(\boldsymbol{x})\equiv 1$ on $[-1/2,1/2]^{d}$ .

(D.2) can be easily seen by using the above identity and $C(\boldsymbol{x})=\lambda\delta(\boldsymbol{x})+\gamma_{2,\text{red}}(\boldsymbol{x})$ .

Lastly, to show (D.3), by substituting (4.4) into (D.1) for $\boldsymbol{\omega}_{1}=\boldsymbol{\omega}_{2}=\boldsymbol{\omega}$ , we have

\begin{aligned}
&\mathrm{var}(J_{h,n}(\boldsymbol{\omega}))\\
&\quad=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\lambda\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{2}d\boldsymbol{x}+(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{D_{n}^{2}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})\\
&\qquad\times e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}-\boldsymbol{y}^{\top}\boldsymbol{\omega})}\int_{\mathbb{R}^{d}}e^{i(\boldsymbol{x}-\boldsymbol{y})^{\top}\boldsymbol{t}}(f(\boldsymbol{t})-(2\pi)^{-d}\lambda)d\boldsymbol{t}d\boldsymbol{x}d\boldsymbol{y}.
\end{aligned}

Since $f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda\in L^{1}(\mathbb{R}^{d})$ due to Assumption 4.1(i) and $\sup_{\boldsymbol{x}\in\mathbb{R}^{d}}h(\boldsymbol{x})<\infty$ , we can apply Fubini’s theorem to interchange the order of integration above and get

\begin{aligned}
&\mathrm{var}(J_{h,n}(\boldsymbol{\omega}))\\
&\quad=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\lambda\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{2}d\boldsymbol{x}+(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{\mathbb{R}^{d}}d\boldsymbol{t}(f(\boldsymbol{t})-(2\pi)^{-d}\lambda)\\
&\qquad\times\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})e^{-i\boldsymbol{x}^{\top}(\boldsymbol{\omega}-\boldsymbol{t})}d\boldsymbol{x}\int_{D_{n}}h(\boldsymbol{y}/\boldsymbol{A})e^{-i\boldsymbol{y}^{\top}(\boldsymbol{t}-\boldsymbol{\omega})}d\boldsymbol{y}\\
&\quad=(2\pi)^{-d}\lambda+\int_{\mathbb{R}^{d}}\{f(\boldsymbol{t})-(2\pi)^{-d}\lambda\}|c_{h,n}(\boldsymbol{\omega}-\boldsymbol{t})|^{2}d\boldsymbol{t}=\int_{\mathbb{R}^{d}}f(\boldsymbol{t})|c_{h,n}(\boldsymbol{\omega}-\boldsymbol{t})|^{2}d\boldsymbol{t}.
\end{aligned}

Here, we use (2.10) and $\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{2}d\boldsymbol{x}=H_{h,2}|D_{n}|$ in the second identity and Lemma C.3(a) in the last identity. Thus, we get the desired results.

All together, we prove the theorem. $\Box$
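A quick Monte Carlo sanity check of (D.1) is possible for a homogeneous Poisson process, where the reduced covariance intensity $\gamma_{2,\text{red}}$ vanishes and (D.1) with $\boldsymbol{\omega}_{1}=\boldsymbol{\omega}_{2}$ reduces to $\mathrm{var}(J_{n}(\boldsymbol{\omega}))=(2\pi)^{-d}\lambda$ . The sketch below takes $d=1$ , no taper, and $D_{n}=[-A/2,A/2]$ . It assumes the normalization $\mathcal{J}_{n}(\omega)=(2\pi)^{-1/2}|D_{n}|^{-1/2}\sum_{x\in X\cap D_{n}}e^{-ix\omega}$ , which is our reading of (2.9) based on the product formula in the proof above. The function name and the parameter values are hypothetical.

```python
import numpy as np

def centered_dft_poisson(lam, A, omega, rng):
    """One draw of the centered, untapered DFT J_n(omega) of a homogeneous
    Poisson process with intensity lam on D_n = [-A/2, A/2] (d = 1)."""
    n_pts = rng.poisson(lam * A)
    x = rng.uniform(-A / 2, A / 2, size=n_pts)
    scale = (2 * np.pi * A) ** -0.5
    raw = scale * np.sum(np.exp(-1j * x * omega))
    # lam * c_n(omega), using int_{-A/2}^{A/2} exp(-i x w) dx = A * sinc(A w / (2 pi)).
    mean = lam * scale * A * np.sinc(A * omega / (2 * np.pi))
    return raw - mean

rng = np.random.default_rng(1)
lam, A, omega = 2.0, 50.0, 0.7
draws = np.array([centered_dft_poisson(lam, A, omega, rng) for _ in range(4000)])
mc_var = np.mean(np.abs(draws) ** 2)   # Monte Carlo estimate of var(J_n(omega))
target = lam / (2 * np.pi)             # (2 pi)^{-d} * lambda with d = 1
print("Monte Carlo variance:", mc_var, " target:", target)
```

The agreement is only up to Monte Carlo error, and the check covers only the first term of (D.1). A clustered or repulsive process would be needed to exercise the $\gamma_{2,\text{red}}$ term.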

Now, we give an expression of $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$ in terms of $H_{h,k}^{(n)}$ and $R_{h,g}^{(n)}$ in (2.7) and (C.1), respectively.

Lemma D.1.

Suppose that Assumption 3.2 holds for $\ell=2$ . Then, for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ ,

\begin{aligned}
&\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\\
&\quad=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{\mathbb{R}^{d}}e^{-i\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}}C(\boldsymbol{u})\left(H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})+R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})\right)d\boldsymbol{u}.
\end{aligned}

Proof. By using (D.2) and the fact that $h(\cdot/\boldsymbol{A})$ is supported on $D_{n}$ ,

	$\displaystyle\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$
	$\displaystyle~{}~{}=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2})}C(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}$
	$\displaystyle~{}~{}=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{\mathbb{R}^{d}}d\boldsymbol{u}e^{-i\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}}C(\boldsymbol{u})\int_{\mathbb{R}^{d}}h((\boldsymbol{u}+\boldsymbol{v})/\boldsymbol{A})h(\boldsymbol{v}/\boldsymbol{A})e^{-i\boldsymbol{v}^{\top}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})}d\boldsymbol{v}$
	$\displaystyle~{}~{}=(2\pi)^{-d}H_{h,2}^{-1}|D_{n}|^{-1}\int_{\mathbb{R}^{d}}d\boldsymbol{u}e^{-i\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}}C(\boldsymbol{u})\left(H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})+R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})\right).$

Here, we use the change of variables $\boldsymbol{u}=\boldsymbol{x}-\boldsymbol{y}$ and $\boldsymbol{v}=\boldsymbol{y}$ in the second identity. Thus, we get the desired result. $\Box$

Using the above lemma together with the bound for $R_{h,h}^{(n)}(\cdot)$ in Theorem C.1, we obtain the leading term of $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$ .

Theorem D.2.

Suppose that Assumptions 3.1, 3.2 (for $\ell=2$ ) and 3.4(i) hold. Let $f$ be the spectral density function and let $H_{h,k}^{(n)}$ be defined as in (2.7). Then, for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ ,

\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))=|D_{n}|^{-1}H_{h,2}^{-1}f(\boldsymbol{\omega}_{1})H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})+o(1),\quad~{}~{}n\rightarrow\infty.

(D.4)

Here, the $o(1)$ error above is uniform over $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ .

Proof. By using Lemma D.1 and $C(\cdot)=\lambda\delta(\cdot)+\gamma_{2,\text{red}}(\cdot)$ , the first term in the expansion of $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$ is

	$\displaystyle(2\pi)^{-d}\lambda|D_{n}|^{-1}H_{h,2}^{-1}H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})+|D_{n}|^{-1}H_{h,2}^{-1}H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})\mathcal{F}^{-1}(\gamma_{2,\text{red}})(\boldsymbol{\omega}_{1})$
	$\displaystyle~{}~{}=|D_{n}|^{-1}H_{h,2}^{-1}\left((2\pi)^{-d}\lambda+\mathcal{F}^{-1}(\gamma_{2,\text{red}})(\boldsymbol{\omega}_{1})\right)H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})$
	$\displaystyle~{}~{}=|D_{n}|^{-1}H_{h,2}^{-1}\cdot f(\boldsymbol{\omega}_{1})H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2}).$

Here, we use (2.5) in the last identity. Similarly, the difference between $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$ and $|D_{n}|^{-1}H_{h,2}^{-1}f(\boldsymbol{\omega}_{1})H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})$ is bounded by

C|D_{n}|^{-1}|R_{h,h}^{(n)}(\textbf{0},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})|+C|D_{n}|^{-1}\int_{\mathbb{R}^{d}}|\gamma_{2,\text{red}}(\boldsymbol{u})||R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})|d\boldsymbol{u}.

(D.5)

By using Theorem C.1, the first term above is $o(1)$ as $n\rightarrow\infty$ uniformly in $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ . To bound the second term, by using Theorem C.1 again, we have $|D_{n}|^{-1}|\gamma_{2,\text{red}}(\boldsymbol{u})||R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})|\leq C|\gamma_{2,\text{red}}(\boldsymbol{u})|\in L^{1}(\mathbb{R}^{d})$ and $\lim_{n\rightarrow\infty}|D_{n}|^{-1}|\gamma_{2,\text{red}}(\boldsymbol{u})||R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})|=0$ for each $\boldsymbol{u}\in\mathbb{R}^{d}$ . Therefore, by the dominated convergence theorem, the second term above is $o(1)$ as $n\rightarrow\infty$ uniformly in $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ . Thus, we get the desired result. $\Box$
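To illustrate the leading term in (D.4), the following Python sketch evaluates the normalized factor $|D_{n}|^{-1}H_{h,2}^{-1}H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})$ numerically in $d=1$ . It assumes $H_{h,k}^{(n)}(\boldsymbol{\omega})=\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{k}e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}}d\boldsymbol{x}$ , which is how we read (2.7) from its use in Lemma D.1, together with a hypothetical cosine taper and $D_{n}=[-A/2,A/2]$ ; none of these choices come from the paper.

```python
import numpy as np

def H_h2_n(omega, A, n_grid=200_000):
    """Midpoint-rule approximation of
    int_{-A/2}^{A/2} h(x/A)^2 * exp(-i*x*omega) dx
    for the hypothetical cosine taper h(u) = cos(pi*u), d = 1."""
    dx = A / n_grid
    x = -A / 2 + (np.arange(n_grid) + 0.5) * dx
    return np.sum(np.cos(np.pi * x / A) ** 2 * np.exp(-1j * x * omega)) * dx

def normalized_leading_factor(omega_diff, A):
    """|D_n|^{-1} H_{h,2}^{-1} H_{h,2}^{(n)}(omega_1 - omega_2),
    with |D_n| = A and H_{h,2} = 1/2 for the cosine taper."""
    return H_h2_n(omega_diff, A) / (A * 0.5)

for A in (10.0, 100.0, 1000.0):
    same = abs(normalized_leading_factor(0.0, A))
    apart = abs(normalized_leading_factor(0.5, A))
    print(f"A = {A:7.1f}: same frequency {same:.4f}, frequencies 0.5 apart {apart:.2e}")
```

Under these assumptions the factor equals one when the two frequencies coincide and becomes negligible for a fixed nonzero frequency gap as the window grows, so the leading term of the covariance then vanishes.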

D.2 Bounds on the terms involving covariances

Let

	$\displaystyle T_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2})$	$\displaystyle=|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})$		(D.6)
		$\displaystyle~{}~{}\times C(\boldsymbol{x}-\boldsymbol{y})C(\boldsymbol{t}_{1}+\boldsymbol{x}-\boldsymbol{t}_{2}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y},~{}~{}\boldsymbol{t}_{1},\boldsymbol{t}_{2}\in\mathbb{R}^{d}.$

The term $T_{n}$ appears in the integrated form of $\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))$ in the proof of Theorem 4.1. Below, we give an approximation of $T_{n}$ . Let

\widetilde{f}(\boldsymbol{\omega})=f(\boldsymbol{\omega})-(2\pi)^{-d}\lambda,\qquad\boldsymbol{\omega}\in\mathbb{R}^{d},

(D.7)

and let

		$\displaystyle R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})$		(D.8)
		$\displaystyle~{}~{}=H_{h,2}^{(n)}(-\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})R_{h,h}^{(n)}(\boldsymbol{t}_{2},\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})+H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})R_{h,h}^{(n)}(\boldsymbol{t}_{1},-\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})$
		$\displaystyle~{}~{}~{}~{}+R_{h,h}^{(n)}(\boldsymbol{t}_{2},\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})R_{h,h}^{(n)}(\boldsymbol{t}_{1},-\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2}),\quad n\in\mathbb{N},~{}~{}\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}.$

Lemma D.2.

Let $\gamma_{2,\text{red}}$ and $C(\cdot)$ be the reduced second-order cumulant intensity function and the complete covariance function defined as in (2.4). Suppose that Assumptions 3.1, 3.2 (for $\ell=2$ ), and 4.1(i) hold. Furthermore, suppose that the data taper $h$ is Lipschitz continuous on $[-1/2,1/2]^{d}$ . Then, for $\boldsymbol{t}_{1},\boldsymbol{t}_{2}\in D_{n}-D_{n}$ ,

$\displaystyle T_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2})$	$\displaystyle=$	$\displaystyle\lambda^{2}\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left(H_{h,4}+O(|D_{n}|^{-1/d})\|\boldsymbol{t}_{2}\|\right)$
		$\displaystyle+2\lambda\gamma_{2,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left(H_{h,4}+O(|D_{n}|^{-1/d})[\|\boldsymbol{t}_{1}\|+\|\boldsymbol{t}_{2}\|]\right)$
		$\displaystyle+\int_{\mathbb{R}^{2d}}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})\bigg{(}(2\pi)^{d}H_{h,4}F_{h^{2},n}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})$
		$\displaystyle\quad+|D_{n}|^{-1}R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})\bigg{)}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},\qquad n\rightarrow\infty,$

where $F_{h^{2},n}$ is the Fejér kernel in (C.15) based on $h^{2}$ .

Proof. Recall $C(\boldsymbol{x})=\lambda\delta(\boldsymbol{x})+\gamma_{2,\text{red}}(\boldsymbol{x})$ . Substituting this into (D.6), $T_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2})$ can be decomposed into four terms. The first term is

\displaystyle|D_{n}|^{-1}\lambda^{2}\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})\delta(\boldsymbol{x}-\boldsymbol{y})\delta(\boldsymbol{t}_{1}+\boldsymbol{x}-\boldsymbol{t}_{2}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}.

The above term vanishes unless $\boldsymbol{t}_{1}=\boldsymbol{t}_{2}$ . Therefore, the first term is equal to

|D_{n}|^{-1}\lambda^{2}\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\int_{\mathbb{R}^{d}}h(\frac{\boldsymbol{y}}{\boldsymbol{A}})^{2}h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})^{2}d\boldsymbol{y}.

By applying Lemma C.1, we have

|D_{n}|^{-1}\lambda^{2}\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\int_{\mathbb{R}^{d}}h(\frac{\boldsymbol{y}}{\boldsymbol{A}})^{2}h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})^{2}d\boldsymbol{y}=\lambda^{2}\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left(H_{h,4}+C\rho(\|\boldsymbol{t}_{2}/\boldsymbol{A}\|)\right),~{}n\rightarrow\infty.

Here, we use $|D_{n}|^{-1}\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{4}d\boldsymbol{x}=\int_{[-1/2,1/2]^{d}}h(\boldsymbol{x})^{4}d\boldsymbol{x}=H_{h,4}$ in the identity. Moreover, under Assumption 3.1, it is easily seen that $\rho(\|\boldsymbol{x}/\boldsymbol{A}\|)\leq\|\boldsymbol{x}/\boldsymbol{A}\|=O(|D_{n}|^{-1/d})\|\boldsymbol{x}\|$ as $n\rightarrow\infty$ . Therefore, the first term is $\lambda^{2}\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left(H_{h,4}+O(|D_{n}|^{-1/d})\|\boldsymbol{t}_{2}\|\right)$ as $n\rightarrow\infty$ .

Similarly, the second and third terms are equal to

	$\displaystyle|D_{n}|^{-1}\lambda\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})\delta(\boldsymbol{x}-\boldsymbol{y})\gamma_{2,\text{red}}(\boldsymbol{t}_{1}+\boldsymbol{x}-\boldsymbol{t}_{2}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}$
	$\displaystyle=|D_{n}|^{-1}\lambda\gamma_{2,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\int_{\mathbb{R}^{d}}h(\frac{\boldsymbol{y}}{\boldsymbol{A}})^{2}h(\frac{\boldsymbol{y}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})d\boldsymbol{y}$
	$\displaystyle=\lambda\gamma_{2,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left(H_{h,4}+O(|D_{n}|^{-1/d})(\|\boldsymbol{t}_{1}\|+\|\boldsymbol{t}_{2}\|)\right),\qquad n\rightarrow\infty.$

Finally, by using (4.4), the fourth term is equal to

	$\displaystyle|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})\gamma_{2,\text{red}}(\boldsymbol{x}-\boldsymbol{y})\gamma_{2,\text{red}}(\boldsymbol{t}_{1}+\boldsymbol{x}-\boldsymbol{t}_{2}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}$
	$\displaystyle=|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})$
	$\displaystyle~{}~{}\times\int_{\mathbb{R}^{2d}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})e^{i(\boldsymbol{x}-\boldsymbol{y})^{\top}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{x}d\boldsymbol{y}.$

Since $\widetilde{f}\in L^{1}(\mathbb{R}^{d})$ due to Assumption 4.1(i), we can apply Fubini’s theorem and get

	$\displaystyle|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})$
	$\displaystyle~{}~{}\times\int_{\mathbb{R}^{2d}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})e^{i(\boldsymbol{x}-\boldsymbol{y})^{\top}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{x}d\boldsymbol{y}$
	$\displaystyle=|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})\left(\int_{\mathbb{R}^{d}}h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})e^{-i\boldsymbol{y}^{\top}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})}d\boldsymbol{y}\right)$
	$\displaystyle~{}~{}\times\left(\int_{\mathbb{R}^{d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})e^{i\boldsymbol{x}^{\top}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})}d\boldsymbol{x}\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}$
	$\displaystyle=\int_{\mathbb{R}^{2d}}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})\left(|D_{n}|^{-1}|H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})|^{2}+|D_{n}|^{-1}R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},$

where $R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})$ is defined as in (D.8).

Next, from (C.15), the Fejér kernel based on $h^{2}$ is

F_{h^{2},n}(\boldsymbol{\omega})=(2\pi)^{-d}H_{h,4}^{-1}|D_{n}|^{-1}|H_{h,2}^{(n)}(\boldsymbol{\omega})|^{2}.

(D.9)

Substituting this into the above, we have

	$\displaystyle|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}h(\frac{\boldsymbol{x}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}}{\boldsymbol{A}})h(\frac{\boldsymbol{x}+\boldsymbol{t}_{1}}{\boldsymbol{A}})h(\frac{\boldsymbol{y}+\boldsymbol{t}_{2}}{\boldsymbol{A}})$
	$\displaystyle~{}~{}\times\int_{\mathbb{R}^{2d}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})e^{i(\boldsymbol{x}-\boldsymbol{y})^{\top}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{x}d\boldsymbol{y}$
	$\displaystyle=\int_{\mathbb{R}^{2d}}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})\left((2\pi)^{d}H_{h,4}F_{h^{2},n}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})+|D_{n}|^{-1}R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$

Altogether, this proves the lemma. $\Box$
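The normalization in (D.9) makes $F_{h^{2},n}$ integrate to one, since Parseval's identity gives $\int_{\mathbb{R}^{d}}|H_{h,2}^{(n)}(\boldsymbol{\omega})|^{2}d\boldsymbol{\omega}=(2\pi)^{d}H_{h,4}|D_{n}|$ . The Python sketch below checks this numerically in $d=1$ . It assumes $H_{h,2}^{(n)}(\boldsymbol{\omega})=\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{2}e^{-i\boldsymbol{x}^{\top}\boldsymbol{\omega}}d\boldsymbol{x}$ , a hypothetical cosine taper (for which $H_{h,4}=3/8$ ), and $D_{n}=[-A/2,A/2]$ ; the grid sizes and the truncation of the frequency axis are likewise illustrative choices.

```python
import numpy as np

def fejer_kernel_h2(omega, A, n_grid=1000):
    """F_{h^2,n}(omega) = (2*pi)^{-1} H4^{-1} A^{-1} |H_{h,2}^{(n)}(omega)|^2 in d = 1
    for the hypothetical cosine taper h(u) = cos(pi*u), where H4 = 3/8."""
    dx = A / n_grid
    x = -A / 2 + (np.arange(n_grid) + 0.5) * dx
    h2 = np.cos(np.pi * x / A) ** 2
    # Rows index frequencies, columns index spatial grid points (midpoint rule in x).
    H = np.exp(-1j * np.outer(omega, x)) @ h2 * dx
    return np.abs(H) ** 2 / (2 * np.pi * (3.0 / 8.0) * A)

def total_mass(A, half_width=20.0, n_freq=2000):
    """Midpoint-rule integral of the kernel over [-half_width, half_width]."""
    d_omega = 2 * half_width / n_freq
    omega = -half_width + (np.arange(n_freq) + 0.5) * d_omega
    return float(np.sum(fejer_kernel_h2(omega, A)) * d_omega)

mass = total_mass(A=10.0)
print(f"integral of F over the truncated frequency axis: {mass:.6f}")
```

Because the kernel concentrates around the origin at scale $1/A$ , truncating the frequency axis at a moderate width loses very little mass for this smooth taper.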

Next, we bound the terms that are associated with $R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})$ in (D.8).

Lemma D.3.

Suppose that the assumptions of Theorem 4.1(ii) hold. Let $\widehat{\phi}_{M}$ and $\widetilde{f}$ be defined as in (A.4) and (D.7), respectively. Then,

	$\displaystyle|D_{n}|^{-1}\int_{B_{M}^{2}}d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}\widehat{\phi}_{M}(\boldsymbol{t}_{1})\widehat{\phi}_{M}(-\boldsymbol{t}_{2})\int_{\mathbb{R}^{2d}}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}$
	$\displaystyle~{}~{}=o(1),\quad n\rightarrow\infty.$		(D.10)

Proof. Recall (D.8). By using Theorem C.2(ii), the first term in the expression of $|D_{n}|^{-1/2}R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})$ is bounded by

C|H_{h,2}^{(n)}(-\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})|\sum_{i=0}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||h_{\boldsymbol{k}}|\rho(\{\|\boldsymbol{k}\|+1\}\|\boldsymbol{t}_{2}/\boldsymbol{A}\|)^{1/2}|c_{D_{n,i}(\boldsymbol{t}_{2})}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})|.

Substituting the above into (D.10), the first term in the expansion of (D.10) is bounded by

	$\displaystyle C\sum_{i=0}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}||h_{\boldsymbol{k}}|\int_{B_{M}}|\widehat{\phi}_{M}(\boldsymbol{t}_{1})|d\boldsymbol{t}_{1}\int_{B_{M}}|\widehat{\phi}_{M}(-\boldsymbol{t}_{2})|\rho\left((\|\boldsymbol{k}\|+1)\|\boldsymbol{t}_{2}/\boldsymbol{A}\|\right)^{1/2}d\boldsymbol{t}_{2}$
	$\displaystyle~{}~{}\times\int_{\mathbb{R}^{d}}|\widetilde{f}(\boldsymbol{\omega}_{2})|\int_{\mathbb{R}^{d}}\left|\widetilde{f}(\boldsymbol{\omega}_{1})c_{h^{2},n}(-\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})c_{D_{n,i}(\boldsymbol{t}_{2})}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})\right|d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$

Since $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}|\widetilde{f}(\boldsymbol{\omega})|<\infty$ and $\widetilde{f}\in L^{1}(\mathbb{R}^{d})$ due to Assumption 4.1, we can apply Lemma C.3(d) and get

\int_{\mathbb{R}^{d}}|\widetilde{f}(\boldsymbol{\omega}_{2})|\int_{\mathbb{R}^{d}}\left|\widetilde{f}(\boldsymbol{\omega}_{1})c_{h^{2},n}(-\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})c_{D_{n,i}(\boldsymbol{t}_{2})}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A})\right|d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}<\infty,

uniformly over $\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}$ . Thus, the above term is bounded by

\displaystyle C(m_{d}+1)\left(\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}|\right)\left(\int_{B_{M}}|\widehat{\phi}_{M}(\boldsymbol{t}_{1})|d\boldsymbol{t}_{1}\right)\left(\sum_{\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{k}}|\int_{B_{M}}|\widehat{\phi}_{M}(-\boldsymbol{t}_{2})|\rho\left((\|\boldsymbol{k}\|+1)\|\boldsymbol{t}_{2}/\boldsymbol{A}\|\right)^{1/2}d\boldsymbol{t}_{2}\right).

We first note that $|h_{\boldsymbol{k}}|\lim_{n\rightarrow\infty}\int_{B_{M}}|\widehat{\phi}_{M}(-\boldsymbol{t}_{2})|\rho(\{\|\boldsymbol{k}\|+1\}\|\boldsymbol{t}_{2}/\boldsymbol{A}\|)^{1/2}d\boldsymbol{t}_{2}=0$ for each $\boldsymbol{k}\in\mathbb{Z}^{d}$ due to the dominated convergence theorem. Moreover, since $\sum_{\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{k}}|\int_{B_{M}}|\widehat{\phi}_{M}(-\boldsymbol{t}_{2})|\rho(\{\|\boldsymbol{k}\|+1\}\|\boldsymbol{t}_{2}/\boldsymbol{A}\|)^{1/2}d\boldsymbol{t}_{2}<C\sum_{\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{k}}|<\infty$ , by applying the dominated convergence theorem again, we show that the above term is $o(1)$ as $n\rightarrow\infty$ .

Similarly, the second term in the decomposition of (D.10) is $o(1)$ as $n\rightarrow\infty$ .

Lastly, we bound the third term. By using Theorem C.2(ii) again, the third term in the decomposition of (D.10) is bounded by

	$\displaystyle C\sum_{p,q=0}^{m_{d}}\sum_{\boldsymbol{j}_{1},\boldsymbol{k}_{1},\boldsymbol{j}_{2},\boldsymbol{k}_{2}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}_{1}}||h_{\boldsymbol{k}_{1}}||h_{\boldsymbol{j}_{2}}||h_{\boldsymbol{k}_{2}}|\int_{B_{M}^{2}}d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}|\widehat{\phi}_{M}(\boldsymbol{t}_{1})||\widehat{\phi}_{M}(-\boldsymbol{t}_{2})|\rho\left((\|\boldsymbol{k}_{1}\|+1)\|\boldsymbol{t}_{1}/\boldsymbol{A}\|\right)^{1/2}$
	$\displaystyle~{}~{}\times\rho\left((\|\boldsymbol{k}_{2}\|+1)\|\boldsymbol{t}_{2}/\boldsymbol{A}\|\right)^{1/2}\int_{\mathbb{R}^{2d}}|\widetilde{f}(\boldsymbol{\omega}_{1})||\widetilde{f}(\boldsymbol{\omega}_{2})||c_{D_{n,p}(\boldsymbol{t}_{1})}(-2\pi(\boldsymbol{j}_{1}+\boldsymbol{k}_{1})/\boldsymbol{A}-(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}))|$
	$\displaystyle~{}~{}\times|c_{D_{n,q}(\boldsymbol{t}_{2})}(-2\pi(\boldsymbol{j}_{2}+\boldsymbol{k}_{2})/\boldsymbol{A}+(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}))|d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$

By using Lemma C.3(d) and $\widetilde{f}\in L^{1}(\mathbb{R}^{d})$ , the integral $\int_{\mathbb{R}^{2d}}(\cdots)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}$ is bounded above and the upper bound depends only on $\widetilde{f}$ . Therefore, the above is bounded by

C(m_{d}+1)^{2}\left(\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}|\right)^{2}\left(\sum_{\boldsymbol{k}\in\mathbb{Z}^{d}}|h_{\boldsymbol{k}}|\int_{B_{M}}|\widehat{\phi}_{M}(\boldsymbol{t})|\rho\left((\|\boldsymbol{k}\|+1)\|\boldsymbol{t}/\boldsymbol{A}\|\right)^{1/2}d\boldsymbol{t}\right)^{2}.

By using a dominated convergence argument similar to the one above, the third term is $o(1)$ as $n\rightarrow\infty$ .

Altogether, this proves the lemma. $\Box$

Now, we are ready to compute the limit of $A_{1}$ in Section A.3. Recall

A_{1}=|D_{n}|\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\omega}_{1})\phi_{M}(\boldsymbol{\omega}_{2})\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\mathrm{cov}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.

Theorem D.3.

Suppose that the assumptions of Theorem 4.1(ii) hold. Then,

\lim_{n\rightarrow\infty}A_{1}=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}f(\boldsymbol{\omega})^{2}\phi_{M}(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}.

Proof. First, by using (D.2), we have

	$\displaystyle A_{1}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{\mathbb{R}^{2d}}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}\phi_{M}(\boldsymbol{\omega}_{1})\phi_{M}(\boldsymbol{\omega}_{2})\int_{D_{n}^{2}}d\boldsymbol{x}d\boldsymbol{y}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})$
			$\displaystyle\times e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2})}C(\boldsymbol{x}-\boldsymbol{y})\int_{D_{n}^{2}}d\boldsymbol{u}d\boldsymbol{v}h(\boldsymbol{u}/\boldsymbol{A})h(\boldsymbol{v}/\boldsymbol{A})e^{i(\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{v}^{\top}\boldsymbol{\omega}_{2})}C(\boldsymbol{u}-\boldsymbol{v}).$

By using the Cauchy-Schwarz inequality, the above is absolutely integrable. Thus, we can interchange the order of integration and get

$\displaystyle A_{1}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{D_{n}^{2}}d\boldsymbol{x}d\boldsymbol{y}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})C(\boldsymbol{x}-\boldsymbol{y})\int_{D_{n}^{2}}d\boldsymbol{u}d\boldsymbol{v}h(\boldsymbol{u}/\boldsymbol{A})h(\boldsymbol{v}/\boldsymbol{A})C(\boldsymbol{u}-\boldsymbol{v})$
		$\displaystyle\times\int_{\mathbb{R}^{d}}\phi_{M}(\boldsymbol{\omega}_{1})e^{i(\boldsymbol{u}-\boldsymbol{x})^{\top}\boldsymbol{\omega}_{1}}d\boldsymbol{\omega}_{1}\int_{\mathbb{R}^{d}}\phi_{M}(\boldsymbol{\omega}_{2})e^{i(\boldsymbol{y}-\boldsymbol{v})^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{\omega}_{2}$
	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{D_{n}^{4}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})h(\boldsymbol{u}/\boldsymbol{A})h(\boldsymbol{v}/\boldsymbol{A})$
		$\displaystyle\times C(\boldsymbol{x}-\boldsymbol{y})C(\boldsymbol{u}-\boldsymbol{v})\widehat{\phi}_{M}(\boldsymbol{u}-\boldsymbol{x})\widehat{\phi}_{M}(\boldsymbol{y}-\boldsymbol{v})d\boldsymbol{x}d\boldsymbol{y}d\boldsymbol{u}d\boldsymbol{v}$
	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}\int_{D_{n}-D_{n}}\int_{D_{n}-D_{n}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})\widehat{\phi}_{M}(-\boldsymbol{t}_{2})T_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2})d\boldsymbol{t}_{1}d\boldsymbol{t}_{2},$

where $T_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2})$ is from (D.6). In the above, we use an inverse transform of (A.4) in the second identity and the change of variables $\boldsymbol{t}_{1}=\boldsymbol{u}-\boldsymbol{x}$ and $\boldsymbol{t}_{2}=\boldsymbol{v}-\boldsymbol{y}$ in the last identity.

Next, recall $\widetilde{f}$ in (D.7). Then, by Assumption 4.1(i), $\widetilde{f}\in L^{1}(\mathbb{R}^{d})$ and has the Fourier representation as in (4.4). We note that $B_{M}\subset D_{n}-D_{n}$ for large enough $n\in\mathbb{N}$ due to Assumption 3.1. Thus, by using this fact together with Lemma D.2, for large enough $n\in\mathbb{N}$ , we have $A_{1}=A_{11}+2A_{12}+A_{13}$ , where

	$\displaystyle A_{11}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}\lambda^{2}\int_{B_{M}^{2}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})\widehat{\phi}_{M}(-\boldsymbol{t}_{2})\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left(H_{h,4}+O(|D_{n}|^{-1/d})\|\boldsymbol{t}_{2}\|\right)d\boldsymbol{t}_{1}d\boldsymbol{t}_{2},$
	$\displaystyle A_{12}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}\lambda\int_{B_{M}^{2}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})\widehat{\phi}_{M}(-\boldsymbol{t}_{2})\gamma_{2,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})\left[H_{h,4}+O(|D_{n}|^{-1/d})(\|\boldsymbol{t}_{1}\|+\|\boldsymbol{t}_{2}\|)\right]d\boldsymbol{t}_{1}d\boldsymbol{t}_{2},$

as $n\rightarrow\infty$ , and

	$\displaystyle A_{13}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}\int_{B_{M}^{2}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})\widehat{\phi}_{M}(-\boldsymbol{t}_{2})\int_{\mathbb{R}^{2d}}e^{i(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})^{\top}\boldsymbol{\omega}_{2}}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})$
			$\displaystyle~{}~{}\times\left((2\pi)^{d}H_{h,4}F_{h^{2},n}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})+|D_{n}|^{-1}R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{t}_{1}d\boldsymbol{t}_{2},\quad n\rightarrow\infty,$

where $F_{h^{2},n}$ is the Fejér kernel based on $h^{2}$ defined as in (D.9) and $R_{n}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2})$ is the remainder term defined as in (D.8).

We evaluate each term above. The first term is

	$\displaystyle A_{11}$	$\displaystyle=(H_{h,4}/H_{h,2}^{2})R^{2}\int_{B_{M}}|\widehat{\phi}_{M}(\boldsymbol{t}_{1})|^{2}d\boldsymbol{t}_{1}+O(|D_{n}|^{-1/d})$		(D.11)
		$\displaystyle=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})R^{2}\int_{\mathbb{R}^{d}}|\phi_{M}(\boldsymbol{\omega})|^{2}d\boldsymbol{\omega}+o(1),\quad n\rightarrow\infty,$

where $R=(2\pi)^{-d}\lambda$ . Here, the last identity is due to the Plancherel theorem. By using (4.4) and (A.4) together with Fubini’s theorem, the second term is

		$\displaystyle A_{12}$		(D.12)
		$\displaystyle=(2\pi)^{-d}(H_{h,4}/H_{h,2}^{2})R\int_{B_{M}^{2}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})\widehat{\phi}_{M}(-\boldsymbol{t}_{2})\int_{\mathbb{R}^{d}}\widetilde{f}(\boldsymbol{\omega})e^{i\boldsymbol{\omega}^{\top}(\boldsymbol{t}_{1}-\boldsymbol{t}_{2})}d\boldsymbol{\omega}d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}+O(|D_{n}|^{-1/d})$
		$\displaystyle=(2\pi)^{-d}(H_{h,4}/H_{h,2}^{2})R\int_{\mathbb{R}^{d}}\widetilde{f}(\boldsymbol{\omega})\left(\int_{B_{M}}\widehat{\phi}_{M}(-\boldsymbol{t}_{2})e^{-i\boldsymbol{\omega}^{\top}\boldsymbol{t}_{2}}d\boldsymbol{t}_{2}\right)\left(\int_{B_{M}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})e^{i\boldsymbol{\omega}^{\top}\boldsymbol{t}_{1}}d\boldsymbol{t}_{1}\right)d\boldsymbol{\omega}+o(1)$
		$\displaystyle=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})R\int_{\mathbb{R}^{d}}\widetilde{f}(\boldsymbol{\omega})\phi_{M}(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}+o(1),\quad n\rightarrow\infty.$

Finally, by using Fubini’s theorem and Lemmas C.3(b) and D.3, the third term is

$\displaystyle A_{13}$	$\displaystyle=(2\pi)^{-d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{2d}}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}\widetilde{f}(\boldsymbol{\omega}_{1})\widetilde{f}(\boldsymbol{\omega}_{2})F_{h^{2},n}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})$	(D.13)
	$\displaystyle~{}~{}\times\int_{B_{M}}\widehat{\phi}_{M}(\boldsymbol{t}_{1})e^{i\boldsymbol{t}_{1}^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{t}_{1}\int_{B_{M}}\widehat{\phi}_{M}(-\boldsymbol{t}_{2})e^{-i\boldsymbol{t}_{2}^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{t}_{2}+o(1)$
	$\displaystyle=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}\widetilde{f}(\boldsymbol{\omega}_{2})\phi_{M}(-\boldsymbol{\omega}_{2})^{2}\int_{\mathbb{R}^{d}}\widetilde{f}(\boldsymbol{\omega}_{1})F_{h^{2},n}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}+o(1)$
	$\displaystyle=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}\widetilde{f}(\boldsymbol{\omega}_{2})^{2}\phi_{M}(\boldsymbol{\omega}_{2})^{2}d\boldsymbol{\omega}_{2}+o(1),\qquad n\rightarrow\infty.$

Combining (D.11)–(D.13), we conclude

	$\displaystyle\lim_{n\rightarrow\infty}A_{1}$	$\displaystyle=$	$\displaystyle(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}\left(\widetilde{f}(\boldsymbol{\omega})^{2}+2R\widetilde{f}(\boldsymbol{\omega})+R^{2}\right)\phi_{M}(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}$
		$\displaystyle=$	$\displaystyle(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{d}}f(\boldsymbol{\omega})^{2}\phi_{M}(\boldsymbol{\omega})^{2}d\boldsymbol{\omega}.$

Thus, we prove the theorem. $\Box$
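The constant $H_{h,4}/H_{h,2}^{2}$ in the limit of $A_{1}$ is at least one for every taper: since $[-1/2,1/2]^{d}$ has unit volume, the Cauchy-Schwarz inequality gives $H_{h,2}^{2}\leq H_{h,4}$ , with equality for a constant $h$ . The Python sketch below computes this constant in $d=1$ for the untapered case and for a hypothetical cosine taper; the taper and the quadrature grid are illustrative choices, not taken from the paper.

```python
import numpy as np

def taper_moment(h, k, n_grid=100_000):
    """Midpoint-rule approximation of H_{h,k} = int_{-1/2}^{1/2} h(u)^k du (d = 1)."""
    u = -0.5 + (np.arange(n_grid) + 0.5) / n_grid
    return float(np.mean(h(u) ** k))

def taper_constant(h):
    """The constant H_{h,4} / H_{h,2}^2 multiplying the limit of A_1."""
    return taper_moment(h, 4) / taper_moment(h, 2) ** 2

def no_taper(u):
    return np.ones_like(u)

def cosine_taper(u):
    # Hypothetical example taper h(u) = cos(pi*u) on [-1/2, 1/2].
    return np.cos(np.pi * u)

print(taper_constant(no_taper), taper_constant(cosine_taper))
```

For the cosine taper, $H_{h,2}=1/2$ and $H_{h,4}=3/8$ , so the constant is $3/2$ , compared with $1$ when no taper is applied.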

D.3 Bounds on the terms involving higher order cumulants

We give an expression of the complete fourth-order cumulant of the DFTs. Through a simple combinatorial argument, the fourth-order complete reduced cumulant, denoted $\kappa_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})$ , can be written as a sum of the 15 different reduced cumulant functions:

$\displaystyle\kappa_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})$	$\displaystyle=\gamma_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})+\bigg{[}\gamma_{3,\text{red}}(\boldsymbol{x},\boldsymbol{z})\delta(\boldsymbol{x}-\boldsymbol{y})+\gamma_{3,\text{red}}(\boldsymbol{x},\boldsymbol{y})\delta(\boldsymbol{x}-\boldsymbol{z})$	(D.14)
	$\displaystyle\qquad\qquad+\gamma_{3,\text{red}}(\boldsymbol{x},\boldsymbol{y})\delta(\boldsymbol{y}-\boldsymbol{z})+\gamma_{3,\text{red}}(\boldsymbol{x}-\boldsymbol{z},\boldsymbol{y}-\boldsymbol{z})\left\{\delta(\boldsymbol{x})+\delta(\boldsymbol{y})+\delta(\boldsymbol{z})\right\}\bigg{]}$
	$\displaystyle+\bigg{[}\gamma_{2,\text{red}}(\boldsymbol{x})\left(\delta(\boldsymbol{y})\delta(\boldsymbol{z})+\delta(\boldsymbol{x}-\boldsymbol{y})\delta(\boldsymbol{x}-\boldsymbol{z})+\delta(\boldsymbol{x}-\boldsymbol{y})\delta(\boldsymbol{z})+\delta(\boldsymbol{x}-\boldsymbol{z})\delta(\boldsymbol{y})\right)$
	$\displaystyle\qquad+\gamma_{2,\text{red}}(\boldsymbol{y})\left(\delta(\boldsymbol{x})\delta(\boldsymbol{z})+\delta(\boldsymbol{x})\delta(\boldsymbol{y}-\boldsymbol{z})\right)+\gamma_{2,\text{red}}(\boldsymbol{z})\delta(\boldsymbol{x})\delta(\boldsymbol{y})\bigg{]}$
	$\displaystyle+\lambda\delta(\boldsymbol{x})\delta(\boldsymbol{y})\delta(\boldsymbol{z}),\qquad\boldsymbol{x},\boldsymbol{y},\boldsymbol{z}\in\mathbb{R}^{d}.$
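One way to read the combinatorial argument behind (D.14), which is our interpretation rather than a statement from the text, is that each term corresponds to a partition of the four points into groups of coinciding locations: a partition with $k$ blocks contributes a term with $\gamma_{k,\text{red}}$ (or $\lambda$ when $k=1$ ) and $4-k$ delta functions. The Python sketch below enumerates the set partitions of $\{1,2,3,4\}$ and tallies them by the number of blocks, which can be compared with the grouping of terms in (D.14).

```python
from collections import Counter

def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new singleton block.
        yield [[first]] + partition

partitions = list(set_partitions([1, 2, 3, 4]))
blocks_count = Counter(len(p) for p in partitions)
print(len(partitions), dict(sorted(blocks_count.items())))
```

The tally is 15 partitions in total: one with four blocks, six with three blocks, seven with two blocks, and one with a single block. This matches the one $\gamma_{4,\text{red}}$ term, the six $\gamma_{3,\text{red}}$ terms, the seven $\gamma_{2,\text{red}}$ terms, and the single $\lambda$ term in (D.14).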

Lemma D.4.

Let $X$ be a fourth-order stationary point process on $\mathbb{R}^{d}$ and let $\kappa_{4,\text{red}}$ be the fourth-order complete reduced cumulant density function defined as in (D.14). Suppose that Assumption 3.2 holds for $\ell=4$ . Then, for $\boldsymbol{\omega}_{1},\dots,\boldsymbol{\omega}_{4}\in\mathbb{R}^{d}$ ,

	$\displaystyle\mathrm{cum}\left(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}),J_{h,n}(\boldsymbol{\omega}_{3}),J_{h,n}(\boldsymbol{\omega}_{4})\right)=(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-2}$
	$\displaystyle~{}~{}\times\int_{D_{n}^{4}}\left(\prod_{j=1}^{4}h(\boldsymbol{t}_{j}/\boldsymbol{A})\right)\exp(-i\sum_{j=1}^{4}\boldsymbol{t}_{j}^{\top}\boldsymbol{\omega}_{j})\kappa_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})\prod_{j=1}^{4}d\boldsymbol{t}_{j}.$

Proof. The proof is similar to that of Theorem D.1. We omit the details. $\Box$

Using the above expression, we calculate the limit of $A_{3}$ in Section A.3. Recall

A_{3}=|D_{n}|\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\omega}_{1})\phi_{M}(\boldsymbol{\omega}_{2})\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}),J_{h,n}(-\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.

Theorem D.4.

Suppose the same set of assumptions in Theorem 4.1(ii) holds. Then,

\lim_{n\rightarrow\infty}A_{3}=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})f_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3},

where $f_{4}$ is the fourth-order spectrum.

Proof. By using Lemma D.4 and Fubini’s theorem, $A_{3}$ can be written as

$\displaystyle A_{3}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{D_{n}^{4}}\left(\prod_{j=1}^{4}h(\boldsymbol{t}_{j}/\boldsymbol{A})\right)\kappa_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})$
		$\displaystyle\times\left(\int_{\mathbb{R}^{d}}\phi_{M}(\boldsymbol{\omega}_{1})e^{i(\boldsymbol{t}_{2}-\boldsymbol{t}_{1})^{\top}\boldsymbol{\omega}_{1}}d\boldsymbol{\omega}_{1}\right)\left(\int_{\mathbb{R}^{d}}\phi_{M}(\boldsymbol{\omega}_{2})e^{i(\boldsymbol{t}_{4}-\boldsymbol{t}_{3})^{\top}\boldsymbol{\omega}_{2}}d\boldsymbol{\omega}_{2}\right)d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}$
	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{D_{n}^{4}}\left(\prod_{j=1}^{4}h(\boldsymbol{t}_{j}/\boldsymbol{A})\right)\kappa_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})$
		$\displaystyle~{}~{}\times\widehat{\phi}_{M}(\boldsymbol{t}_{2}-\boldsymbol{t}_{1})\widehat{\phi}_{M}(\boldsymbol{t}_{4}-\boldsymbol{t}_{3})d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}.$

Let $\widetilde{\kappa}_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})=\kappa_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})-\lambda\delta(\boldsymbol{x})\delta(\boldsymbol{y})\delta(\boldsymbol{z})$ . Then, from (D.14) and (4.3), the inverse Fourier transform of $\widetilde{\kappa}_{4,\text{red}}$ is $\widetilde{f}_{4}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})=f_{4}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})-(2\pi)^{-3d}\lambda$ . By using a similar decomposition as in Lemma D.2, we have $A_{3}=A_{31}+A_{32}$ , where

$\displaystyle A_{31}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\lambda\int_{D_{n}^{4}}\left(\prod_{j=1}^{4}h(\boldsymbol{t}_{j}/\boldsymbol{A})\right)\delta(\boldsymbol{t}_{1}-\boldsymbol{t}_{4})\delta(\boldsymbol{t}_{2}-\boldsymbol{t}_{4})\delta(\boldsymbol{t}_{3}-\boldsymbol{t}_{4})$
		$\displaystyle~{}~{}\times\widehat{\phi}_{M}(\boldsymbol{t}_{2}-\boldsymbol{t}_{1})\widehat{\phi}_{M}(\boldsymbol{t}_{4}-\boldsymbol{t}_{3})d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}$
	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\lambda\widehat{\phi}_{M}(\textbf{0})^{2}\int_{D_{n}}h(\boldsymbol{t}/\boldsymbol{A})^{4}d\boldsymbol{t}=(2\pi)^{-2d}(H_{h,4}/H_{h,2}^{2})\lambda\widehat{\phi}_{M}(\textbf{0})^{2}$

and

$\displaystyle A_{32}$	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{D_{n}^{4}}\left(\prod_{j=1}^{4}h(\boldsymbol{t}_{j}/\boldsymbol{A})\right)\widetilde{\kappa}_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})$
		$\displaystyle~{}~{}\times\widehat{\phi}_{M}(\boldsymbol{t}_{2}-\boldsymbol{t}_{1})\widehat{\phi}_{M}(\boldsymbol{t}_{4}-\boldsymbol{t}_{3})d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}$
	$\displaystyle=$	$\displaystyle(2\pi)^{-2d}H_{h,2}^{-2}|D_{n}|^{-1}\int_{D_{n}-D_{n}}\int_{D_{n}-D_{n}}d\boldsymbol{u}d\boldsymbol{v}\widehat{\phi}_{M}(\boldsymbol{u})\widehat{\phi}_{M}(\boldsymbol{v})\int_{\mathbb{R}^{2d}}h(\boldsymbol{t}_{1}/\boldsymbol{A})h((\boldsymbol{t}_{1}+\boldsymbol{u})/\boldsymbol{A})$
		$\displaystyle~{}~{}\times h(\boldsymbol{t}_{3}/\boldsymbol{A})h((\boldsymbol{t}_{3}+\boldsymbol{v})/\boldsymbol{A})\widetilde{\kappa}_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{3}-\boldsymbol{v},\boldsymbol{t}_{1}-\boldsymbol{t}_{3}+\boldsymbol{u}-\boldsymbol{v},-\boldsymbol{v})d\boldsymbol{t}_{1}d\boldsymbol{t}_{3}.$

Here, we use the change of variables $\boldsymbol{u}=\boldsymbol{t}_{2}-\boldsymbol{t}_{1}$ and $\boldsymbol{v}=\boldsymbol{t}_{4}-\boldsymbol{t}_{3}$ in the second identity above. For $A_{31}$ , (A.3) gives $\widehat{\phi}_{M}(\textbf{0})^{2}=\widehat{\phi}(\textbf{0})^{2}=\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}$ . Therefore,

A_{31}=(2\pi)^{-2d}(H_{h,4}/H_{h,2}^{2})\lambda\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.

(D.15)

The expression for $\lim_{n\rightarrow\infty}A_{32}$ is derived in the same way as the expression for $\lim_{n\rightarrow\infty}A_{13}$ above. By using similar techniques as in Lemmas D.2 and D.3, it can be shown that for large $n\in\mathbb{N}$ ,

$\displaystyle A_{32}$	$\displaystyle=$	$\displaystyle(2\pi)^{-d}(H_{h,4}/H_{h,2}^{2})\int_{B_{M}^{2}}d\boldsymbol{u}d\boldsymbol{v}\widehat{\phi}_{M}(\boldsymbol{u})\widehat{\phi}_{M}(\boldsymbol{v})\int_{\mathbb{R}^{3d}}\widetilde{f}_{4}(\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{2},\boldsymbol{\lambda}_{3})F_{h^{2},n}(\boldsymbol{\lambda}_{1}+\boldsymbol{\lambda}_{2})$
		$\displaystyle~{}~{}\times\exp\left(i(-\boldsymbol{\lambda}_{1}^{\top}\boldsymbol{v}+\boldsymbol{\lambda}_{2}^{\top}(\boldsymbol{u}-\boldsymbol{v})-\boldsymbol{\lambda}_{3}^{\top}\boldsymbol{v})\right)d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{2}d\boldsymbol{\lambda}_{3}+o(1)$
	$\displaystyle=$	$\displaystyle(2\pi)^{-d}(H_{h,4}/H_{h,2}^{2})\int_{B_{M}^{2}}d\boldsymbol{u}d\boldsymbol{v}\widehat{\phi}_{M}(\boldsymbol{u})\widehat{\phi}_{M}(\boldsymbol{v})\int_{\mathbb{R}^{2d}}\widetilde{f}_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})e^{-i(\boldsymbol{\lambda}_{1}^{\top}\boldsymbol{u}+\boldsymbol{\lambda}_{3}^{\top}\boldsymbol{v})}d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}+o(1)$
	$\displaystyle=$	$\displaystyle(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{2d}}\widetilde{f}_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}+o(1),\quad n\rightarrow\infty.$

Here, we use Lemma C.3(b) in the second identity and Fubini’s theorem and (A.4) in the last identity. Therefore,

\lim_{n\rightarrow\infty}A_{32}=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})\widetilde{f}_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}.

(D.16)

Combining (D.15) and (D.16), we have

\lim_{n\rightarrow\infty}A_{3}=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})f_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}.

Thus, we obtain the limit of $A_{3}$ . $\Box$
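The final combination step can be spelled out as follows. This is only a sketch: it assumes, as the use of (A.3) above suggests, that $\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}=\int_{\mathbb{R}^{2d}}\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}$ , and it uses $(2\pi)^{-2d}=(2\pi)^{d}(2\pi)^{-3d}$ together with $\widetilde{f}_{4}=f_{4}-(2\pi)^{-3d}\lambda$ .

```latex
\begin{align*}
\lim_{n\rightarrow\infty}A_{3}
&=A_{31}+\lim_{n\rightarrow\infty}A_{32}\\
&=(2\pi)^{d}\frac{H_{h,4}}{H_{h,2}^{2}}\int_{\mathbb{R}^{2d}}
\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})
\left\{\widetilde{f}_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})
+(2\pi)^{-3d}\lambda\right\}d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}\\
&=(2\pi)^{d}\frac{H_{h,4}}{H_{h,2}^{2}}\int_{\mathbb{R}^{2d}}
\phi_{M}(\boldsymbol{\lambda}_{1})\phi_{M}(\boldsymbol{\lambda}_{3})
f_{4}(\boldsymbol{\lambda}_{1},-\boldsymbol{\lambda}_{1},\boldsymbol{\lambda}_{3})
d\boldsymbol{\lambda}_{1}d\boldsymbol{\lambda}_{3}.
\end{align*}
```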

To end this section, we obtain bounds for the general higher-order cumulant terms.

Lemma D.5.

Suppose that Assumptions 3.1 and 3.2 (for some $\ell\in\{2,3,\dots\}$ ) hold. Let

\mathcal{Z}_{j,n}=|D_{n}|^{-\alpha_{j}}\sum_{\boldsymbol{x}\in X\cap D_{n}}g_{j,n}(\boldsymbol{x}),\quad j\in\{1,\dots,\ell\},

where $\alpha_{j}\in[0,\infty)$ and $g_{j,n}(\boldsymbol{x})$ is a bounded function on $\mathbb{R}^{d}$ uniformly in $n\in\mathbb{N}$ and $\boldsymbol{x}\in\mathbb{R}^{d}$ . Then, we have

\big{|}\mathrm{cum}(\mathcal{Z}_{1,n},\dots,\mathcal{Z}_{j,n})\big{|}=O(|D_{n}|^{-\sum_{k=1}^{j}\alpha_{k}+1}),\qquad j\in\{2,\dots,\ell\}.

Proof. We will only show the above for $(j,\ell)=(4,4)$ under fourth-order stationarity. The general cases are treated similarly (see the statement after Assumption 3.2). Let $\alpha=\sum_{k=1}^{4}\alpha_{k}$ . By generalizing Lemma D.4, we have

\mathrm{cum}(\mathcal{Z}_{1,n},\dots,\mathcal{Z}_{4,n})=|D_{n}|^{-\alpha}\int_{D_{n}^{4}}\prod_{j=1}^{4}g_{j,n}(\boldsymbol{t}_{j})\times\kappa_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4},

where $\kappa_{4,\text{red}}(\cdot,\cdot,\cdot)$ is defined as in (D.14). Let $\sup_{n\in\mathbb{N}}\sup_{\boldsymbol{x}\in\mathbb{R}^{d}}|g_{j,n}(\boldsymbol{x})|<C_{j}<\infty$ , $j\in\{1,2,3,4\}$ , and let $C=\max\{C_{1},\dots,C_{4}\}$ . Then, we have

$\displaystyle|\mathrm{cum}(\mathcal{Z}_{1,n},\dots,\mathcal{Z}_{4,n})|$	$\displaystyle\leq$	$\displaystyle C^{4}|D_{n}|^{-\alpha}\int_{D_{n}^{4}}|\kappa_{4,\text{red}}(\boldsymbol{t}_{1}-\boldsymbol{t}_{4},\boldsymbol{t}_{2}-\boldsymbol{t}_{4},\boldsymbol{t}_{3}-\boldsymbol{t}_{4})|d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}$
	$\displaystyle\leq$	$\displaystyle C^{4}|D_{n}|^{-\alpha}\int_{D_{n}}\int_{D_{n}-\boldsymbol{t}_{4}}\int_{D_{n}-\boldsymbol{t}_{4}}\int_{D_{n}-\boldsymbol{t}_{4}}|\kappa_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})|d\boldsymbol{x}d\boldsymbol{y}d\boldsymbol{z}d\boldsymbol{t}_{4}$
	$\displaystyle\leq$	$\displaystyle C^{4}|D_{n}|^{-\alpha}\left(\int_{D_{n}}d\boldsymbol{t}_{4}\right)\left(\int_{\mathbb{R}^{3d}}|\kappa_{4,\text{red}}(\boldsymbol{x},\boldsymbol{y},\boldsymbol{z})|d\boldsymbol{x}d\boldsymbol{y}d\boldsymbol{z}\right)=O(|D_{n}|^{-\alpha+1}).$

Here, we use change of variables $\boldsymbol{x}=\boldsymbol{t}_{1}-\boldsymbol{t}_{4}$ , $\boldsymbol{y}=\boldsymbol{t}_{2}-\boldsymbol{t}_{4}$ , and $\boldsymbol{z}=\boldsymbol{t}_{3}-\boldsymbol{t}_{4}$ in the second inequality, and the last identity is due to absolute integrability of $\kappa_{4,\text{red}}(\cdot,\cdot,\cdot)$ under Assumption 3.2 for $\ell=4$ . Thus, we prove the lemma for $(j,\ell)=(4,4)$ under fourth-order stationarity. $\Box$
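As a sanity check on the rate in Lemma D.5, consider the simplest case of a homogeneous Poisson process with intensity $\lambda$ and $g_{j,n}\equiv 1$ (both choices are ours, made only for illustration). Each $\mathcal{Z}_{j,n}$ is then a rescaled count $N(D_{n})$ , and since every cumulant of a Poisson random variable equals its mean, the bound is attained in order:

```latex
\begin{align*}
\mathcal{Z}_{j,n}&=|D_{n}|^{-\alpha_{j}}N(D_{n}),\qquad N(D_{n})\sim\mathrm{Poisson}(\lambda|D_{n}|),\\
\mathrm{cum}(\mathcal{Z}_{1,n},\dots,\mathcal{Z}_{j,n})
&=|D_{n}|^{-\sum_{k=1}^{j}\alpha_{k}}\,\mathrm{cum}_{j}\big(N(D_{n})\big)
=\lambda|D_{n}|^{-\sum_{k=1}^{j}\alpha_{k}+1}.
\end{align*}
```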

Appendix E Asymptotic equivalence of the periodograms

In this section, we obtain some cumulant bounds that appear in the first and second moments of the difference $\widehat{I}_{h,n}(\boldsymbol{\omega})-I_{h,n}(\boldsymbol{\omega})$ . These bounds are mainly used to prove an asymptotic equivalence between the feasible and infeasible integrated periodograms (see Theorem A.1). However, the results derived in this section may also be of independent interest in periodogram-based methods for spatial point processes. Throughout the section, we let $C\in(0,\infty)$ be a generic constant that varies from line to line. Recall (2.13):

		$\displaystyle\widehat{I}_{h,n}(\boldsymbol{\omega})-I_{h,n}(\boldsymbol{\omega})$		(E.1)
		$\displaystyle~{}~{}=-(\widehat{\lambda}_{h,n}-\lambda)\left(c_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega})+c_{h,n}(-\boldsymbol{\omega})J_{h,n}(\boldsymbol{\omega})\right)+(\widehat{\lambda}_{h,n}-\lambda)^{2}|c_{h,n}(\boldsymbol{\omega})|^{2}$
		$\displaystyle~{}~{}=R_{1}(\boldsymbol{\omega})+R_{2}(\boldsymbol{\omega}).$

In the following lemma, we bound the cumulants that are made of $\widehat{\lambda}_{h,n}$ and

K_{h,n}(\boldsymbol{\omega})=c_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega})+c_{h,n}(-\boldsymbol{\omega})J_{h,n}(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(E.2)

Lemma E.1.

Suppose that Assumptions 3.2 (for $\ell=2$ ) and 3.4(i) hold. Then, we obtain the following two bounds:

(a)

For $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , $|\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}))|\leq C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})||c_{h^{2},n}(\boldsymbol{\omega})|+|D_{n}|^{-1/2}|c_{h,n}(\boldsymbol{\omega})|o(1)$ as $n\rightarrow\infty$ , where the $o(1)$ error is uniform over $\boldsymbol{\omega}\in\mathbb{R}^{d}$ .
(b)

$\mathrm{cum}(\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n})=\mathrm{var}(\widehat{\lambda}_{h,n})\leq C|D_{n}|^{-1}$ .

If we further assume that Assumption 3.2 holds for $\ell=4$ , then we have

(c)

For $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ , $|\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{1}),\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{2}))|\leq C|c_{h,n}(\boldsymbol{\omega}_{1})||c_{h,n}(\boldsymbol{\omega}_{2})||D_{n}|^{-2}$ .
(d)

$|\mathrm{cum}(\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n})|\leq C|D_{n}|^{-3}$ .
(e)

$\mathrm{cum}((\widehat{\lambda}_{h,n}-\lambda)^{2},(\widehat{\lambda}_{h,n}-\lambda)^{2})=\mathrm{var}((\widehat{\lambda}_{h,n}-\lambda)^{2})\leq C|D_{n}|^{-2}$ .

Proof. Recall $\widehat{\lambda}_{h,n}=H_{h,1}^{-1}|D_{n}|^{-1}\sum_{\boldsymbol{x}\in X\cap D_{n}}h(\boldsymbol{x}/\boldsymbol{A})$ . (b) and (d) are straightforward due to Lemma D.5. To show (e), we note that

\mathrm{cum}((\widehat{\lambda}_{h,n}-\lambda)^{2},(\widehat{\lambda}_{h,n}-\lambda)^{2})=2\mathrm{var}(\widehat{\lambda}_{h,n})^{2}+\mathrm{cum}(\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n}).

Thus, (e) follows from (b) and (d).
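The identity used for (e) can be checked numerically in a toy case. The sketch below assumes a homogeneous Poisson process with no taper, so that $\widehat{\lambda}=N/|D|$ with $N\sim\mathrm{Poisson}(\lambda|D|)$ ; this setting and all function names are ours and are not taken from the paper. It compares $\mathrm{var}((\widehat{\lambda}-\lambda)^{2})$ , computed directly from the Poisson probability mass function, with $2\mathrm{var}(\widehat{\lambda})^{2}+\mathrm{cum}_{4}(\widehat{\lambda})$ , computed from the known Poisson cumulants.

```python
import math

def poisson_central_moment(mu, r):
    """E[(N - mu)^r] for N ~ Poisson(mu), summing the pmf over a wide range."""
    kmax = int(mu + 40.0 * math.sqrt(mu) + 50)
    p = math.exp(-mu)                  # P(N = 0)
    total = (0 - mu) ** r * p
    for k in range(1, kmax + 1):
        p *= mu / k                    # P(N = k) obtained from P(N = k - 1)
        total += (k - mu) ** r * p
    return total

def var_of_squared_error(lam, area):
    """var((lambda_hat - lam)^2) for lambda_hat = N / area (toy, untapered case)."""
    mu = lam * area
    m2 = poisson_central_moment(mu, 2)
    m4 = poisson_central_moment(mu, 4)
    return (m4 - m2 ** 2) / area ** 4

def cumulant_formula(lam, area):
    """2 var(lambda_hat)^2 + cum_4(lambda_hat); all Poisson cumulants equal mu."""
    mu = lam * area
    var_hat = mu / area ** 2
    cum4_hat = mu / area ** 4
    return 2.0 * var_hat ** 2 + cum4_hat

lam = 2.0
for area in (5.0, 20.0, 80.0):
    lhs = var_of_squared_error(lam, area)
    rhs = cumulant_formula(lam, area)
    # area^2 * lhs stays bounded, in line with the C |D_n|^{-2} rate in (e)
    print(area, lhs, rhs, lhs * area ** 2)
```

In this toy case the rescaled quantity $|D|^{2}\mathrm{var}((\widehat{\lambda}-\lambda)^{2})$ equals $2\lambda^{2}+\lambda/|D|$ , which is bounded in $|D|$ .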

Next, we will show (a). We first note that $\widehat{\lambda}_{h,n}=(2\pi)^{d/2}H_{h,2}^{1/2}H_{h,1}^{-1}|D_{n}|^{-1/2}\mathcal{J}_{h,n}(\textbf{0})$ . Therefore, by using Theorem D.2,

	$\displaystyle\mathrm{cum}(\widehat{\lambda}_{h,n},c_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega}))$	$\displaystyle=$	$\displaystyle(2\pi)^{d/2}H_{h,2}^{1/2}H_{h,1}^{-1}|D_{n}|^{-1/2}c_{h,n}(\boldsymbol{\omega})\mathrm{cum}(J_{h,n}(\textbf{0}),J_{h,n}(-\boldsymbol{\omega}))$
		$\displaystyle=$	$\displaystyle C|D_{n}|^{-3/2}c_{h,n}(\boldsymbol{\omega})f(\textbf{0})H_{h,2}^{(n)}(-\boldsymbol{\omega})+|D_{n}|^{-1/2}c_{h,n}(\boldsymbol{\omega})o(1),~{}~{}n\rightarrow\infty,$

where $f$ is the spectral density and the $o(1)$ error above is uniform over $\boldsymbol{\omega}\in\mathbb{R}^{d}$ . Since $f(\textbf{0})<\infty$ under Assumption 3.2 (for $\ell=2$ ) and by using (2.10), we have

|\mathrm{cum}(\widehat{\lambda}_{h,n},c_{h,n}(\boldsymbol{\omega})J_{h,n}(-\boldsymbol{\omega}))|\leq C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})||c_{h^{2},n}(\boldsymbol{\omega})|+|D_{n}|^{-1/2}|c_{h,n}(\boldsymbol{\omega})|o(1),\quad n\rightarrow\infty.

Similarly, we have $|\mathrm{cum}(\widehat{\lambda}_{h,n},c_{h,n}(-\boldsymbol{\omega})J_{h,n}(\boldsymbol{\omega}))|\leq C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})||c_{h^{2},n}(\boldsymbol{\omega})|+|D_{n}|^{-1/2}|c_{h,n}(\boldsymbol{\omega})|o(1)$ as $n\rightarrow\infty$ . Thus, we show (a).

Lastly, (c) is straightforward due to Lemma D.5. All together, we get the desired results. $\Box$

Using the bounds derived above, we obtain the bounds for the first-order moments for $R_{1}(\boldsymbol{\omega})$ and $R_{2}(\boldsymbol{\omega})$ and the second-order moment for $R_{2}(\boldsymbol{\omega})$ .

Theorem E.1.

Suppose that Assumptions 3.2 (for $\ell=2$ ) and 3.4(i) hold. Let $R_{1}(\cdot)$ and $R_{2}(\cdot)$ be defined as in (E.1). Then, for $\boldsymbol{\omega}\in\mathbb{R}^{d}$ , we have

	$\displaystyle\left|\mathbb{E}[R_{1}(\boldsymbol{\omega})]\right|$	$\displaystyle\leq$	$\displaystyle C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})||c_{h^{2},n}(\boldsymbol{\omega})|+|D_{n}|^{-1/2}|c_{h,n}(\boldsymbol{\omega})|o(1)$		(E.3)
	$\displaystyle\text{and}\quad\left|\mathbb{E}[R_{2}(\boldsymbol{\omega})]\right|$	$\displaystyle\leq$	$\displaystyle C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})|^{2},\quad n\rightarrow\infty,$		(E.4)

where the $o(1)$ error above is uniform over $\boldsymbol{\omega}\in\mathbb{R}^{d}$ .

If we further assume that Assumption 3.2 holds for $\ell=4$ , then, for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ ,

|\mathrm{cov}(R_{2}(\boldsymbol{\omega}_{1}),R_{2}(\boldsymbol{\omega}_{2}))|\leq C|D_{n}|^{-2}|c_{h,n}(\boldsymbol{\omega}_{1})|^{2}|c_{h,n}(\boldsymbol{\omega}_{2})|^{2}.

(E.5)

Proof. To show (E.3), we use Lemma E.1(a) and get

\left|\mathbb{E}[R_{1}(\boldsymbol{\omega})]\right|=|\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}))|\leq C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})||c_{h^{2},n}(\boldsymbol{\omega})|+|D_{n}|^{-1/2}|c_{h,n}(\boldsymbol{\omega})|o(1)

as $n\rightarrow\infty$ . Thus, we show (E.3).

To show (E.4), by Lemma E.1(b),

|\mathbb{E}[R_{2}(\boldsymbol{\omega})]|=|c_{h,n}(\boldsymbol{\omega})|^{2}\mathrm{cum}(\widehat{\lambda}_{h,n},\widehat{\lambda}_{h,n})\leq C|D_{n}|^{-1}|c_{h,n}(\boldsymbol{\omega})|^{2},~{}~{}\boldsymbol{\omega}\in\mathbb{R}^{d}.

Thus, we show (E.4).

To show (E.5), by Lemma E.1(e), we have

	$\displaystyle\mathrm{cov}(R_{2}(\boldsymbol{\omega}_{1}),R_{2}(\boldsymbol{\omega}_{2}))$
	$\displaystyle~{}~{}=|c_{h,n}(\boldsymbol{\omega}_{1})|^{2}|c_{h,n}(\boldsymbol{\omega}_{2})|^{2}\mathrm{var}((\widehat{\lambda}_{h,n}-\lambda)^{2})\leq C|D_{n}|^{-2}|c_{h,n}(\boldsymbol{\omega}_{1})|^{2}|c_{h,n}(\boldsymbol{\omega}_{2})|^{2},~{}~{}\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}.$

Thus, we show (E.5). All together, we prove the theorem. $\Box$

Now, let

S_{i}=|D_{n}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})R_{i}(\boldsymbol{\omega})d\boldsymbol{\omega},\quad i\in\{1,2\},

where $D$ is a compact region in $\mathbb{R}^{d}$ and $\phi$ is a symmetric continuous function on $D$ . In the following theorem, we show that $S_{1}$ and $S_{2}$ are asymptotically negligible.

Theorem E.2.

Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), and 3.4(i) hold. Then,

S_{1},S_{2}\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}0,\quad n\rightarrow\infty,

where $\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}$ denotes convergence in $L_{2}$ .

Proof. We first calculate the expectations. By using (E.4) and Lemma C.3(d), we have

	$\displaystyle|\mathbb{E}[S_{2}]|$	$\displaystyle\leq|D_{n}|^{1/2}\int_{D}|\phi(\boldsymbol{\omega})||\mathbb{E}[R_{2}(\boldsymbol{\omega})]|d\boldsymbol{\omega}$
		$\displaystyle\leq C|D_{n}|^{-1/2}\int_{D}|\phi(\boldsymbol{\omega})||c_{h,n}(\boldsymbol{\omega})|^{2}d\boldsymbol{\omega}=O(|D_{n}|^{-1/2}),~{}~{}n\rightarrow\infty.$		(E.6)

Similarly, by using (E.3) and Lemma C.3(d),(e), we have

$\displaystyle|\mathbb{E}[S_{1}]|$	$\displaystyle\leq|D_{n}|^{1/2}\int_{D}|\phi(\boldsymbol{\omega})||\mathbb{E}[R_{1}(\boldsymbol{\omega})]|d\boldsymbol{\omega}$	(E.7)
	$\displaystyle\leq C|D_{n}|^{-1/2}\int_{D}|\phi(\boldsymbol{\omega})||c_{h,n}(\boldsymbol{\omega})||c_{h^{2},n}(\boldsymbol{\omega})|d\boldsymbol{\omega}+o(1)\int_{D}|\phi(\boldsymbol{\omega})||c_{h,n}(\boldsymbol{\omega})|d\boldsymbol{\omega}$
	$\displaystyle=o(1),~{}~{}n\rightarrow\infty.$

Next, we calculate the variances. By using (E.5) and Lemma C.3(d), $\mathrm{var}(S_{2})$ is bounded by

	$\displaystyle\mathrm{var}(S_{2})$	$\displaystyle\leq|D_{n}|\int_{D^{2}}|\phi(\boldsymbol{\omega}_{1})||\phi(\boldsymbol{\omega}_{2})||\mathrm{cov}(R_{2}(\boldsymbol{\omega}_{1}),R_{2}(\boldsymbol{\omega}_{2}))|d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}$
		$\displaystyle\leq C|D_{n}|^{-1}\left(\int_{D}|\phi(\boldsymbol{\omega}_{1})||c_{h,n}(\boldsymbol{\omega}_{1})|^{2}d\boldsymbol{\omega}_{1}\right)^{2}=O(|D_{n}|^{-1}),\quad n\rightarrow\infty.$		(E.8)

To bound $\mathrm{var}(S_{1})$ , we need more sophisticated calculations. By using indecomposable partitions, for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ , we have

	$\displaystyle\mathrm{cov}(R_{1}(\boldsymbol{\omega}_{1}),R_{1}(\boldsymbol{\omega}_{2}))$
	$\displaystyle=\mathrm{cum}((\widehat{\lambda}_{h,n}-\lambda)K_{h,n}(\boldsymbol{\omega}_{1}),(\widehat{\lambda}_{h,n}-\lambda)K_{h,n}(\boldsymbol{\omega}_{2}))$
	$\displaystyle=\mathrm{var}(\widehat{\lambda}_{h,n})\mathrm{cum}(K_{h,n}(\boldsymbol{\omega}_{1}),K_{h,n}(\boldsymbol{\omega}_{2}))+\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{1}))\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{2}))$
	$\displaystyle~{}~{}+\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{1}),\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{2})).$
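The last identity is an instance of the product formula for cumulants. As a sketch in generic notation (the symbols $X,Y,Z,W$ are ours), for mean-zero random variables the only indecomposable partitions of the table with rows $\{X,Y\}$ and $\{Z,W\}$ that contribute are those without singleton blocks, which gives

```latex
\begin{align*}
\mathrm{cov}(XY,ZW)
=\mathrm{cum}(X,Z)\,\mathrm{cum}(Y,W)
+\mathrm{cum}(X,W)\,\mathrm{cum}(Y,Z)
+\mathrm{cum}(X,Y,Z,W).
\end{align*}
```

The display above corresponds to $X=Z=\widehat{\lambda}_{h,n}-\lambda$ , $Y=K_{h,n}(\boldsymbol{\omega}_{1})$ , and $W=K_{h,n}(\boldsymbol{\omega}_{2})$ , using that cumulants of order two and higher are unchanged when the constant $\lambda$ is subtracted.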

Thus, we have $\mathrm{var}(S_{1})=L_{1}+L_{2}+L_{3}$ , where

$\displaystyle L_{1}$	$\displaystyle=$	$\displaystyle|D_{n}|\mathrm{var}(\widehat{\lambda}_{h,n})\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cum}(K_{h,n}(\boldsymbol{\omega}_{1}),K_{h,n}(\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},$
$\displaystyle L_{2}$	$\displaystyle=$	$\displaystyle|D_{n}|\left(\int_{D}\phi(\boldsymbol{\omega})\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}))d\boldsymbol{\omega}\right)^{2},~{}~{}\text{and}$
$\displaystyle L_{3}$	$\displaystyle=$	$\displaystyle|D_{n}|\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cum}(\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{1}),\widehat{\lambda}_{h,n},K_{h,n}(\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$

We will bound each term above. First, since $L_{2}=\mathbb{E}[S_{1}]^{2}$ , we have

L_{2}=|\mathbb{E}[S_{1}]|^{2}=o(1),~{}~{}n\rightarrow\infty.

(E.9)

By using Lemmas C.3(e) and E.1(c), $L_{3}$ is bounded by

L_{3}\leq C|D_{n}|^{-1}\left(\int_{D}|\phi(\boldsymbol{\omega})||c_{h,n}(\boldsymbol{\omega})|d\boldsymbol{\omega}\right)^{2}=O(|D_{n}|^{-1}),\quad n\rightarrow\infty.

(E.10)

To bound $L_{1}$ , we only focus on the $c_{h,n}(\boldsymbol{\omega}_{1})c_{h,n}(\boldsymbol{\omega}_{2})\mathrm{cum}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))$ term in the expansion of $\mathrm{cum}(K_{h,n}(\boldsymbol{\omega}_{1}),K_{h,n}(\boldsymbol{\omega}_{2}))$ ; the other three terms are treated similarly. By using Lemma E.1(b) and Theorem D.2, (a part of) $L_{1}$ is bounded by

C|D_{n}|^{-1}\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})c_{h,n}(\boldsymbol{\omega}_{1})c_{h,n}(\boldsymbol{\omega}_{2})\left(H_{h,2}^{(n)}(-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})+o(1)\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},\quad n\rightarrow\infty,

where the $o(1)$ error is uniform over $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$ . By using Lemma C.3(e), the second term above is $o(|D_{n}|^{-1})$ as $n\rightarrow\infty$ . Moreover, the first term is bounded by

	$\displaystyle C|D_{n}|^{-1/2}\int_{D^{2}}|\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})c_{h,n}(\boldsymbol{\omega}_{1})c_{h,n}(\boldsymbol{\omega}_{2})||c_{h^{2},n}(-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})|d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}$
	$\displaystyle\quad=C|D_{n}|^{-1/2}\int_{D}d\boldsymbol{\omega}_{1}|\phi(\boldsymbol{\omega}_{1})c_{h,n}(\boldsymbol{\omega}_{1})|\int_{D}|\phi(\boldsymbol{\omega}_{2})c_{h,n}(\boldsymbol{\omega}_{2})c_{h^{2},n}(-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})|d\boldsymbol{\omega}_{2}$
	$\displaystyle\quad\leq C|D_{n}|^{-1/2}\int_{D}d\boldsymbol{\omega}_{1}|\phi(\boldsymbol{\omega}_{1})c_{h,n}(\boldsymbol{\omega}_{1})|=O(|D_{n}|^{-1/2}),\quad n\rightarrow\infty.$

Here, the inequality is due to Lemma C.3(d) and the second identity is due to Lemma C.3(e). All together, we conclude that

L_{1}=o(1),\qquad n\rightarrow\infty.

(E.11)

Combining (E.9), (E.10), and (E.11), we conclude

\mathrm{var}(S_{1})\leq L_{1}+L_{2}+L_{3}=o(1),\qquad n\rightarrow\infty.

(E.12)

Combining (E.7) and (E.12), we have $S_{1}\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}0$ as $n\rightarrow\infty$ and (E.6) and (E.8) yield $S_{2}\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}0$ as $n\rightarrow\infty$ . Thus, we get the desired results. $\Box$

Lastly, recall the theoretical counterpart of the kernel spectral density estimator $\widetilde{f}_{n,b}(\boldsymbol{\omega})$ in (B.6). As a consequence of the above theorem, we obtain the probabilistic bound for the difference $\widetilde{f}_{n,b}(\boldsymbol{\omega})-\widehat{f}_{n,b}(\boldsymbol{\omega})$ .

Corollary E.1.

Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), and 3.4(i) hold. Suppose further that the bandwidth $b=b(n)$ satisfies $\lim_{n\rightarrow\infty}b(n)=0$ . Then,

\sqrt{|D_{n}|b^{d}}(\widetilde{f}_{n,b}(\boldsymbol{\omega})-\widehat{f}_{n,b}(\boldsymbol{\omega}))\stackrel{{\scriptstyle L_{2}}}{{\rightarrow}}0,\quad n\rightarrow\infty.

(E.13)

Proof. Let $\phi_{b}(\boldsymbol{x})=W_{b}(\boldsymbol{\omega}-\boldsymbol{x})=b^{-d}W(b^{-1}(\boldsymbol{\omega}-\boldsymbol{x}))$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ . Then, $\sqrt{|D_{n}|b^{d}}(\widehat{f}_{n,b}(\boldsymbol{\omega})-\widetilde{f}_{n,b}(\boldsymbol{\omega}))=Q_{1}+Q_{2}$ , where

Q_{i}=\sqrt{|D_{n}|b^{d}}\int_{\mathbb{R}^{d}}\phi_{b}(\boldsymbol{x})R_{i}(\boldsymbol{x})d\boldsymbol{x},\quad i\in\{1,2\}.

By simple calculation, we have

\int_{\mathbb{R}^{d}}\phi_{b}(\boldsymbol{x})d\boldsymbol{x}=\int_{\mathbb{R}^{d}}W(\boldsymbol{x})d\boldsymbol{x}=1\quad\text{and}\quad\int_{\mathbb{R}^{d}}\phi_{b}(\boldsymbol{x})^{2}d\boldsymbol{x}=b^{-d}\int_{\mathbb{R}^{d}}W(\boldsymbol{x})^{2}d\boldsymbol{x}=Cb^{-d}.
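The two scaling identities can be illustrated numerically. The sketch below works in dimension $d=1$ and takes $W$ to be the Epanechnikov kernel, for which $\int W^{2}=0.6$ ; the kernel choice, the value of $\boldsymbol{\omega}$ , and the helper names are assumptions made for illustration and are not specified in this section.

```python
def epanechnikov(x):
    """Example kernel W supported on [-1, 1] with unit integral (an assumption)."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

def phi_b(x, omega, b):
    """phi_b(x) = b^{-1} W((omega - x) / b) in dimension d = 1."""
    return epanechnikov((omega - x) / b) / b

def integrate(f, lo, hi, n=20000):
    """Midpoint rule on [lo, hi] with n cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

omega = 0.3
for b in (0.5, 0.1, 0.02):
    lo, hi = omega - b, omega + b          # support of phi_b
    mass = integrate(lambda x: phi_b(x, omega, b), lo, hi)
    energy = integrate(lambda x: phi_b(x, omega, b) ** 2, lo, hi)
    # mass stays near 1, while b * energy stays near int W^2 = 0.6
    print(b, mass, b * energy)
```

The integral of $\phi_{b}$ does not depend on $b$ , whereas the integral of $\phi_{b}^{2}$ grows like $b^{-d}$ , which is what produces the $b^{d/2}$ factors in the bounds that follow.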

Therefore, by using similar techniques to bound the first- and second-order moments of $S_{2}$ in Theorem E.2, we have

\displaystyle\mathbb{E}[Q_{2}]=O(|D_{n}|^{-1/2}b^{d/2})=o(1)\quad\text{and}\quad\mathrm{var}(Q_{2})=O(|D_{n}|^{-1}b^{d})=o(1),~{}~{}n\rightarrow\infty.

To bound the expectation of $Q_{1}$ , by using a similar argument as in (E.7), we have

	$\displaystyle\mathbb{E}[Q_{1}]$	$\displaystyle\leq$	$\displaystyle C|D_{n}|^{1/2}b^{d/2}\left(|D_{n}|^{-1}O(1)+|D_{n}|^{-1/2}o(1)\int_{\mathbb{R}^{d}}|\phi_{b}(\boldsymbol{x})||c_{h,n}(\boldsymbol{x})|d\boldsymbol{x}\right)$
		$\displaystyle\leq$	$\displaystyle C|D_{n}|^{-1/2}b^{d/2}+o(1)=o(1),\quad n\rightarrow\infty.$

Here, we use a modification of Lemma C.3(e)

b^{d/2}\int_{\mathbb{R}^{d}}|\phi_{b}(\boldsymbol{x})||c_{h,n}(\boldsymbol{x})|d\boldsymbol{x}\leq C\left(b^{d}\int\phi_{b}(\boldsymbol{x})^{2}d\boldsymbol{x}\right)^{1/2}=O(1),\quad n\rightarrow\infty

in the second inequality above. Similarly, by using an expansion of $\mathrm{var}(S_{1})$ as in the proof of Theorem E.2, one can easily see that $\mathrm{var}(Q_{1})=o(1)$ as $n\rightarrow\infty$ . Therefore, both $Q_{1}$ and $Q_{2}$ converge to zero in $L_{2}$ as $n\rightarrow\infty$ . Thus, we prove the corollary. $\Box$

Appendix F Verification of the $\alpha$ -mixing CLT for the integrated periodogram

In this section, we provide more details on the CLT for $\widetilde{G}_{h,n}(\phi)$ defined as in (A.9). Without loss of generality, we assume that $|D_{n}|$ grows proportionally to $n^{d}$ as $n\rightarrow\infty$ . Thus, $A_{1},\cdots,A_{d}$ increase with order $n$ . Next, let $\beta,\gamma\in(0,1)$ be chosen such that $2d/\varepsilon<\beta<\gamma<1$ , where $\varepsilon>2d$ is from Assumption 3.3(ii). For $n\in\mathbb{N}$ , let

A_{n}=\{\boldsymbol{k}:\boldsymbol{k}\in n^{\gamma}\mathbb{Z}^{d}~{}~{}\text{and}~{}~{}D_{n}^{(\boldsymbol{k})}=\boldsymbol{k}+[-(n^{\gamma}-n^{\beta})/2,(n^{\gamma}-n^{\beta})/2]^{d}\subset D_{n}\}.

Thus, $\widetilde{D}_{n}=\bigcup_{\boldsymbol{k}\in A_{n}}D_{n}^{(\boldsymbol{k})}$ is a disjoint union that is included in $D_{n}$ . Let $k_{n}=|A_{n}|$ and let $E_{n}=D_{n}\backslash\widetilde{D}_{n}$ . Then, it can be easily seen that

\lim_{n\rightarrow\infty}\frac{|\widetilde{D}_{n}|}{|D_{n}|}=\lim_{n\rightarrow\infty}\left(1-\frac{|E_{n}|}{|D_{n}|}\right)=1.

(F.1)
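The limit (F.1) can be illustrated with a small computation. The sketch below assumes that $D_{n}$ is the cube $[-n/2,n/2]^{d}$ and uses $\gamma=0.8$ and $\beta=0.4$ ; these are illustrative choices of ours (in particular, whether they satisfy $2d/\varepsilon<\beta$ depends on $\varepsilon$ ), and the function name is hypothetical.

```python
import math

def block_volume_ratio(n, gamma, beta, d=1):
    """|D~_n| / |D_n| for D_n = [-n/2, n/2]^d and the blocks D_n^{(k)} above."""
    spacing = n ** gamma                   # block centres lie on n^gamma Z^d
    side = n ** gamma - n ** beta          # side length of each block
    # largest integer m with m * spacing + side / 2 <= n / 2
    m_max = math.floor((n - side) / (2.0 * spacing))
    per_axis = 2 * m_max + 1               # centres -m_max, ..., m_max per axis
    k_n = per_axis ** d                    # number of blocks, k_n = |A_n|
    return k_n * side ** d / n ** d

for n in (1e3, 1e6, 1e9, 1e12):
    # the ratio approaches 1 as n grows, as stated in (F.1)
    print(n, block_volume_ratio(n, gamma=0.8, beta=0.4, d=2))
```

The corridors of width $n^{\beta}$ between neighbouring blocks make the blocks well separated while occupying a vanishing fraction of $D_{n}$ .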

F.1 Linearization of the integrated periodogram

We now decompose the DFT using sub-blocks. Let

	$\displaystyle\mathcal{J}_{h,n}^{1}(\boldsymbol{\omega})$	$\displaystyle=$	$\displaystyle(2\pi)^{-d/2}H_{h,2}^{-1/2}|\widetilde{D}_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap\widetilde{D}_{n}}h(\boldsymbol{x}/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})$
	$\displaystyle\text{and}\quad\mathcal{J}_{h,n}^{2}(\boldsymbol{\omega})$	$\displaystyle=$	$\displaystyle(2\pi)^{-d/2}H_{h,2}^{-1/2}|E_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap E_{n}}h(\boldsymbol{x}/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}).$

Let $J_{h,n}^{i}(\boldsymbol{\omega})=\mathcal{J}_{h,n}^{i}(\boldsymbol{\omega})-\mathbb{E}[\mathcal{J}_{h,n}^{i}(\boldsymbol{\omega})]$ , $i\in\{1,2\}$ , be the centered DFTs. Then, since $D_{n}$ is a disjoint union of $\widetilde{D}_{n}$ and $E_{n}$ , we have

J_{h,n}(\boldsymbol{\omega})=\frac{|\widetilde{D}_{n}|^{1/2}}{|D_{n}|^{1/2}}J_{h,n}^{1}(\boldsymbol{\omega})+\frac{|E_{n}|^{1/2}}{|D_{n}|^{1/2}}J_{h,n}^{2}(\boldsymbol{\omega}),\quad n\in\mathbb{N}.

(F.2)

Furthermore, since $\widetilde{D}_{n}=\bigcup_{\boldsymbol{k}\in A_{n}}D_{n}^{(\boldsymbol{k})}$ is a disjoint union and $|\widetilde{D}_{n}|=k_{n}|D_{n}^{(\boldsymbol{k})}|$ , $J_{h,n}^{1}(\boldsymbol{\omega})$ can be written as

J_{h,n}^{1}(\boldsymbol{\omega})=k_{n}^{-1/2}\sum_{\boldsymbol{k}\in A_{n}}J_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})=k_{n}^{-1/2}\sum_{\boldsymbol{k}\in A_{n}}\left(\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})-\mathbb{E}[\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})]\right),

(F.3)

where

\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}^{(\boldsymbol{k})}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}^{(\boldsymbol{k})}}h(\boldsymbol{x}/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{k}\in A_{n}.

(F.4)
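The decompositions (F.2) and (F.3) are linear identities, so they can be checked on a simulated pattern. The sketch below is a toy version under our own assumptions: dimension $d=1$ , a constant taper $h\equiv 1$ (so that $H_{h,2}=1$ ), uniformly scattered points, and fixed stand-ins for $n^{\gamma}$ and $n^{\gamma}-n^{\beta}$ . It verifies the identities for the uncentered DFTs; the centered versions then follow by subtracting expectations.

```python
import cmath
import math
import random

def dft(points, volume, omega):
    """Uncentred 1-d DFT with constant taper h = 1, so H_{h,2} = 1 (assumption)."""
    total = sum(cmath.exp(-1j * x * omega) for x in points)
    return total / math.sqrt(2.0 * math.pi * volume)

random.seed(0)
A = 100.0                                    # D_n = [-A/2, A/2]
spacing, side = 20.0, 16.0                   # stand-ins for n^gamma, n^gamma - n^beta
centres = [m * spacing for m in range(-2, 3)]
points = [random.uniform(-A / 2, A / 2) for _ in range(150)]

# points falling in each block D_n^{(k)}, in their union, and in the remainder E_n
blocks = [[x for x in points if abs(x - c) <= side / 2] for c in centres]
in_blocks = [x for blk in blocks for x in blk]
in_gaps = [x for x in points if all(abs(x - c) > side / 2 for c in centres)]

vol_tilde = len(centres) * side              # |D~_n| = k_n |D_n^{(k)}|
vol_gap = A - vol_tilde                      # |E_n|
omega = 0.7

J = dft(points, A, omega)
J1 = dft(in_blocks, vol_tilde, omega)
J2 = dft(in_gaps, vol_gap, omega)
rhs_f2 = math.sqrt(vol_tilde / A) * J1 + math.sqrt(vol_gap / A) * J2
rhs_f3 = sum(dft(blk, side, omega) for blk in blocks) / math.sqrt(len(centres))

# both differences should be at rounding-error level
print(abs(J - rhs_f2), abs(J1 - rhs_f3))
```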

Recall $\widetilde{G}_{h,n}(\phi)$ and $V_{h,n}(\phi)$ from (A.9) and (A.10), respectively. Our aim in this section is to show that $\widetilde{G}_{h,n}(\phi)-V_{h,n}(\phi)$ is asymptotically negligible. To do so, we introduce an intermediate statistic. For $n\in\mathbb{N}$ , let

S_{h,n}(\phi)=|\widetilde{D}_{n}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(|J^{1}_{h,n}(\boldsymbol{\omega})|^{2}-\mathbb{E}[|J^{1}_{h,n}(\boldsymbol{\omega})|^{2}]\right)d\boldsymbol{\omega}.

(F.5)

In the next two theorems, we will show that $\widetilde{G}_{h,n}(\phi)-S_{h,n}(\phi)$ and $S_{h,n}(\phi)-V_{h,n}(\phi)$ are asymptotically negligible.

Theorem F.1.

Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), and 4.1 hold. Suppose further that the data taper $h$ is either constant on $[-1/2,1/2]^{d}$ or satisfies Assumption 3.4 (for $m=d+1$ ). Then,

\widetilde{G}_{h,n}(\phi)-S_{h,n}(\phi)=o_{p}(1),\quad n\rightarrow\infty.

Proof. Since both $\widetilde{G}_{h,n}(\phi)$ and $S_{h,n}(\phi)$ are centered, it suffices to show that $\lim_{n\rightarrow\infty}\mathrm{var}(\widetilde{G}_{h,n}(\phi)-S_{h,n}(\phi))=0$ . Let $\widetilde{J}_{h,n}^{1}(\boldsymbol{\omega})=\frac{|\widetilde{D}_{n}|^{1/2}}{|D_{n}|^{1/2}}J_{h,n}^{1}(\boldsymbol{\omega})$ and $\widetilde{J}_{h,n}^{2}(\boldsymbol{\omega})=\frac{|E_{n}|^{1/2}}{|D_{n}|^{1/2}}J_{h,n}^{2}(\boldsymbol{\omega})$ . Then, we have

	$\displaystyle\mathrm{var}\left(\widetilde{G}_{h,n}(\phi)-\frac{|\widetilde{D}_{n}|^{1/2}}{|D_{n}|^{1/2}}S_{h,n}(\phi)\right)=|D_{n}|\mathrm{var}\left(\int_{D}\phi(\boldsymbol{\omega})\left(|J_{h,n}(\boldsymbol{\omega})|^{2}-|\widetilde{J}_{h,n}^{1}(\boldsymbol{\omega})|^{2}\right)d\boldsymbol{\omega}\right)$
	$\displaystyle~{}=|D_{n}|\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cov}\left(|J_{h,n}(\boldsymbol{\omega}_{1})|^{2}-|\widetilde{J}_{h,n}^{1}(\boldsymbol{\omega}_{1})|^{2},|J_{h,n}(\boldsymbol{\omega}_{2})|^{2}-|\widetilde{J}_{h,n}^{1}(\boldsymbol{\omega}_{2})|^{2}\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.$		(F.6)

From (F.2), we have

\begin{aligned}
|J_{h,n}(\boldsymbol{\omega})|^{2}-|\widetilde{J}_{h,n}^{1}(\boldsymbol{\omega})|^{2}&=J_{h,n}(\boldsymbol{\omega})\left(J_{h,n}(-\boldsymbol{\omega})-\widetilde{J}_{h,n}^{1}(-\boldsymbol{\omega})\right)+\left(J_{h,n}(\boldsymbol{\omega})-\widetilde{J}_{h,n}^{1}(\boldsymbol{\omega})\right)\widetilde{J}_{h,n}^{1}(-\boldsymbol{\omega})\\
&=J_{h,n}(\boldsymbol{\omega})\widetilde{J}_{h,n}^{2}(-\boldsymbol{\omega})+\widetilde{J}_{h,n}^{2}(\boldsymbol{\omega})\widetilde{J}_{h,n}^{1}(-\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.
\end{aligned}
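The decomposition above is the elementary identity $|a|^{2}-|b|^{2}=a(\bar{a}-\bar{b})+(a-b)\bar{b}$ applied with $a=J_{h,n}(\boldsymbol{\omega})$ and $b=\widetilde{J}^{1}_{h,n}(\boldsymbol{\omega})$, assuming $J_{h,n}(-\boldsymbol{\omega})=\overline{J_{h,n}(\boldsymbol{\omega})}$ (as holds for a real-valued taper). A minimal numerical sketch in Python, where `a` and `b` are arbitrary illustrative complex numbers rather than quantities from the paper:

```python
def split_difference(a: complex, b: complex) -> complex:
    """Return a*(conj(a) - conj(b)) + (a - b)*conj(b)."""
    return a * (a.conjugate() - b.conjugate()) + (a - b) * b.conjugate()


def squared_modulus(z: complex) -> float:
    """|z|^2 computed from real and imaginary parts."""
    return z.real ** 2 + z.imag ** 2


# Arbitrary illustrative values standing in for J(omega) and J~^1(omega).
a, b = 1 + 2j, 0.5 - 1j
lhs = squared_modulus(a) - squared_modulus(b)  # |a|^2 - |b|^2
rhs = split_difference(a, b)                   # two-term decomposition
print(lhs, rhs)
```

The imaginary parts of the two summands cancel, so the right-hand side is real even though each summand is complex in general.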

Therefore, the variance term (F.6) can be decomposed into four terms. We will only focus on the term

|D_{n}|\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cov}\left(J_{h,n}(\boldsymbol{\omega}_{1})\widetilde{J}_{h,n}^{2}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2})\widetilde{J}_{h,n}^{2}(-\boldsymbol{\omega}_{2})\right)d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}

(F.7)

and the other three terms can be treated similarly. Using indecomposable partitions, the above term equals $B_{1}+B_{2}+B_{3}$, where

\begin{aligned}
B_{1}&=|D_{n}|\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))\mathrm{cum}(\widetilde{J}^{2}_{h,n}(-\boldsymbol{\omega}_{1}),\widetilde{J}^{2}_{h,n}(\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},\\
B_{2}&=|D_{n}|\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),\widetilde{J}^{2}_{h,n}(\boldsymbol{\omega}_{2}))\mathrm{cum}(\widetilde{J}^{2}_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2},\\
B_{3}&=|D_{n}|\int_{D^{2}}\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}),\widetilde{J}^{2}_{h,n}(-\boldsymbol{\omega}_{1}),\widetilde{J}^{2}_{h,n}(\boldsymbol{\omega}_{2}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}.
\end{aligned}

By using techniques similar to those in the proof of Lemma D.5, we have

\left|\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(-\boldsymbol{\omega}_{2}),\widetilde{J}^{2}_{h,n}(-\boldsymbol{\omega}_{1}),\widetilde{J}^{2}_{h,n}(\boldsymbol{\omega}_{2}))\right|\leq C\frac{|E_{n}|}{|D_{n}|^{2}},\quad\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}.

Therefore,

|B_{3}|\leq C\frac{|E_{n}|}{|D_{n}|}\left(\int_{D}|\phi(\boldsymbol{\omega})|d\boldsymbol{\omega}\right)^{2}=o(1),\quad n\rightarrow\infty.

Here, the second identity is due to (F.1). Now, let

h_{E_{n}}(\boldsymbol{x}/\boldsymbol{A})=h(\boldsymbol{x}/\boldsymbol{A})I(\boldsymbol{x}\in E_{n}),\quad n\in\mathbb{N},\quad\boldsymbol{x}\in D_{n}

be a truncated taper function. Then, we have

\frac{|E_{n}|^{1/2}}{|D_{n}|^{1/2}}\mathcal{J}^{2}_{h,n}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}}h_{E_{n}}(\boldsymbol{x}/\boldsymbol{A})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

Therefore, $\widetilde{J}^{2}_{h,n}(\boldsymbol{\omega})$ can be viewed as a (centered) DFT on $D_{n}$ with taper function $h_{E_{n}}$. Thus, by modifying the argument used for Theorem 4.1(ii), we have

|B_{1}|\leq C|D_{n}|^{-1}\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{2}h_{E_{n}}(\boldsymbol{x}/\boldsymbol{A})^{2}d\boldsymbol{x}=C|D_{n}|^{-1}\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{4}I(\boldsymbol{x}\in E_{n})d\boldsymbol{x}\leq C^{\prime}\frac{|E_{n}|}{|D_{n}|}.

Therefore, by using (F.1), we conclude $B_{1}=o(1)$ as $n\rightarrow\infty$. Similarly, we can show $B_{2}=o(1)$ as $n\rightarrow\infty$. Altogether, the term in (F.7) converges to zero as $n\rightarrow\infty$. By using similar techniques, we can bound the other three terms in the decomposition of (F.6). Therefore, we have

\mathrm{var}\left(\widetilde{G}_{h,n}(\phi)-\frac{|\widetilde{D}_{n}|^{1/2}}{|D_{n}|^{1/2}}S_{h,n}(\phi)\right)=o(1),\quad n\rightarrow\infty.

Lastly, since $\mathrm{var}(\widetilde{G}_{h,n}(\phi))$ is bounded in $n$, so is $\mathrm{var}(S_{h,n}(\phi))$. Therefore, by using (F.1), we obtain $\lim_{n\rightarrow\infty}\mathrm{var}(\widetilde{G}_{h,n}(\phi)-S_{h,n}(\phi))=0$. Thus, we get the desired result. $\Box$

Now, we study the asymptotic equivalence of $S_{h,n}(\phi)$ and $V_{h,n}(\phi)$. Recall (F.3) and (F.5). We have

\begin{aligned}
S_{h,n}(\phi)&=|\widetilde{D}_{n}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(|J^{1}_{h,n}(\boldsymbol{\omega})|^{2}-\mathbb{E}[|J^{1}_{h,n}(\boldsymbol{\omega})|^{2}]\right)d\boldsymbol{\omega}\\
&=k_{n}^{-1/2}\sum_{\boldsymbol{j},\boldsymbol{k}\in A_{n}}|D_{n}^{(\boldsymbol{k})}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega})J^{(\boldsymbol{k})}_{h,n}(-\boldsymbol{\omega})-\mathbb{E}[J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega})J^{(\boldsymbol{k})}_{h,n}(-\boldsymbol{\omega})]\right)d\boldsymbol{\omega}\\
&=V_{h,n}(\phi)+T_{h,n}(\phi),
\end{aligned}

where

T_{h,n}(\phi)=k_{n}^{-1/2}\sum_{\boldsymbol{j}\neq\boldsymbol{k}\in A_{n}}|D_{n}^{(\boldsymbol{k})}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega})J^{(\boldsymbol{k})}_{h,n}(-\boldsymbol{\omega})-\mathbb{E}[J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega})J^{(\boldsymbol{k})}_{h,n}(-\boldsymbol{\omega})]\right)d\boldsymbol{\omega}.

(F.8)

To show that $T_{h,n}(\phi)$ is asymptotically negligible, we require the covariance inequality below. For $\boldsymbol{j},\boldsymbol{k}\in n^{\gamma}\mathbb{Z}$ , let

k(\boldsymbol{j},\boldsymbol{k})=\|\boldsymbol{j}-\boldsymbol{k}\|_{\infty}-n^{\gamma}+n^{\beta}\in\{n^{\beta},n^{\beta}+n^{\gamma},n^{\beta}+2n^{\gamma},\dots\}.

Lemma F.1.

Suppose that Assumption 3.2 for $\ell=8$ holds. Let $\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell}\in A_{n}$ be pairwise distinct points and let $w_{n}=|D_{n}^{(\boldsymbol{k})}|$ be the common volume of the sub-blocks. Then, for $\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}$,

\left|\mathrm{cov}(J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}_{h,n}(-\boldsymbol{\omega}_{1}),J^{(\boldsymbol{k})}_{h,n}(\boldsymbol{\omega}_{2})J^{(\boldsymbol{\ell})}_{h,n}(-\boldsymbol{\omega}_{2}))\right|\leq C\alpha_{w_{n},2w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\wedge k(\boldsymbol{j},\boldsymbol{\ell})\right)^{1/2},

(F.9)

\left|\mathrm{cov}(J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}_{h,n}(-\boldsymbol{\omega}_{1}),J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega}_{2})J^{(\boldsymbol{k})}_{h,n}(-\boldsymbol{\omega}_{2}))\right|\leq C\alpha_{w_{n},w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\right)^{1/2},

(F.10)

where $\alpha_{p,q}(\cdot)$ is the $\alpha$ -mixing coefficient defined as in (3.3).

Proof. We first show (F.9). For brevity, we write $J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega}_{1})=J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})$, and so on. Note that $D_{n}^{(\boldsymbol{j})}$ and $D_{n}^{(\boldsymbol{k})}\cup D_{n}^{(\boldsymbol{\ell})}$ are disjoint and $d(D_{n}^{(\boldsymbol{j})},D_{n}^{(\boldsymbol{k})}\cup D_{n}^{(\boldsymbol{\ell})})=k(\boldsymbol{j},\boldsymbol{k})\wedge k(\boldsymbol{j},\boldsymbol{\ell})$. Therefore, by using a well-known covariance inequality (cf. Doukhan (1994), Theorem 3(1)), we have

\begin{aligned}
&\left|\mathrm{cov}(J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1}),J^{(\boldsymbol{k})}(\boldsymbol{\omega}_{2})J^{(\boldsymbol{\ell})}(-\boldsymbol{\omega}_{2}))\right|\\
&\quad\leq 8\alpha_{w_{n},2w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\wedge k(\boldsymbol{j},\boldsymbol{\ell})\right)^{1/2}\|J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1})\|_{\mathbb{E},4}\|J^{(\boldsymbol{k})}(\boldsymbol{\omega}_{2})J^{(\boldsymbol{\ell})}(-\boldsymbol{\omega}_{2})\|_{\mathbb{E},4},
\end{aligned}

where $\|X\|_{\mathbb{E},4}=\{\mathbb{E}|X|^{4}\}^{1/4}$. Then, by using $\|X\|_{\mathbb{E},4}^{4}=\kappa_{4}(X)+3\mathrm{var}(X)^{2}$ and Lemma D.5, we have $\|J^{(\boldsymbol{j})}(\boldsymbol{\omega})J^{(\boldsymbol{j})}(-\boldsymbol{\omega})\|_{\mathbb{E},4}=O(1)$ as $n\rightarrow\infty$, provided Assumption 3.2 for $\ell=8$ holds. Substituting this into the above, we obtain (F.9).

To show (F.10), we first note that for centered random variables $X,Y,Z,W$, $\mathrm{cov}(XY,ZW)=\mathrm{cov}(XY\overline{Z},W)-\mathbb{E}[XY]\overline{\mathbb{E}[ZW]}$. Applying this identity to the left-hand side of (F.10), we get

\begin{aligned}
&\left|\mathrm{cov}(J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1}),J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{2})J^{(\boldsymbol{k})}(-\boldsymbol{\omega}_{2}))\right|\leq\left|\mathrm{cov}(J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{2}),J^{(\boldsymbol{k})}(-\boldsymbol{\omega}_{2}))\right|\\
&\quad+\left|\mathbb{E}[J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1})]\right|\cdot\left|\mathrm{cov}(J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{2}),J^{(\boldsymbol{k})}(\boldsymbol{\omega}_{2}))\right|.
\end{aligned}

By using arguments similar to those above, the second term is bounded by $C\alpha_{w_{n},w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\right)^{1/2}$. To bound the first term, we first note that $J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1})=I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})\in\mathbb{R}$ and, by using Hölder’s inequality, we have

\begin{aligned}
\mathbb{E}|I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{2})|^{8/3}&=\mathbb{E}[I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})^{8/3}I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{2})|J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{2})|^{2/3}]\\
&\leq\|I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{2})|J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{2})|^{2/3}\|_{\mathbb{E},3}\|I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})^{8/3}\|_{\mathbb{E},3/2}\\
&\leq\|I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{2})^{4}\|_{\mathbb{E},1}^{1/3}\|I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})^{4}\|_{\mathbb{E},1}^{2/3}=O(1),\quad n\rightarrow\infty.
\end{aligned}

Therefore, by using the covariance inequality and Hölder’s inequality, we have

\begin{aligned}
&\left|\mathrm{cov}(J^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{2}),J^{(\boldsymbol{k})}(-\boldsymbol{\omega}_{2}))\right|\\
&\quad\leq 8\alpha_{w_{n},w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\right)^{1/2}\left\|I^{(\boldsymbol{j})}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}(-\boldsymbol{\omega}_{2})\right\|_{\mathbb{E},8/3}\|J^{(\boldsymbol{k})}(-\boldsymbol{\omega}_{2})\|_{\mathbb{E},8}\leq C\alpha_{w_{n},w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\right)^{1/2}.
\end{aligned}

Thus, we obtain (F.10). Altogether, we get the desired results. $\Box$
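The proof of Lemma F.1 uses the $\alpha$-mixing covariance inequality in the form $|\mathrm{cov}(X,Y)|\leq 8\alpha^{1/r}\|X\|_{\mathbb{E},p}\|Y\|_{\mathbb{E},q}$, which requires $1/p+1/q+1/r=1$. As a sanity check under that reading, the short Python sketch below verifies the exponent bookkeeping for (F.9) and (F.10) in exact rational arithmetic; the helper name `is_conjugate_triple` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def is_conjugate_triple(p, q, r) -> bool:
    """True if 1/p + 1/q + 1/r == 1, checked in exact rational arithmetic."""
    return 1 / Fraction(p) + 1 / Fraction(q) + 1 / Fraction(r) == 1


# (F.9): both factors in L^4, mixing coefficient raised to the power 1/2.
f9_ok = is_conjugate_triple(4, 4, 2)

# (F.10): L^{8/3} and L^8 norms, mixing coefficient raised to the power 1/2.
f10_ok = is_conjugate_triple(Fraction(8, 3), 8, 2)

# Hoelder step in the moment bound: conjugate exponents 3 and 3/2.
holder_ok = 1 / Fraction(3) + 1 / Fraction(3, 2) == 1

print(f9_ok, f10_ok, holder_ok)
```

In both cases the mixing coefficient enters with exponent $1/r=1/2$, which is why the bounds (F.9) and (F.10) carry the square root of $\alpha$.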

Now, we are ready to prove the theorem below.

Theorem F.2.

Suppose that Assumptions 3.1, 3.2 (for $\ell=8$), 3.3(ii), 4.1, and 4.2 hold. Suppose further that the data taper $h$ is either constant on $[-1/2,1/2]^{d}$ or satisfies Assumption 3.4 (for $m=d+1$). Then,

S_{h,n}(\phi)-V_{h,n}(\phi)=T_{h,n}(\phi)=o_{p}(1),\quad n\rightarrow\infty.

Proof. By using an argument similar to that in the proof of Theorem F.1, it is enough to show $\lim_{n\rightarrow\infty}\mathrm{var}(T_{h,n}(\phi))=0$. Note that

\mathrm{var}(T_{h,n}(\phi))=\left(\mathrm{var}(S_{h,n}(\phi))-\mathrm{var}(V_{h,n}(\phi))\right)-2\mathrm{cov}(T_{h,n}(\phi),V_{h,n}(\phi)).

(F.11)

We will bound each term above. To bound the first term, by using Theorems 4.1(ii), F.1, and F.4 (below), we have

\lim_{n\rightarrow\infty}\mathrm{var}(S_{h,n}(\phi))=\lim_{n\rightarrow\infty}\mathrm{var}(V_{h,n}(\phi))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2}),

(F.12)

where $\Omega_{1}$ and $\Omega_{2}$ are defined as in (4.5). Therefore, the first term in (F.11) is $o(1)$ as $n\rightarrow\infty$. To bound the second term, we will use an $\alpha$-mixing condition. For $q\in\mathbb{N}$, let $(A_{n})^{q,\neq}$ be the set of $q$-tuples of pairwise distinct points in $A_{n}$. Then,

\begin{aligned}
&\mathrm{cov}(T_{h,n}(\phi),V_{h,n}(\phi))=k_{n}^{-1}\sum_{\boldsymbol{j}\in A_{n}}\sum_{(\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{2,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{\ell})}(\phi))\\
&\quad=k_{n}^{-1}\sum_{(\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{3,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{\ell})}(\phi))+k_{n}^{-1}\sum_{(\boldsymbol{j},\boldsymbol{k})\in(A_{n})^{2,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{j})}(\phi))\\
&\qquad+k_{n}^{-1}\sum_{(\boldsymbol{j},\boldsymbol{k})\in(A_{n})^{2,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{k})}(\phi)),
\end{aligned}

where

\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{\ell})}(\phi)=|D_{n}^{(\boldsymbol{k})}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(J^{(\boldsymbol{k})}_{h,n}(\boldsymbol{\omega})J^{(\boldsymbol{\ell})}_{h,n}(-\boldsymbol{\omega})-\mathbb{E}[J^{(\boldsymbol{k})}_{h,n}(\boldsymbol{\omega})J^{(\boldsymbol{\ell})}_{h,n}(-\boldsymbol{\omega})]\right)d\boldsymbol{\omega},\quad\boldsymbol{k},\boldsymbol{\ell}\in A_{n}.

We will bound each term in the expansion of $\mathrm{cov}(T_{h,n}(\phi),V_{h,n}(\phi))$ . By using (F.9), the first term is bounded by

\begin{aligned}
&k_{n}^{-1}\left|\sum_{(\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{3,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{\ell})}(\phi))\right|\\
&\quad\leq k_{n}^{-1}w_{n}\sum_{(\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{3,\neq}}\int_{D^{2}}|\phi(\boldsymbol{\omega}_{1})\phi(\boldsymbol{\omega}_{2})|\left|\mathrm{cov}(J^{(\boldsymbol{j})}_{h,n}(\boldsymbol{\omega}_{1})J^{(\boldsymbol{j})}_{h,n}(-\boldsymbol{\omega}_{1}),J^{(\boldsymbol{k})}_{h,n}(\boldsymbol{\omega}_{2})J^{(\boldsymbol{\ell})}_{h,n}(-\boldsymbol{\omega}_{2}))\right|d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}\\
&\quad\leq Ck_{n}^{-1}w_{n}\sum_{(\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{3,\neq}}\alpha_{w_{n},2w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\wedge k(\boldsymbol{j},\boldsymbol{\ell})\right)^{1/2}.
\end{aligned}

Let $n^{-\gamma}\min(\|\boldsymbol{j}-\boldsymbol{k}\|_{\infty},\|\boldsymbol{j}-\boldsymbol{\ell}\|_{\infty})=m\in\{1,2,\dots\}$. Then, for fixed $\boldsymbol{j}\in A_{n}$ and $m\in\mathbb{N}$, the number of pairs $(\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{2,\neq}$ that satisfy $n^{-\gamma}\min(\|\boldsymbol{j}-\boldsymbol{k}\|_{\infty},\|\boldsymbol{j}-\boldsymbol{\ell}\|_{\infty})=m$ is bounded above by $Ck_{n}m^{d-1}$ for some constant $C>0$. Therefore, by using Assumption 3.3(ii), the right-hand side above is bounded by

\begin{aligned}
&C_{1}k_{n}^{-1}w_{n}\sum_{(\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{3,\neq}}\alpha_{w_{n},2w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\wedge k(\boldsymbol{j},\boldsymbol{\ell})\right)^{1/2}\\
&\quad\leq C_{2}w_{n}\sum_{\boldsymbol{j}\in A_{n}}\sum_{m=1}^{\infty}m^{d-1}\cdot w_{n}^{1/2}(n^{\gamma}(m-1)+n^{\beta})^{-(d+\varepsilon)/2}\\
&\quad\leq C_{3}k_{n}w_{n}^{3/2}\left(n^{-\beta(d+\varepsilon)/2}+\sum_{m=1}^{\infty}m^{d-1}(n^{\gamma}m)^{-(d+\varepsilon)/2}\right)\\
&\quad\leq C_{4}n^{d}n^{\gamma d/2}\left(n^{-\beta(d+\varepsilon)/2}+n^{-\gamma(d+\varepsilon)/2}\right)=o(1),\quad n\rightarrow\infty.
\end{aligned}

Here, we use Assumption 3.3(ii) in the first inequality; $(m+1)^{d-1}\leq 2^{d-1}m^{d-1}$ and $n^{\gamma}(m-1)+n^{\beta}>n^{\gamma}(m-1)$ in the second inequality; $k_{n}w_{n}\leq|D_{n}|\leq Cn^{d}$, $w_{n}^{1/2}\leq n^{\gamma d/2}$, and $\sum_{m=1}^{\infty}m^{d-1-(d+\varepsilon)/2}<\infty$ in the third inequality; and $\beta,\gamma>2d/\varepsilon$ in the last identity. Thus, we have

k_{n}^{-1}\sum_{(\boldsymbol{j},\boldsymbol{k},\boldsymbol{\ell})\in(A_{n})^{3,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{\ell})}(\phi))=o(1),\quad n\rightarrow\infty.

(F.13)
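The counting step behind (F.13) can be illustrated on an idealized lattice. Assuming, for illustration only, that the block indices (in units of $n^{\gamma}$) form a subset of $\mathbb{Z}^{d}$, the number of indices at sup-norm distance exactly $m$ from a fixed index is at most $(2m+1)^{d}-(2m-1)^{d}\leq 2d\,3^{d-1}m^{d-1}$; the remaining index of the pair has at most $k_{n}$ choices, which gives a bound of the form $Ck_{n}m^{d-1}$. The function names below are hypothetical.

```python
from itertools import product


def shell_count(d: int, m: int) -> int:
    """Brute-force count of k in Z^d whose sup-norm equals exactly m."""
    rng = range(-m, m + 1)
    return sum(1 for k in product(rng, repeat=d) if max(abs(c) for c in k) == m)


def shell_formula(d: int, m: int) -> int:
    """Closed form: points in the cube of radius m minus those of radius m - 1."""
    return (2 * m + 1) ** d - (2 * m - 1) ** d


for d in (1, 2, 3):
    for m in (1, 2, 3, 4):
        # The brute-force count agrees with the closed form ...
        assert shell_count(d, m) == shell_formula(d, m)
        # ... and grows no faster than a constant times m^(d-1).
        assert shell_formula(d, m) <= 2 * d * 3 ** (d - 1) * m ** (d - 1)
```

The polynomial growth $m^{d-1}$ of the shells is what is balanced against the decay $(n^{\gamma}m)^{-(d+\varepsilon)/2}$ of the mixing bound in the series above.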

To bound the second term, we use (F.10) and Assumption 3.3(ii) and get

\begin{aligned}
&k_{n}^{-1}\left|\sum_{(\boldsymbol{j},\boldsymbol{k})\in(A_{n})^{2,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{k},\boldsymbol{j})}(\phi))\right|\leq C_{1}k_{n}^{-1}w_{n}\sum_{(\boldsymbol{j},\boldsymbol{k})\in(A_{n})^{2,\neq}}\alpha_{w_{n},w_{n}}\left(k(\boldsymbol{j},\boldsymbol{k})\right)^{1/2}\\
&\quad\leq C_{2}w_{n}^{3/2}\sum_{m=1}^{\infty}m^{d-1}(n^{\gamma}(m-1)+n^{\beta})^{-(d+\varepsilon)/2}\leq C_{3}n^{3\gamma d/2}n^{-\gamma(d+\varepsilon)/2}=o(1),\quad n\rightarrow\infty.
\end{aligned}

Similarly, the third term satisfies

k_{n}^{-1}\sum_{(\boldsymbol{j},\boldsymbol{k})\in(A_{n})^{2,\neq}}\mathrm{cov}(\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{j})}(\phi),\widetilde{G}_{h,n}^{(\boldsymbol{j},\boldsymbol{k})}(\phi))=o(1),\quad n\rightarrow\infty.

Substituting these results into the expansion of $\mathrm{cov}(T_{h,n}(\phi),V_{h,n}(\phi))$, we have

\lim_{n\rightarrow\infty}\mathrm{cov}(T_{h,n}(\phi),V_{h,n}(\phi))=0.

(F.14)

Substituting (F.12) and (F.14) into (F.11), we obtain $\lim_{n\rightarrow\infty}\mathrm{var}(T_{h,n}(\phi))=0$. Thus, we prove the theorem. $\Box$

The theorem below follows immediately from Theorems F.1 and F.2.

Theorem F.3.

\widetilde{G}_{h,n}(\phi)-V_{h,n}(\phi)=o_{p}(1),\quad n\rightarrow\infty.

F.2 CLT for $V_{h,n}(\phi)$

Recall $V_{h,n}(\phi)=k_{n}^{-1/2}\sum_{\boldsymbol{k}\in A_{n}}\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)$ , where

\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)=|D_{n}^{(\boldsymbol{k})}|^{1/2}\int_{D}\phi(\boldsymbol{\omega})\left(|J_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})|^{2}-\mathbb{E}[|J_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})|^{2}]\right)d\boldsymbol{\omega},\quad n\in\mathbb{N},\quad\boldsymbol{k}\in A_{n}.

In this section, we prove the CLT for $V_{h,n}(\phi)$ . First, we calculate the asymptotic variance of $V_{h,n}(\phi)$ .

Theorem F.4.

Suppose that Assumptions 3.1, 3.2 (for $\ell=4$), 3.3(ii), 4.1, and 4.2 hold. Suppose further that the data taper $h$ is either constant on $[-1/2,1/2]^{d}$ or satisfies Assumption 3.4 (for $m=d+1$). Then,

\lim_{n\rightarrow\infty}\mathrm{var}(V_{h,n}(\phi))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2}),

where $\Omega_{1}$ and $\Omega_{2}$ are defined as in (4.5).

Proof. For $\boldsymbol{k}\in A_{n}$ , let $\widehat{G}_{h,n}^{(\boldsymbol{k})}(\phi)$ be an independent copy of $\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)$ and let

\widehat{V}_{h,n}(\phi)=k_{n}^{-1/2}\sum_{\boldsymbol{k}\in A_{n}}\widehat{G}_{h,n}^{(\boldsymbol{k})}(\phi).

(F.15)

Then, by using a standard telescoping sum argument (cf. Pawlas (2009)), one can easily show that $V_{h,n}(\phi)-\widehat{V}_{h,n}(\phi)=o_{p}(1)$ as $n\rightarrow\infty$. Therefore,

\lim_{n\rightarrow\infty}\mathrm{var}(V_{h,n}(\phi))=\lim_{n\rightarrow\infty}\mathrm{var}(\widehat{V}_{h,n}(\phi))=\lim_{n\rightarrow\infty}k_{n}^{-1}\sum_{\boldsymbol{k}\in A_{n}}\mathrm{var}(\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)).

Now, we focus on the variance of $\widehat{G}_{h,n}^{(\boldsymbol{k})}(\phi)$. Since $D_{n}^{(\boldsymbol{k})}$ is also a rectangle satisfying Assumption 3.1, by modifying the proof of the asymptotic variance of $|D_{n}|^{1/2}A_{h,n}(\phi)$ in Theorem 4.1(ii), one can show that

\mathrm{var}(\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi))\approx(2\pi)^{d}|D_{n}^{(\boldsymbol{k})}|^{-1}\left(\frac{\int_{D_{n}^{(\boldsymbol{k})}}\{h(\boldsymbol{x}/\boldsymbol{A})\}^{4}d\boldsymbol{x}}{H_{h,2}^{2}}\right)(\Omega_{1}+\Omega_{2}),\quad\boldsymbol{k}\in A_{n}.

Therefore, we have

\begin{aligned}
\lim_{n\rightarrow\infty}k_{n}^{-1}\sum_{\boldsymbol{k}\in A_{n}}\mathrm{var}(\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi))&=(2\pi)^{d}\lim_{n\rightarrow\infty}k_{n}^{-1}|D_{n}^{(\boldsymbol{k})}|^{-1}H_{h,2}^{-2}\left(\int_{\widetilde{D}_{n}}\{h(\boldsymbol{x}/\boldsymbol{A})\}^{4}d\boldsymbol{x}\right)(\Omega_{1}+\Omega_{2})\\
&=(2\pi)^{d}\frac{H_{h,4}}{H_{h,2}^{2}}(\Omega_{1}+\Omega_{2}).
\end{aligned}

Here, the last identity is due to (F.1), $k_{n}|D_{n}^{(\boldsymbol{k})}|=|\widetilde{D}_{n}|$ , and $\int_{D_{n}}h(\boldsymbol{x}/\boldsymbol{A})^{4}d\boldsymbol{x}=|D_{n}|H_{h,4}$ . Thus, we get the desired result. $\Box$

Now, we are ready to prove the CLT for $V_{h,n}(\phi)$ .

Theorem F.5.

Suppose that Assumptions 3.1, 3.2 (for $\ell=8$), 3.3(ii), 4.1, and 4.2 hold. Suppose further that the data taper $h$ is either constant on $[-1/2,1/2]^{d}$ or satisfies Assumption 3.4 (for $m=d+1$). Then,

V_{h,n}(\phi)\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}\mathcal{N}\left(0,(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2})\right),\quad n\rightarrow\infty.

Proof. Recall that $\widehat{V}_{h,n}(\phi)$ in (F.15) has the same asymptotic distribution as $V_{h,n}(\phi)$. To show the CLT for $\widehat{V}_{h,n}(\phi)$, we only need to check the Lyapunov condition: for some $\delta>0$,

\lim_{n\rightarrow\infty}k_{n}^{-(2+\delta)/2}\sum_{\boldsymbol{k}\in A_{n}}\mathbb{E}[|\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)|^{2+\delta}]=0.

(F.16)

We will show the above for $\delta=2$ , provided Assumption 3.2 for $\ell=8$ . To show (F.16), it is enough to show

\sup_{n\in\mathbb{N}}\mathbb{E}[|\widetilde{G}_{h,n}(\phi)|^{4}]<\infty.

(F.17)

This is because, once (F.17) is established, we have

k_{n}^{-2}\sum_{\boldsymbol{k}\in A_{n}}\mathbb{E}[|\widetilde{G}_{h,n}^{(\boldsymbol{k})}(\phi)|^{4}]\leq Ck_{n}^{-1}\rightarrow 0,\quad n\rightarrow\infty.

Thus, (F.16) holds for $\delta=2$ . Using (B.5), we have

\sup_{n\in\mathbb{N}}\mathbb{E}[|\widetilde{G}_{h,n}(\phi)|^{4}]\leq\sup_{n\in\mathbb{N}}\kappa_{4}(\widetilde{G}_{h,n}(\phi))+3\left(\sup_{n\in\mathbb{N}}\mathrm{var}(\widetilde{G}_{h,n}(\phi))\right)^{2}.

From Theorem 4.1(ii), we have

\lim_{n\rightarrow\infty}\mathrm{var}(\widetilde{G}_{h,n}(\phi))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2}).

(F.18)

Therefore, $\sup_{n\in\mathbb{N}}\mathrm{var}(\widetilde{G}_{h,n}(\phi))<\infty$ . To bound the fourth-order cumulant term, we note that

\begin{aligned}
\kappa_{4}(\widetilde{G}_{h,n}(\phi))&=|D_{n}|^{2}\kappa_{4}\left(\int_{D}\phi(\boldsymbol{\omega})I_{h,n}(\boldsymbol{\omega})d\boldsymbol{\omega}\right)=|D_{n}|^{2}\int_{D^{4}}\left(\prod_{i=1}^{4}\phi(\boldsymbol{\omega}_{i})\right)\\
&\quad\times\mathrm{cum}(I_{h,n}(\boldsymbol{\omega}_{1}),I_{h,n}(\boldsymbol{\omega}_{2}),I_{h,n}(\boldsymbol{\omega}_{3}),I_{h,n}(\boldsymbol{\omega}_{4}))d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{\omega}_{3}d\boldsymbol{\omega}_{4}.
\end{aligned}

(F.19)

Now, we will evaluate the cumulant term above. Note that

\begin{aligned}
&\mathrm{cum}(I_{h,n}(\boldsymbol{\omega}_{1}),I_{h,n}(\boldsymbol{\omega}_{2}),I_{h,n}(\boldsymbol{\omega}_{3}),I_{h,n}(\boldsymbol{\omega}_{4}))\\
&\quad=\mathrm{cum}\bigg(J_{h,n}(\boldsymbol{\omega}_{1})J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2})J_{h,n}(-\boldsymbol{\omega}_{2}),J_{h,n}(\boldsymbol{\omega}_{3})J_{h,n}(-\boldsymbol{\omega}_{3}),J_{h,n}(\boldsymbol{\omega}_{4})J_{h,n}(-\boldsymbol{\omega}_{4})\bigg).
\end{aligned}

Using indecomposable partitions, the above joint cumulant can be written as a sum of products of cumulants of the form $\mathrm{cum}(J_{h,n}(\pm\boldsymbol{\omega}_{i_{1}}),\dots,J_{h,n}(\pm\boldsymbol{\omega}_{i_{k}}))$, where $k\in\{2,\dots,8\}$ and $i_{j}\in\{1,2,3,4\}$ for $j\in\{1,\dots,k\}$. By using the argument in Lemma D.5, the leading term (which has the largest order) is a product of four joint cumulants of order two. An example of such a term is

\begin{aligned}
&\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\mathrm{cum}(J_{h,n}(-\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{3}))\\
&\quad\times\mathrm{cum}(J_{h,n}(-\boldsymbol{\omega}_{2}),J_{h,n}(-\boldsymbol{\omega}_{4}))\mathrm{cum}(J_{h,n}(-\boldsymbol{\omega}_{3}),J_{h,n}(\boldsymbol{\omega}_{4})).
\end{aligned}

(F.20)

Now, we will bound one of the terms in (F.19) that is associated with the above cumulant products. Let $\phi(\boldsymbol{\omega})=0$ outside the domain $D$ . By using Lemma D.1, we have

\begin{aligned}
&\mathrm{cum}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))\\
&\quad=|D_{n}|^{-1}H_{h,2}^{-1}f(\boldsymbol{\omega}_{1})H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})+C|D_{n}|^{-1}\int_{\mathbb{R}^{d}}e^{-i\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}}C(\boldsymbol{u})R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})d\boldsymbol{u},
\end{aligned}

(F.21)

where $H_{h,2}^{(n)}(\cdot)$ and $R_{h,h}^{(n)}(\cdot,\cdot)$ are defined as in (2.7) and (C.1). Therefore, the integral term in the decomposition of (F.19) that is associated with (F.20) has 16 terms. We will only bound two representative terms; the other 14 terms can be bounded in a similar way. The first representative term is

\begin{aligned}
&H_{h,2}^{-4}|D_{n}|^{-2}\int_{\mathbb{R}^{4d}}\prod_{i=1}^{4}\phi(\boldsymbol{\omega}_{i})\times f(\boldsymbol{\omega}_{1})f(-\boldsymbol{\omega}_{1})f(-\boldsymbol{\omega}_{2})f(\boldsymbol{\omega}_{3})\\
&\qquad\times H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})H_{h,2}^{(n)}(-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{3})H_{h,2}^{(n)}(-\boldsymbol{\omega}_{2}-\boldsymbol{\omega}_{4})H_{h,2}^{(n)}(-\boldsymbol{\omega}_{3}+\boldsymbol{\omega}_{4})d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{\omega}_{3}d\boldsymbol{\omega}_{4}\\
&\quad=C\int_{\mathbb{R}^{4d}}\psi(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{t}_{3},\boldsymbol{t}_{4})\times c_{h^{2},n}(\boldsymbol{t}_{1})c_{h^{2},n}(\boldsymbol{t}_{2})c_{h^{2},n}(\boldsymbol{t}_{3})c_{h^{2},n}(-(\boldsymbol{t}_{1}+\boldsymbol{t}_{2}+\boldsymbol{t}_{3}))d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4},
\end{aligned}

where, for $\boldsymbol{t}_{1},\dots,\boldsymbol{t}_{4}\in\mathbb{R}^{d}$,

\begin{aligned}
\psi(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{t}_{3},\boldsymbol{t}_{4})&=\alpha\times\phi(\boldsymbol{t}_{1}+\boldsymbol{t}_{3}+\boldsymbol{t}_{4})\phi(-\boldsymbol{t}_{3}-\boldsymbol{t}_{4})\phi(\boldsymbol{t}_{1}+\boldsymbol{t}_{2}+\boldsymbol{t}_{3}+\boldsymbol{t}_{4})\phi(\boldsymbol{t}_{4})\\
&\quad\times f(\boldsymbol{t}_{1}+\boldsymbol{t}_{3}+\boldsymbol{t}_{4})f(-\boldsymbol{t}_{1}-\boldsymbol{t}_{3}-\boldsymbol{t}_{4})f(-\boldsymbol{t}_{3}-\boldsymbol{t}_{4})f(\boldsymbol{t}_{1}+\boldsymbol{t}_{2}+\boldsymbol{t}_{3}+\boldsymbol{t}_{4}).
\end{aligned}

Here, we use the change of variables $\boldsymbol{t}_{1}=\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}$, $\boldsymbol{t}_{2}=-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{3}$, $\boldsymbol{t}_{3}=-\boldsymbol{\omega}_{2}-\boldsymbol{\omega}_{4}$, and $\boldsymbol{t}_{4}=\boldsymbol{\omega}_{4}$ in the identity above, and $\alpha$ in $\psi(\cdot,\cdot,\cdot,\cdot)$ is the Jacobian determinant. Since $\phi(\cdot)$ is bounded with compact support and $\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}f(\boldsymbol{\omega})<\infty$, it is easily seen that $|\psi(\cdot,\cdot,\cdot,\cdot)|$ is also bounded with compact support. Then, by iteratively applying Lemma C.3(c),(d), and (e), we have

\left|\int_{\mathbb{R}^{4d}}\psi(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{t}_{3},\boldsymbol{t}_{4})c_{h^{2},n}(\boldsymbol{t}_{1})c_{h^{2},n}(\boldsymbol{t}_{2})c_{h^{2},n}(\boldsymbol{t}_{3})c_{h^{2},n}(-(\boldsymbol{t}_{1}+\boldsymbol{t}_{2}+\boldsymbol{t}_{3}))d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}\right|=O(|D_{n}|^{-1/2})

as $n\rightarrow\infty$ .
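The change of variables used for the first representative term can be checked mechanically. The Python sketch below treats the scalar case $d=1$ (an assumption made only to keep the example small; for general $d$ the linear map acts coordinate-wise), inverts the map, confirms that $-\boldsymbol{\omega}_{3}+\boldsymbol{\omega}_{4}=-(\boldsymbol{t}_{1}+\boldsymbol{t}_{2}+\boldsymbol{t}_{3})$, and evaluates the determinant of the coefficient matrix. All function names are hypothetical.

```python
def t_from_omega(w1, w2, w3, w4):
    """Forward map: t1 = w1 + w2, t2 = -w1 + w3, t3 = -w2 - w4, t4 = w4."""
    return (w1 + w2, -w1 + w3, -w2 - w4, w4)


def omega_from_t(t1, t2, t3, t4):
    """Inverse map, obtained by solving the four linear equations above."""
    return (t1 + t3 + t4, -t3 - t4, t1 + t2 + t3 + t4, t4)


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


# Rows express t_1, ..., t_4 in terms of omega_1, ..., omega_4 (case d = 1).
M = [[1, 1, 0, 0],
     [-1, 0, 1, 0],
     [0, -1, 0, -1],
     [0, 0, 0, 1]]

t = (2, -3, 5, 7)                      # arbitrary illustrative values
w = omega_from_t(*t)
round_trip_ok = t_from_omega(*w) == t  # the two maps are mutually inverse
last_argument_ok = (-w[2] + w[3]) == -(t[0] + t[1] + t[2])
print(round_trip_ok, last_argument_ok, det(M))
```

Under this reading the coefficient matrix has determinant of modulus one, so the Jacobian factor $\alpha$ only contributes a constant to $\psi$.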

The second representative term is

\begin{aligned}
&C^{4}|D_{n}|^{-2}\int_{\mathbb{R}^{4d}}\prod_{i=1}^{4}\phi(\boldsymbol{\omega}_{i})\int_{\mathbb{R}^{4d}}\prod_{i=1}^{4}C(\boldsymbol{u}_{i})\times e^{-i(\boldsymbol{u}_{1}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{u}_{2}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{u}_{3}^{\top}\boldsymbol{\omega}_{2}+\boldsymbol{u}_{4}^{\top}\boldsymbol{\omega}_{3})}\\
&\quad\times R_{h,h}^{(n)}(\boldsymbol{u}_{1},\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2})R_{h,h}^{(n)}(\boldsymbol{u}_{2},-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{3})R_{h,h}^{(n)}(\boldsymbol{u}_{3},-\boldsymbol{\omega}_{2}-\boldsymbol{\omega}_{4})R_{h,h}^{(n)}(\boldsymbol{u}_{4},-\boldsymbol{\omega}_{3}+\boldsymbol{\omega}_{4})d\boldsymbol{u}_{1}d\boldsymbol{u}_{2}d\boldsymbol{u}_{3}d\boldsymbol{u}_{4}d\boldsymbol{\omega}_{1}d\boldsymbol{\omega}_{2}d\boldsymbol{\omega}_{3}d\boldsymbol{\omega}_{4}.
\end{aligned}

To bound the above term, we require a sharp bound for $R_{h,h}^{(n)}$. Let $\{h_{\boldsymbol{j}}\}_{\boldsymbol{j}\in\mathbb{Z}^{d}}$ be the Fourier coefficients of $h$ that satisfy (C.8). Then, by using Theorem C.2(ii) together with the inequality $\rho(|x|)\leq 1$ and the change of variables $\boldsymbol{t}_{1}=\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{2}$, $\boldsymbol{t}_{2}=-\boldsymbol{\omega}_{1}+\boldsymbol{\omega}_{3}$, $\boldsymbol{t}_{3}=-\boldsymbol{\omega}_{2}-\boldsymbol{\omega}_{4}$, and $\boldsymbol{t}_{4}=\boldsymbol{\omega}_{4}$, the above is bounded by

\begin{aligned}
&C\sum_{p_{1},\cdots,p_{4}=0}^{m_{d}}\sum_{\boldsymbol{j}_{1},\dots,\boldsymbol{j}_{4},\boldsymbol{k}_{1},\dots,\boldsymbol{k}_{4}\in\mathbb{Z}^{d}}\left(\prod_{i=1}^{4}|h_{\boldsymbol{j}_{i}}h_{\boldsymbol{k}_{i}}|\right)\int_{\mathbb{R}^{4d}}d\boldsymbol{u}_{1}d\boldsymbol{u}_{2}d\boldsymbol{u}_{3}d\boldsymbol{u}_{4}\prod_{i=1}^{4}|C(\boldsymbol{u}_{i})|\\
&\quad\times\int_{\mathbb{R}^{4d}}|\widetilde{\psi}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{t}_{3},\boldsymbol{t}_{4})|\times\bigg|c_{D_{n,p_{1}}(\boldsymbol{u}_{1})}(\boldsymbol{t}_{1}-2\pi(\boldsymbol{j}_{1}+\boldsymbol{k}_{1})/\boldsymbol{A})\\
&\quad\times c_{D_{n,p_{2}}(\boldsymbol{u}_{2})}(\boldsymbol{t}_{2}-2\pi(\boldsymbol{j}_{2}+\boldsymbol{k}_{2})/\boldsymbol{A})c_{D_{n,p_{3}}(\boldsymbol{u}_{3})}(\boldsymbol{t}_{3}-2\pi(\boldsymbol{j}_{3}+\boldsymbol{k}_{3})/\boldsymbol{A})\\
&\quad\times c_{D_{n,p_{4}}(\boldsymbol{u}_{4})}(-(\boldsymbol{t}_{1}+\boldsymbol{t}_{2}+\boldsymbol{t}_{3})-2\pi(\boldsymbol{j}_{4}+\boldsymbol{k}_{4})/\boldsymbol{A})\bigg|d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4},
\end{aligned}

where $\widetilde{\psi}(\boldsymbol{t}_{1},\boldsymbol{t}_{2},\boldsymbol{t}_{3},\boldsymbol{t}_{4})$ is a bounded function with compact support.

By using Lemma C.3(d),(e), the integral with respect to $d\boldsymbol{t}_{1}d\boldsymbol{t}_{2}d\boldsymbol{t}_{3}d\boldsymbol{t}_{4}$ is uniformly bounded. Combining this observation with $\sum_{\boldsymbol{j}\in\mathbb{Z}^{d}}|h_{\boldsymbol{j}}|<\infty$ and $C(\boldsymbol{u})\in L^{1}(\mathbb{R}^{d})$, the above term is $O(1)$ as $n\rightarrow\infty$. Therefore, we conclude that the part of the decomposition of (F.19) associated with (F.20) is $O(1)$ as $n\rightarrow\infty$.

Similarly, all other terms in the indecomposable partition are $O(1)$ as $n\rightarrow\infty$ . Thus, $\sup_{n\in\mathbb{N}}|\kappa_{4}(\widetilde{G}_{h,n}(\phi))|<\infty$ and this shows (F.17).

Lastly, once (F.17) is verified, the Lyapunov condition in (F.16) also holds. Combining this with Theorem F.4, we obtain the desired results. $\Box$

Appendix G Estimation of the asymptotic variance

Recall the integrated periodogram $\widehat{A}_{h,n}(\phi)$ in (4.1). By Theorem 4.1, the $(1-\alpha)$ ( $\alpha\in(0,1)$ ) confidence interval for the spectral mean $A(\phi)$ of form (4.1) is

\widehat{A}_{h,n}(\phi)\pm\frac{z_{1-\alpha/2}}{|D_{n}|^{1/2}}(2\pi)^{d/2}(H_{h,4}^{1/2}/H_{h,2})\sqrt{\Omega_{1}+\Omega_{2}},

where $z_{1-\alpha/2}$ is the $(1-\alpha/2)$ -th quantile of the standard normal distribution and $\Omega_{1}$ and $\Omega_{2}$ are defined as in (4.5). The quantity $\Omega_{1}+\Omega_{2}$ depends on the unknown spectral density function and the complete fourth-order spectral density function. In this section, we sketch a procedure to estimate the asymptotic variance $\lim_{n\rightarrow\infty}|D_{n}|\mathrm{var}(\widehat{A}_{h,n}(\phi))=(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2})$ by means of subsampling.

To ease the presentation, we assume that $|D_{n}|$ grows proportionally to $n^{d}$ as $n\rightarrow\infty$ . Thus, the side lengths $A_{1},\cdots,A_{d}$ increase proportionally to $n$ . Let $0<a_{1}<a_{2}<\dots$ be an increasing sequence of numbers such that $a_{n}=o(n)$ as $n\rightarrow\infty$ . Therefore, we have $\lim_{n\rightarrow\infty}a_{n}/A_{i}(n)=0$ for any $i\in\{1,\dots,d\}$ . Now, let

T_{n}=\{\boldsymbol{k}:\boldsymbol{k}\in\mathbb{Z}^{d}~{}~{}\text{and}~{}~{}B_{n}^{(\boldsymbol{k})}=\boldsymbol{k}+[-a_{n}/2,a_{n}/2]^{d}\subset D_{n}\},\quad n\in\mathbb{N}.

Unlike the subrectangles $D_{n}^{(\boldsymbol{k})}$ in Appendix F, $\{B_{n}^{(\boldsymbol{k})}\}_{\boldsymbol{k}\in T_{n}}$ are subrectangles of $D_{n}$ that may overlap. For $n\in\mathbb{N}$ and $\boldsymbol{k}\in T_{n}$ , we define the subsampling analogue of the DFT by

\mathbb{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|B_{n}^{(\boldsymbol{k})}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap B_{n}^{(\boldsymbol{k})}}h(a_{n}^{-1}(\boldsymbol{x}-\boldsymbol{k}))\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

Note that the above definition is slightly different from $\mathcal{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})$ of (F.4) since $\mathbb{J}_{h,n}^{(\boldsymbol{k})}(\cdot)$ alters the data taper function for each subrectangle. By a simple calculation, we have $\mathbb{E}[\mathbb{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})]=\lambda c_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})$ , where for $n\in\mathbb{N}$ , $\boldsymbol{k}\in T_{n}$ , and $\boldsymbol{\omega}\in\mathbb{R}^{d}$ ,

c_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|B_{n}^{(\boldsymbol{k})}|^{-1/2}\exp(-i\boldsymbol{k}^{\top}\boldsymbol{\omega})\int_{[-a_{n}/2,a_{n}/2]^{d}}h(a_{n}^{-1}\boldsymbol{x})\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega})d\boldsymbol{x}.

Therefore, the subsample version of the integrated periodogram is given by

\widehat{A}_{h,n}^{(\boldsymbol{k})}(\phi)=\int_{D}\phi(\boldsymbol{\omega})\left|\mathbb{J}_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})-\widehat{\lambda}_{h,n}c_{h,n}^{(\boldsymbol{k})}(\boldsymbol{\omega})\right|^{2}d\boldsymbol{\omega},\quad n\in\mathbb{N},\quad\boldsymbol{k}\in T_{n},

where $\widehat{\lambda}_{h,n}$ is an unbiased tapered estimator of the first-order intensity. In practice, one can approximate $\widehat{A}_{h,n}^{(\boldsymbol{k})}(\phi)$ using a Riemann sum as outlined in Section 7.1. Our subsampling estimator of the asymptotic variance $\lim_{n\rightarrow\infty}|D_{n}|\mathrm{var}(\widehat{A}_{h,n}(\phi))$ is

\zeta_{n}=\frac{a_{n}^{d}}{|T_{n}|}\sum_{\boldsymbol{k}\in T_{n}}\left\{\widehat{A}_{h,n}^{(\boldsymbol{k})}(\phi)-|T_{n}|^{-1}\sum_{\boldsymbol{j}\in T_{n}}\widehat{A}_{h,n}^{(\boldsymbol{j})}(\phi)\right\}^{2},\quad n\in\mathbb{N},

where $a_{n}^{d}$ is the common volume of $B_{n}^{(\boldsymbol{k})}$ . Under appropriate moment and mixing conditions, such as conditions $(\mathcal{S}1)$ – $(\mathcal{S}6)$ in Biscio and Waagepetersen (2019) (page 1174), one may expect that $\zeta_{n}$ is a consistent estimator of the asymptotic variance $(2\pi)^{d}(H_{h,4}/H_{h,2}^{2})(\Omega_{1}+\Omega_{2})$ . A rigorous proof of the sampling properties of $\zeta_{n}$ will be reported in future research.
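The subsampling recipe can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the window is taken to be the cube $[-A/2,A/2]^{d}$ with a common side length $A$, the subsample statistics $\widehat{A}_{h,n}^{(\boldsymbol{k})}(\phi)$ are assumed to have been computed already and passed in as an array, and the function names `subsample_centers`, `subsampling_variance`, and `confidence_interval` are hypothetical.

```python
import numpy as np
from itertools import product
from statistics import NormalDist


def subsample_centers(A, a_n, d):
    """Integer centers k such that k + [-a_n/2, a_n/2]^d lies in [-A/2, A/2]^d.

    Assumes a cubic window with common side length A > a_n.
    """
    kmax = int(np.floor((A - a_n) / 2.0))
    return np.array(list(product(range(-kmax, kmax + 1), repeat=d)))


def subsampling_variance(A_sub, a_n, d):
    """zeta_n = a_n^d * (average squared deviation of the subsample statistics)."""
    A_sub = np.asarray(A_sub, dtype=float)
    return a_n**d * np.mean((A_sub - A_sub.mean()) ** 2)


def confidence_interval(A_hat, zeta, vol_Dn, alpha=0.05):
    """Normal-approximation interval A_hat +/- z * sqrt(zeta / |D_n|)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * np.sqrt(zeta / vol_Dn)
    return A_hat - half_width, A_hat + half_width
```

Here `zeta` plays the role of the estimated asymptotic variance, so dividing by $|D_{n}|$ inside the square root matches the $|D_{n}|^{-1/2}$ scaling of the confidence interval above.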

Appendix H Additional simulation results

H.1 Computation and illustration of the kernel spectral density estimator

In this section, we describe the computation of the kernel spectral density estimator and provide illustrations. Recall $\widehat{f}_{n,b}(\boldsymbol{\omega})$ in (3.8). Since $\widehat{f}_{n,b}(\boldsymbol{\omega})$ has an integral form, a Riemann sum approximation of $\widehat{f}_{n,b}(\boldsymbol{\omega})$ is

\widehat{f}_{n,b}^{(R)}(\boldsymbol{\omega})=\frac{\sum_{\boldsymbol{k}\in\mathbb{Z}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{\omega}_{\boldsymbol{k},A})\widehat{I}_{h,n}(\boldsymbol{\omega}_{\boldsymbol{k},A})}{\sum_{\boldsymbol{k}\in\mathbb{Z}^{d}}W_{b}(\boldsymbol{\omega}-\boldsymbol{\omega}_{\boldsymbol{k},A})},\quad\boldsymbol{\omega}\in\mathbb{R}^{d},

(H.1)

where $A>0$ is the side length of the observation domain $D_{n}=[-A/2,A/2]^{d}$ and $\boldsymbol{\omega}_{\boldsymbol{k},A}=2\pi\boldsymbol{k}/A$ for $\boldsymbol{k}\in\mathbb{Z}^{d}$ . The summation above is finite because $W_{b}(\cdot)$ has support $[-b/2,b/2]^{d}$ . For the kernel function, we choose the triangular kernel $W(\boldsymbol{x})=W(x_{1})W(x_{2})$ for $\boldsymbol{x}=(x_{1},x_{2})^{\top}\in\mathbb{R}^{2}$ , where $W(x)=2\max\{1-2|x|,0\}$ . The bandwidth $b\in(0,\infty)$ is set to $b=|D_{n}|^{-1/6}$ , which is the optimal rate under the mean-squared error criterion (see Ding et al. (2024), Section 6.1 for details).
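A minimal Python sketch of the Riemann sum in (H.1) for $d=2$ is given below. It assumes that the periodogram has already been evaluated on the grid $\{2\pi\boldsymbol{k}/A\}$ and that the scaled kernel is $W_{b}(\boldsymbol{x})=b^{-2}W(\boldsymbol{x}/b)$, which is consistent with the stated support $[-b/2,b/2]^{2}$ but is not spelled out in this section; the function names are hypothetical.

```python
import numpy as np


def triangular_kernel(x):
    """W(x) = 2 * max(1 - 2|x|, 0), supported on [-1/2, 1/2]."""
    return 2.0 * np.maximum(1.0 - 2.0 * np.abs(x), 0.0)


def smoothed_periodogram(omega, I_grid, k1, k2, A, b):
    """Riemann-sum kernel estimate at one frequency omega = (w1, w2).

    I_grid[i, j] is the periodogram at (2*pi*k1[i]/A, 2*pi*k2[j]/A).
    The factor b^{-2} of the assumed W_b cancels in the ratio, so it is omitted.
    """
    w1 = 2.0 * np.pi * np.asarray(k1) / A
    w2 = 2.0 * np.pi * np.asarray(k2) / A
    weights = np.outer(
        triangular_kernel((omega[0] - w1) / b),
        triangular_kernel((omega[1] - w2) / b),
    )
    total = weights.sum()
    if total == 0.0:
        # No grid frequency falls inside the kernel support around omega.
        raise ValueError("bandwidth too small for this frequency grid")
    return (weights * np.asarray(I_grid)).sum() / total
```

The explicit check on `total` matters in small windows: the grid spacing is $2\pi/A$, so for small $A$ the support of $W_{b}$ with $b=|D_{n}|^{-1/6}$ may contain no grid frequency at all.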

In the top panels of Figure H.1 below, we calculate $\widehat{f}_{n,b}^{(R)}(\boldsymbol{\omega})$ for each of the periodograms computed in the bottom panels of Figure 1. In the middle panels, we evaluate the absolute biases of $\widehat{I}_{h,n}$ and in the bottom panels, we evaluate the absolute biases of $\widehat{f}_{n,b}^{(R)}$ .

For all models, the absolute biases of the periodograms (middle panels) have patterns similar to those of the corresponding spectral density functions. This can be explained by Theorem 3.1, which implies that $|\widehat{I}_{h,n}(\boldsymbol{\omega})-f(\boldsymbol{\omega})|\approx\mathrm{var}(\widehat{I}_{h,n}(\boldsymbol{\omega}))^{1/2}=f(\boldsymbol{\omega})$ for $\boldsymbol{\omega}\in\mathbb{R}^{d}\backslash\{\textbf{0}\}$ . Therefore, the middle panels indicate that the periodogram is inconsistent. However, the absolute biases of the smoothed periodograms (bottom panels) are nearly zero across all frequencies and all models. This supports the theoretical result in Theorem 3.3, which states that $\widehat{f}_{n,b}(\boldsymbol{\omega})$ is a consistent estimator of $f(\boldsymbol{\omega})$ for all $\boldsymbol{\omega}\in\mathbb{R}^{d}$ .

H.2 Computation of the periodograms

In this section, we discuss an implementation for computing the periodograms needed to evaluate the discretized Whittle likelihood in (7.2). For simplicity, we assume $d=2$ . The cases $d=1$ and $d\in\{3,4,\dots\}$ can be treated similarly.

Let the observation window be $D_{n}=[-A_{1}/2,A_{1}/2]\times[-A_{2}/2,A_{2}/2]$ for $A_{1},A_{2}\in(0,\infty)$ . Suppose that the prespecified domain $D\subset\mathbb{R}^{2}$ is a rectangle centered at the origin, so that the gridded version of $D$ , denoted by $D_{\text{grid}}=\{2\pi\boldsymbol{k}/\Omega:\boldsymbol{k}\in\mathbb{Z}^{2},2\pi\boldsymbol{k}/\Omega\in D\}$ , also forms a rectangular grid. Suppose that this rectangular grid can be written as $D_{\text{grid}}=\{(2\pi k_{1}/\Omega,2\pi k_{2}/\Omega):|k_{i}|\leq a_{i},~{}~{}k_{i}\in\mathbb{Z}\}$ for some $a_{1},a_{2}\in\mathbb{N}$ . Suppose further that the data taper function is separable, i.e., $h(\boldsymbol{x})=h_{1}(x_{1})h_{2}(x_{2})$ for some $h_{1},h_{2}$ , and let

u_{i}(\omega,A)=H_{i}^{(n)}(\omega)=\int_{-A/2}^{A/2}h_{i}(x/A)\exp(-ix\omega)dx,\quad i\in\{1,2\}.

(H.2)

We will assume that $u_{i}(\omega,A)$ has a closed form expression, thus, there is no additional computational burden to approximate the integral in $u_{i}$ .

Now, we discuss an efficient way to compute $\{\widehat{I}_{h,n}(\boldsymbol{\omega}):\boldsymbol{\omega}\in D_{\text{grid}}\}$ based on the observed point pattern $\{\boldsymbol{x}_{j}=(x_{j,1},x_{j,2})^{\top}:1\leq j\leq m\}$ in $D_{n}$ . From its definition, $\widehat{I}_{h,n}(\boldsymbol{\omega})=|\mathcal{J}_{h,n}(\boldsymbol{\omega})-\widehat{\lambda}_{h,n}c_{h,n}(\boldsymbol{\omega})|^{2}$ , where $\mathcal{J}_{h,n}(\boldsymbol{\omega})$ and $c_{h,n}(\boldsymbol{\omega})$ are as in (2.8) and (2.10), respectively. Therefore, we compute $\mathcal{J}_{h,n}(\boldsymbol{\omega})$ and $c_{h,n}(\boldsymbol{\omega})$ separately. Since $h$ is separable, the tapered DFT can be written as $\mathcal{J}_{h,n}(2\pi\boldsymbol{k}/\Omega)=C\sum_{j=1}^{m}v_{1}(x_{j,1},k_{1})v_{2}(x_{j,2},k_{2})$ , where $C=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}|^{-1/2}$ and $v_{i}(x,k)=h_{i}(x/A_{i})\exp(-ix(2\pi k/\Omega))$ , $i\in\{1,2\}$ . Then, the matrix form of $\{\mathcal{J}_{h,n}(\boldsymbol{\omega}):\boldsymbol{\omega}\in D_{\text{grid}}\}$ is equal to $CV_{1}^{\top}V_{2}$ , where for $i\in\{1,2\}$ ,

V_{i}=[\boldsymbol{v}_{i}(-a_{i})|\boldsymbol{v}_{i}(-a_{i}+1)|\cdots|\boldsymbol{v}_{i}(a_{i})],\quad\text{where}\quad\boldsymbol{v}_{i}(k)=(v_{i}(x_{1,i},k),\dots,v_{i}(x_{m,i},k))^{\top}.

Next, we calculate $c_{h,n}(2\pi\boldsymbol{k}/\Omega)$ . Again using separability of $h$ , we have

c_{h,n}(2\pi\boldsymbol{k}/\Omega)=CH_{h,1}^{(n)}(2\pi\boldsymbol{k}/\Omega)=Cu_{1}(2\pi k_{1}/\Omega,A_{1})u_{2}(2\pi k_{2}/\Omega,A_{2}).

Here, $C$ is the same constant as above and $u_{1},u_{2}$ are as in (H.2). Therefore, a matrix form of $\{c_{h,n}(\boldsymbol{\omega}):\boldsymbol{\omega}\in D_{\text{grid}}\}$ is $CU_{1}U_{2}^{\top}$ , where

U_{i}=(u_{i}(2\pi(-a_{i})/\Omega,A_{i}),\dots,u_{i}(2\pi(a_{i})/\Omega,A_{i}))^{\top}\in\mathbb{C}^{2a_{i}+1}.

Together, these matrix expressions give a fast algorithm for computing the periodogram on the grid.
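The matrix formulation can be sketched with NumPy as follows. To keep $u_{i}$ in closed form, the sketch assumes the trivial taper $h\equiv 1$ on $[-1/2,1/2]^{2}$ (so $H_{h,2}=1$ and $u_{i}(\omega,A)=A\,\mathrm{sinc}(A\omega/(2\pi))$ in NumPy's normalized-sinc convention) and takes $\widehat{\lambda}_{h,n}=m/|D_{n}|$; both choices, and the function name `periodogram_on_grid`, are illustrative assumptions rather than the settings used in the simulations.

```python
import numpy as np


def periodogram_on_grid(points, A1, A2, Omega, a1, a2):
    """Periodogram on {(2*pi*k1/Omega, 2*pi*k2/Omega): |k_i| <= a_i} for d = 2.

    Assumes the trivial taper h = 1, hence H_{h,2} = 1, and the intensity
    estimate lambda_hat = m / |D_n|.
    """
    points = np.asarray(points, dtype=float)
    m = points.shape[0]
    w1 = 2.0 * np.pi * np.arange(-a1, a1 + 1) / Omega
    w2 = 2.0 * np.pi * np.arange(-a2, a2 + 1) / Omega

    # C = (2*pi)^{-d/2} H_{h,2}^{-1/2} |D_n|^{-1/2} with d = 2 and H_{h,2} = 1.
    C = (2.0 * np.pi) ** (-1.0) * (A1 * A2) ** (-0.5)

    # V_i has one row per point and one column per frequency index k.
    V1 = np.exp(-1j * np.outer(points[:, 0], w1))
    V2 = np.exp(-1j * np.outer(points[:, 1], w2))
    J = C * (V1.T @ V2)  # tapered DFT on the grid

    # Closed-form u_i for h = 1: integral of exp(-i x w) over [-A/2, A/2].
    U1 = A1 * np.sinc(A1 * w1 / (2.0 * np.pi))
    U2 = A2 * np.sinc(A2 * w2 / (2.0 * np.pi))
    c = C * np.outer(U1, U2)  # bias factor on the grid

    lam_hat = m / (A1 * A2)
    return np.abs(J - lam_hat * c) ** 2
```

The cost is dominated by the product `V1.T @ V2`, which uses $O(m(2a_{1}+1)(2a_{2}+1))$ operations but runs as a single optimized matrix multiplication instead of a loop over frequencies. With this centering the periodogram vanishes at the zero frequency, which provides a quick sanity check.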

H.3 Additional Figures

In this section, we provide supplementary figures for the simulation results in Section 7.

H.4 Additional simulations

As discussed in Section 7.3, when the model misspecifies the true spatial point pattern, the best fitting model may not accurately estimate the true first-order intensity. In this section, we provide a potential remedy for this issue by fitting the “reduced” model.

For our simulation, we generate the same LGCP model on $\mathbb{R}^{2}$ as in Section 7.3 and fit the Thomas clustering process (TCP) models with parameters $(\kappa,\alpha,\sigma^{2})^{\top}$ as in Section 5.2. However, for each simulated point pattern, we constrain the parameter $\alpha=\widehat{\lambda}/\kappa$ , where $\widehat{\lambda}$ is a nonparametric unbiased estimator of the “true” first-order intensity. We refer to this constrained model as the “reduced” TCP model. The reduced TCP model has two free parameters $\boldsymbol{\eta}=(\kappa,\sigma^{2})^{\top}$ and the estimated first-order intensity for the fitted reduced TCP model is $\widehat{\lambda}$ . Therefore, the reduced TCP model correctly estimates the true first-order intensity.

For each simulation, we fit the reduced TCP model using three estimation methods: the discrete version of our estimator as described in Section 7.1, the maximum likelihood-based method using the log-Palm likelihood (ML; Tanaka et al. (2008)), and the minimum contrast method (MC). When evaluating our estimator, we follow the guidelines in Section 7.1 and consider the two prespecified domains $D_{2\pi}=\{\boldsymbol{\omega}\in\mathbb{R}^{2}:0.1\pi\leq\|\boldsymbol{\omega}\|_{\infty}\leq 2\pi\}$ and $D_{5\pi}=\{\boldsymbol{\omega}\in\mathbb{R}^{2}:0.1\pi\leq\|\boldsymbol{\omega}\|_{\infty}\leq 5\pi\}$ .

Now, we consider the best fitting reduced TCP model. The (discretized) Whittle likelihood of the reduced model is

L^{(R)}(\boldsymbol{\eta})=\sum_{\boldsymbol{\omega}_{\boldsymbol{k},A}\in D}\left(\frac{\widehat{I}_{h,n}(\boldsymbol{\omega}_{\boldsymbol{k},A})}{f_{\boldsymbol{\theta}(\boldsymbol{\eta})}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})}+\log f_{\boldsymbol{\theta}(\boldsymbol{\eta})}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})\right).

(H.3)

Here, $\boldsymbol{\theta}(\boldsymbol{\eta})=(\kappa,\widehat{\lambda}/\kappa,\sigma^{2})^{\top}$ , $A\in\{10,20,40\}$ is the side length of the observation window, and $D\in\{D_{2\pi},D_{5\pi}\}$ . Then, we report $\widehat{\boldsymbol{\eta}}^{(R)}=\arg\min L^{(R)}(\boldsymbol{\eta})$ . Since $\widehat{\lambda}$ varies across simulations, the best fitting reduced TCP parameters also vary across simulations. However, since under mild conditions $\widehat{\lambda}$ consistently estimates the true first-order intensity $\lambda^{(true)}$ , we take the “ideal” best reduced TCP parameters to be $\boldsymbol{\eta}_{0}(D,A)=\arg\min\mathcal{L}^{(R)}(\boldsymbol{\eta})$ , where

\mathcal{L}^{(R)}(\boldsymbol{\eta})=\sum_{\boldsymbol{\omega}_{\boldsymbol{k},A}\in D}\left(\frac{f(\boldsymbol{\omega}_{\boldsymbol{k},A})}{f_{\widetilde{\boldsymbol{\theta}}(\boldsymbol{\eta})}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})}+\log f_{\widetilde{\boldsymbol{\theta}}(\boldsymbol{\eta})}^{(TCP)}(\boldsymbol{\omega}_{\boldsymbol{k},A})\right),

(H.4)

where $f$ is the true spectral density function as in (7.3) and $\widetilde{\boldsymbol{\theta}}(\boldsymbol{\eta})=(\kappa,\lambda^{(true)}/\kappa,\sigma^{2})^{\top}$ .
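The objective in (H.3) can be evaluated as in the following Python sketch. The TCP spectral density used here, $f(\boldsymbol{\omega})=(2\pi)^{-2}\kappa\alpha\{1+\alpha\exp(-\sigma^{2}\|\boldsymbol{\omega}\|^{2})\}$, is the standard Thomas process form under the Fourier convention assumed in this sketch and should be checked against the expression in Section 5.2; the periodogram values are taken as given, and all function names are hypothetical.

```python
import numpy as np


def tcp_spectral_density(omega, kappa, alpha, sigma2):
    """Assumed Thomas process spectral density for d = 2 (see lead-in)."""
    sq_norm = np.sum(np.asarray(omega, dtype=float) ** 2, axis=-1)
    return (2.0 * np.pi) ** (-2) * kappa * alpha * (1.0 + alpha * np.exp(-sigma2 * sq_norm))


def frequency_grid(A, lower, upper):
    """Frequencies 2*pi*k/A with lower <= ||omega||_inf <= upper (d = 2)."""
    tol = 1e-9  # guards against floating-point rounding at the boundary
    kmax = int(np.floor(upper * A / (2.0 * np.pi) + tol))
    w = 2.0 * np.pi * np.arange(-kmax, kmax + 1) / A
    W1, W2 = np.meshgrid(w, w, indexing="ij")
    omega = np.column_stack([W1.ravel(), W2.ravel()])
    sup_norm = np.max(np.abs(omega), axis=1)
    return omega[(sup_norm >= lower - tol) & (sup_norm <= upper + tol)]


def reduced_whittle(eta, I_vals, omega, lam_hat):
    """Discretized Whittle objective with alpha constrained to lam_hat / kappa."""
    kappa, sigma2 = eta
    f = tcp_spectral_density(omega, kappa, lam_hat / kappa, sigma2)
    return np.sum(I_vals / f + np.log(f))
```

Replacing `I_vals` by the true spectral density and `lam_hat` by $\lambda^{(true)}$ gives the ideal criterion (H.4). Either objective can then be passed to a generic numerical optimizer over $\boldsymbol{\eta}=(\kappa,\sigma^{2})^{\top}$ with positivity constraints.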

Table H.1 summarizes the parameter estimation results. The results are also illustrated in Figure H.5. We note that as the observation domain increases, our estimators (for $D_{2\pi}$ and $D_{5\pi}$ ) tend to converge to the corresponding (ideal) best fitting reduced TCP parameters. In contrast, the standard errors of the ML and MC estimators of $\sigma^{2}$ do not appear to decrease to zero even when the number of sample points is a few thousand (corresponding to the observation domain $D_{n}=[-20,20]^{2}$ ). Moreover, there is no clear evidence in Table H.1 and Figure H.5 that the parameter estimates for the ML and MC converge to some fixed non-diverging or non-shrinking parameters.

Window	Par.	Best Par. ( $D_{2\pi}$ )	Best Par. ( $D_{5\pi}$ )	Ours( $D_{2\pi}$ )	Ours( $D_{5\pi}$ )	ML	MC
$[-5,5]^{2}$	$\kappa$	0.22	0.25	0.38(0.25)	0.42(0.35)	0.28(0.34)	0.25(0.28)
	$\sigma^{2}$	0.09	0.08	0.13(0.12)	0.12(0.13)	0.19(0.15)	0.24(0.17)
	Time(sec)	—	—	0.14	0.71	0.30	0.06
$[-10,10]^{2}$	$\kappa$	0.21	0.24	0.27(0.09)	0.30(0.11)	0.13(0.06)	0.13(0.06)
	$\sigma^{2}$	0.09	0.08	0.10(0.04)	0.09(0.04)	0.34(0.17)	0.36(0.25)
	Time(sec)	—	—	0.47	2.68	0.91	0.16
$[-20,20]^{2}$	$\kappa$	0.21	0.24	0.23(0.05)	0.26(0.06)	0.09(0.03)	0.09(0.03)
	$\sigma^{2}$	0.09	0.08	0.10(0.02)	0.08(0.03)	0.41(0.19)	0.46(0.19)
	Time(sec)	—	—	1.76	11.66	13.21	1.38

Table H.1: The mean and the standard errors (in parentheses) of the estimated parameters when the reduced TCP model is fitted to the misspecified LGCP data. The best fitting parameters are calculated by minimizing $\mathcal{L}^{(R)}(\boldsymbol{\eta})$ in (H.4). When evaluating our estimator, we use two different prespecified domains: $D_{2\pi}$ and $D_{5\pi}$ . The time is the average computational time of each method per simulation over 500 independent replications (using parallel computing in R on a desktop computer with an i7-10700 Intel CPU).

Appendix I Spectral methods for nonstationary point processes

I.1 A new DFT for the intensity reweighted process

Recall the $n$ th-order intensity function $\lambda_{n}$ as in (2.1). In this section, we do not presuppose the (second-order) stationarity of the point process $X$ . Instead, we let $X$ be a simple second-order intensity reweighted stationary (SOIRS) point process on $\mathbb{R}^{d}$ (Baddeley et al. (2000)). That is, there exists $\ell_{2}:\mathbb{R}^{d}\rightarrow\mathbb{R}$ such that

\frac{\gamma_{2}(\boldsymbol{x}_{1},\boldsymbol{x}_{2})}{\lambda_{1}(\boldsymbol{x}_{1})\lambda_{1}(\boldsymbol{x}_{2})}=\ell_{2}(\boldsymbol{x}_{1}-\boldsymbol{x}_{2}),\quad\boldsymbol{x}_{1},\boldsymbol{x}_{2}\in\mathbb{R}^{d},

(I.1)

where $\lambda_{1}(\cdot)$ is the first-order intensity that does not need to be a constant. The second-order stationary point processes fit within this framework by setting $\ell_{2}=\lambda^{-2}\gamma_{2,\text{red}}$ , where $\lambda$ is the constant first-order intensity and $\gamma_{2,\text{red}}$ is the reduced second-order cumulant intensity function.

To leverage the Fourier methods developed for the stationary case, we consider a slight variant of the ordinary DFT defined in (2.8). The analogous large sample results for the ordinary DFT under the SOIRS framework are similar, with further details provided in Ding et al. (2024).

Definition I.1 (Intensity reweighted DFT).

Let $X$ be an SOIRS spatial point process on $D_{n}$ ( $n\in\mathbb{N}$ ) of form (2.6). Then, the intensity reweighted DFT (IR-DFT) with the data taper $h$ is defined as

\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X\cap D_{n}}\frac{h(\boldsymbol{x}/\boldsymbol{A})}{\lambda_{1}(\boldsymbol{x})}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(I.2)

Before investigating the theoretical properties of the IR-DFT, we draw a comparison between the IR-DFT and the ordinary DFT. Firstly, unlike the ordinary DFT, $\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})$ is contingent on the underlying unknown first-order intensity function. Secondly, under stationarity, $\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})$ and $\mathcal{J}_{h,n}(\boldsymbol{\omega})$ in (2.8) are related through $\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda)=\lambda^{-1}\mathcal{J}_{h,n}(\boldsymbol{\omega})$ , where $\lambda$ is the constant first-order intensity. Lastly, by using (2.1), we have $\mathbb{E}[\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})]=c_{h,n}(\boldsymbol{\omega})$ , where $c_{h,n}(\cdot)$ is the bias factor as defined in (2.10). Therefore, the expectation of the IR-DFT is a deterministic function that depends solely on the data taper $h$ and the domain $D_{n}$ .

By using the above bias expression, we now can define the theoretical centered IR-DFT and IR-periodogram respectively as

J_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})=\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})-c_{h,n}(\boldsymbol{\omega})\text{~{}~{}and~{}~{}}I_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})=|J_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda_{1})|^{2},\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(I.3)
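For illustration, the IR-DFT in (I.2) can be evaluated directly as in the Python sketch below. The taper, the intensity function, and the value of $H_{h,2}$ are supplied by the caller; the function name `ir_dft` and its argument layout are assumptions of this sketch, and in practice $\lambda_{1}$ is unknown and would have to be replaced by an estimate.

```python
import numpy as np


def ir_dft(points, omega, A, intensity, taper, H_h2):
    """Intensity reweighted DFT (I.2) at the frequencies in omega.

    points    : (m, d) array of observed locations in D_n
    omega     : (q, d) array of frequencies
    A         : length-d array of side lengths of D_n
    intensity : callable mapping an (m, d) array to the m values lambda_1(x)
    taper     : callable mapping an (m, d) array of x/A to the m values h(x/A)
    H_h2      : the constant H_{h,2} for the chosen taper
    """
    points = np.atleast_2d(np.asarray(points, dtype=float))
    omega = np.atleast_2d(np.asarray(omega, dtype=float))
    A = np.asarray(A, dtype=float)
    d = points.shape[1]
    C = (2.0 * np.pi) ** (-d / 2.0) * H_h2 ** (-0.5) * np.prod(A) ** (-0.5)
    weights = taper(points / A) / intensity(points)  # h(x/A) / lambda_1(x)
    phases = np.exp(-1j * (points @ omega.T))        # shape (m, q)
    return C * (weights @ phases)
```

With a constant intensity the output is the ordinary tapered DFT divided by $\lambda$, in line with the relation $\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda)=\lambda^{-1}\mathcal{J}_{h,n}(\boldsymbol{\omega})$ noted above.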

I.2 Asymptotic properties of the IR-DFT and IR-periodogram

In this section, we study asymptotic properties of the IR-DFT and IR-periodogram. To do so, we adopt a different asymptotic framework compared to the stationary case. This is because if we only rely on Assumption 3.1 as our asymptotic setup for general SOIRS processes, there is no gain in information about $\lambda_{1}(\boldsymbol{x})$ at a fixed location $\boldsymbol{x}\in\mathbb{R}^{d}$ as the domain $D_{n}$ increases. Therefore, in a similar spirit to Dahlhaus (1997), we consider an infill-type asymptotic framework for the first-order intensity function below. For a domain $W\subset\mathbb{R}^{d}$ , we use the notation $X_{W}$ to indicate that the observations of $X$ are confined within $W$ .

Assumption I.1.

Let $X_{D_{n}}$ ( $n\in\mathbb{N}$ ) be a sequence of SOIRS processes defined on the increasing domain $\{D_{n}\}$ of form (2.6). Let $\lambda_{1,n}(\cdot)$ and $\gamma_{2,n}(\cdot,\cdot)$ be the first- and second-order cumulant intensity functions of $X_{D_{n}}$ , respectively. Then, the following structural assumptions on $\lambda_{1,n}(\cdot)$ and $\gamma_{2,n}(\cdot,\cdot)$ hold:

(i)

For $n\in\mathbb{N}$ , $\lambda_{1,n}(\cdot)$ is a strictly positive function on $D_{n}$ and there exists a non-negative function $\lambda(\boldsymbol{x})$ , $\boldsymbol{x}\in\mathbb{R}^{d}$ , with compact support on $[-1/2,1/2]^{d}$ , such that

$\lambda_{1,n}(\boldsymbol{x})=\lambda(\boldsymbol{x}/\boldsymbol{A}),\quad n\in\mathbb{N},\quad\boldsymbol{x}\in D_{n}.$ (I.4)
(ii)

For $n\in\mathbb{N}$ and $\boldsymbol{x},\boldsymbol{y}\in D_{n}$ , $\gamma_{2,n}(\boldsymbol{x},\boldsymbol{y})/(\lambda_{1,n}(\boldsymbol{x})\lambda_{1,n}(\boldsymbol{y}))=\ell_{2}(\boldsymbol{x}-\boldsymbol{y})$ where $\ell_{2}:\mathbb{R}^{d}\rightarrow\mathbb{R}$ does not depend on $n$ .

Under Assumption I.1(i), the IR-DFT can be written as

\mathcal{J}_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda)=(2\pi)^{-d/2}H_{h,2}^{-1/2}|D_{n}|^{-1/2}\sum_{\boldsymbol{x}\in X_{D_{n}}}\frac{h(\boldsymbol{x}/\boldsymbol{A})}{\lambda(\boldsymbol{x}/\boldsymbol{A})}\exp(-i\boldsymbol{x}^{\top}\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(I.5)

Here, we use the notation $\lambda$ instead of $\lambda_{1}$ to emphasize the asymptotic framework as in Assumption I.1(i). The centered IR-DFT and IR-periodogram, denoted by $J_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda)$ and $I_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda)$ , respectively, can be defined similarly.

Theorem I.1 below addresses the asymptotic uncorrelatedness of the IR-DFTs.

Theorem I.1 (Asymptotic uncorrelatedness of the IR-DFT).

Let $X_{D_{n}}$ ( $n\in\mathbb{N}$ ) be a sequence of SOIRS point processes that satisfy Assumption I.1. Suppose that Assumptions 3.1, 3.2 (for $\ell=2$ ), and Assumption 3.4(i) hold. Furthermore, suppose that $\lambda(\cdot)$ from (I.4) is strictly positive and continuous on $[-1/2,1/2]^{d}$ . Let $\{\boldsymbol{\omega}_{1,n}\}$ and $\{\boldsymbol{\omega}_{2,n}\}$ be two asymptotically distant sequences on $\mathbb{R}^{d}$ . Then,

\lim_{n\rightarrow\infty}\mathrm{cov}(J_{h,n}^{(IR)}(\boldsymbol{\omega}_{1,n};\lambda),J_{h,n}^{(IR)}(\boldsymbol{\omega}_{2,n};\lambda))=0.

(I.6)

If we further assume $\lim_{n\rightarrow\infty}\boldsymbol{\omega}_{1,n}=\boldsymbol{\omega}\in\mathbb{R}^{d}$ , then

\lim_{n\rightarrow\infty}\mathrm{var}(J_{h,n}^{(IR)}(\boldsymbol{\omega}_{1,n};\lambda))=\lim_{n\rightarrow\infty}\mathbb{E}[I_{h,n}^{(IR)}(\boldsymbol{\omega};\lambda)]=(2\pi)^{-d}\frac{H_{h^{2}/\lambda,1}}{H_{h,2}}+\mathcal{F}^{-1}(\ell_{2})(\boldsymbol{\omega}).

(I.7)

Proof.

To prove the theorem, we start with an expression for the covariance of the IR-DFT. The proof of the lemma below is almost identical to that of Lemma D.1, so we omit the details.

Lemma I.1.

Let $X_{D_{n}}$ ( $n\in\mathbb{N}$ ) be a sequence of SOIRS spatial point processes that satisfy Assumption I.1 and let $h$ be a data taper such that $\sup_{\boldsymbol{x}\in\mathbb{R}^{d}}h(\boldsymbol{x})<\infty$ . Suppose that Assumption 3.2 for $\ell=2$ holds. Then,

		$\displaystyle\mathrm{cov}(J_{h,n}^{(IR)}(\boldsymbol{\omega}_{1};\lambda),J_{h,n}^{(IR)}(\boldsymbol{\omega}_{2};\lambda))=(2\pi)^{-d}H_{h,2}^{-1}\|D_{n}\|^{-1}\bigg{(}H_{h^{2}/\lambda,1}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})$
		$\displaystyle\quad+\int_{D_{n}^{2}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2})}\ell_{2}(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}\bigg{)},\quad\boldsymbol{\omega}_{1},\boldsymbol{\omega}_{2}\in\mathbb{R}^{d}.$		(I.8)

Now, by utilizing expression (I.8) above, proofs of (I.6) and (I.7) are almost identical to those in the proof of Ding et al. (2024), Theorem 4.1 (we omit the details). ∎

Under second-order stationarity, the expectation of the periodogram converges to the spectral density function. Bearing this in mind, along with the limiting behavior in (I.7), we define the intensity reweighted pseudo-spectral density function of SOIRS processes.

Definition I.2 (Intensity reweighted pseudo-spectral density function).

Let $X_{D_{n}}$ ( $n\in\mathbb{N}$ ) be a sequence of SOIRS spatial point processes that satisfy Assumption I.1. Suppose $\ell_{2}$ in (I.1) belongs to $L^{1}(\mathbb{R}^{d})$ . Then, the intensity reweighted pseudo-spectral density function (IR-PSD) of $X_{D_{n}}$ corresponding to the data taper $h$ is defined as

f_{h}^{(IR)}(\boldsymbol{\omega})=(2\pi)^{-d}\frac{H_{h^{2}/\lambda,1}}{H_{h,2}}+\mathcal{F}^{-1}(\ell_{2})(\boldsymbol{\omega}),\quad\boldsymbol{\omega}\in\mathbb{R}^{d}.

(I.9)

It follows from (I.7) that $f_{h}^{(IR)}$ is an even and non-negative function on $\mathbb{R}^{d}$ . However, unlike the classical spectral density function, the IR-PSD $f_{h}^{(IR)}$ depends on the specific data taper function $h$ . Under stationarity, $f_{h}^{(IR)}$ equals $\lambda^{-2}f$ , where $f$ is the spectral density.
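The identity under stationarity can be verified in one line. The check below assumes the notation $H_{g,1}=\int_{[-1/2,1/2]^{d}}g(\boldsymbol{x})d\boldsymbol{x}$ and $H_{h,2}=\int_{[-1/2,1/2]^{d}}h(\boldsymbol{x})^{2}d\boldsymbol{x}$ , and that the stationary spectral density has the form $f=(2\pi)^{-d}\lambda+\mathcal{F}^{-1}(\gamma_{2,\text{red}})$ ; both are taken from the main text and are not restated in this appendix. With a constant intensity $\lambda(\cdot)\equiv\lambda$ and $\ell_{2}=\lambda^{-2}\gamma_{2,\text{red}}$ :

```latex
H_{h^{2}/\lambda,1}=\lambda^{-1}H_{h,2}
\quad\Longrightarrow\quad
f_{h}^{(IR)}(\boldsymbol{\omega})
=(2\pi)^{-d}\lambda^{-1}+\lambda^{-2}\mathcal{F}^{-1}(\gamma_{2,\text{red}})(\boldsymbol{\omega})
=\lambda^{-2}\left\{(2\pi)^{-d}\lambda+\mathcal{F}^{-1}(\gamma_{2,\text{red}})(\boldsymbol{\omega})\right\}
=\lambda^{-2}f(\boldsymbol{\omega}).
```

In particular, the dependence on the taper $h$ disappears in this case because the ratio $H_{h^{2}/\lambda,1}/H_{h,2}$ reduces to $\lambda^{-1}$ .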

In the theorem below, we derive the asymptotic joint distribution of the theoretical IR-DFTs and IR-periodograms. The proof is almost identical to that of Theorem 3.2, so we omit the details.

Theorem I.2 (Asymptotic joint distribution of the IR-DFTs and IR-periodograms).

Let $X_{D_{n}}$ ( $n\in\mathbb{N}$ ) be a sequence of SOIRS spatial point processes that satisfy Assumption I.1. Suppose that Assumptions 3.1, 3.2 (for $\ell=4$ ), 3.3(i), and 3.4(i) hold. Furthermore, suppose that $\lambda(\cdot)$ from (I.4) is strictly positive and continuous on $[-1/2,1/2]^{d}$ . For a fixed $r\in\mathbb{N}$ , let $\{\boldsymbol{\omega}_{1,n}\}$ , …, $\{\boldsymbol{\omega}_{r,n}\}$ denote $r$ sequences on $\mathbb{R}^{d}$ that satisfy conditions (1) and (3) in the statement of Theorem 3.2. Then,

\left(\frac{J_{h,n}^{(IR)}(\boldsymbol{\omega}_{1,n};\lambda)}{(\frac{1}{2}f_{h}^{(IR)}(\boldsymbol{\omega}_{1}))^{1/2}},\dots,\frac{J_{h,n}^{(IR)}(\boldsymbol{\omega}_{r,n};\lambda)}{(\frac{1}{2}f_{h}^{(IR)}(\boldsymbol{\omega}_{r}))^{1/2}}\right)\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}(Z_{1},\dots,Z_{r}),\quad n\rightarrow\infty,

where $\{Z_{k}\}_{k=1}^{r}$ are independent standard normal random variables on $\mathbb{C}$ . By using the continuous mapping theorem, we conclude

\left(\frac{I_{h,n}^{(IR)}(\boldsymbol{\omega}_{1,n};\lambda)}{\frac{1}{2}f_{h}^{(IR)}(\boldsymbol{\omega}_{1})},\dots,\frac{I_{h,n}^{(IR)}(\boldsymbol{\omega}_{r,n};\lambda)}{\frac{1}{2}f_{h}^{(IR)}(\boldsymbol{\omega}_{r})}\right)\stackrel{{\scriptstyle\mathcal{D}}}{{\rightarrow}}(\chi^{2}_{1},\dots,\chi^{2}_{r}),\quad n\rightarrow\infty,

where $\{\chi^{2}_{k}\}_{k=1}^{r}$ are independent chi-squared random variables with degrees of freedom two.

		$\displaystyle\|D_{n}\|^{-1}\sup_{\boldsymbol{\omega}\in\mathbb{R}^{d}}\|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})\|$		(C.7)
		$\displaystyle~{}~{}\leq\|D_{n}\|^{-1}\sup_{\boldsymbol{\omega}}h(\boldsymbol{\omega})\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}\left\|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right\|d\boldsymbol{x}$
		$\displaystyle~{}~{}\leq C\sup_{\boldsymbol{x}\in D_{n}\cap(D_{n}-\boldsymbol{t})}\left\|g((\boldsymbol{x}+\boldsymbol{t})/\boldsymbol{A})-g(\boldsymbol{x}/\boldsymbol{A})\right\|.$

	$\displaystyle\|R_{h,g,1}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})\|$	$\displaystyle\leq$	$\displaystyle\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}\|h_{\boldsymbol{j}}\|\|g_{\boldsymbol{k}}\|\|\exp(2\pi i\boldsymbol{k}^{\top}(\boldsymbol{t}/\boldsymbol{A}))-1\|\left\|\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}\right\|$
		$\displaystyle\leq$	$\displaystyle C\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}\|h_{\boldsymbol{j}}\|\|g_{\boldsymbol{k}}\|\rho(\|\boldsymbol{k}\|\|\boldsymbol{t}/\boldsymbol{A}\|)^{1/2}\left\|\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}\right\|.$

	$\displaystyle\left\|\int_{D_{n}\cap(D_{n}-\boldsymbol{t})}e^{-i\boldsymbol{x}^{\top}(-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})}d\boldsymbol{x}\right\|$	$\displaystyle=$	$\displaystyle(2\pi)^{d/2}\|D_{n}\cap(D_{n}-\boldsymbol{t})\|^{1/2}\|c_{D_{n,0}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})\|$
		$\displaystyle\leq$	$\displaystyle C\|D_{n}\|^{1/2}\|c_{D_{n,0}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})\|.$

$\displaystyle\|R_{h,g,2}^{(n)}(\boldsymbol{t},\boldsymbol{\omega})\|$	$\displaystyle\leq$	$\displaystyle(2\pi)^{d/2}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}\|h_{\boldsymbol{j}}\|\|g_{\boldsymbol{k}}\|\sum_{i=1}^{m_{d}}\|D_{n,i}(\boldsymbol{t})\|^{1/2}\|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})\|$
	$\displaystyle\leq$	$\displaystyle(2\pi)^{d/2}\sum_{i=1}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}\|h_{\boldsymbol{j}}\|\|g_{\boldsymbol{k}}\|\|D_{n}\backslash(D_{n}-\boldsymbol{t})\|^{1/2}\|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})\|$
	$\displaystyle\leq$	$\displaystyle C\|D_{n}\|^{1/2}\sum_{i=1}^{m_{d}}\sum_{\boldsymbol{j},\boldsymbol{k}\in\mathbb{Z}^{d}}\|h_{\boldsymbol{j}}\|\|g_{\boldsymbol{k}}\|\rho(\|\boldsymbol{t}/\boldsymbol{A}\|)^{1/2}\|c_{D_{n,i}(\boldsymbol{t})}(\boldsymbol{\omega}-2\pi(\boldsymbol{j}+\boldsymbol{k})/\boldsymbol{A}+\boldsymbol{\omega})\|.$

	$\displaystyle\mathrm{cov}(J_{h,n}(\boldsymbol{\omega}_{1}),J_{h,n}(\boldsymbol{\omega}_{2}))$
	$\displaystyle~{}~{}=(2\pi)^{-d}H_{h,2}^{-1}\|D_{n}\|^{-1}\int_{\mathbb{R}^{2d}}h(\boldsymbol{x}/\boldsymbol{A})h(\boldsymbol{y}/\boldsymbol{A})e^{-i(\boldsymbol{x}^{\top}\boldsymbol{\omega}_{1}-\boldsymbol{y}^{\top}\boldsymbol{\omega}_{2})}C(\boldsymbol{x}-\boldsymbol{y})d\boldsymbol{x}d\boldsymbol{y}$
	$\displaystyle~{}~{}=(2\pi)^{-d}H_{h,2}^{-1}\|D_{n}\|^{-1}\int_{\mathbb{R}^{d}}d\boldsymbol{u}e^{-i\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}}C(\boldsymbol{u})\int_{\mathbb{R}^{d}}h((\boldsymbol{u}+\boldsymbol{v})/\boldsymbol{A})h(\boldsymbol{v}/\boldsymbol{A})e^{-i\boldsymbol{v}^{\top}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})}d\boldsymbol{v}$
	$\displaystyle~{}~{}=(2\pi)^{-d}H_{h,2}^{-1}\|D_{n}\|^{-1}\int_{\mathbb{R}^{d}}d\boldsymbol{u}e^{-i\boldsymbol{u}^{\top}\boldsymbol{\omega}_{1}}C(\boldsymbol{u})\left(H_{h,2}^{(n)}(\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})+R_{h,h}^{(n)}(\boldsymbol{u},\boldsymbol{\omega}_{1}-\boldsymbol{\omega}_{2})\right).$