
Residual spectrum: Brain functional connectivity detection beyond coherence

Yuichi Goto (yuichi.goto@math.kyushu-u.ac.jp), Faculty of Mathematics, Kyushu University; Xuze Zhang (xzhang51@umd.edu) and Benjamin Kedem (bnk@umd.edu), Department of Mathematics and Institute for Systems Research, University of Maryland; Shuo Chen (shuochen@umd.edu), Maryland Psychiatric Research Center, School of Medicine, University of Maryland
Abstract

Coherence is a widely used measure to assess linear relationships between time series. However, it fails to capture nonlinear dependencies. To overcome this limitation, this paper introduces the notion of the residual spectral density as a higher-order extension of the squared coherence. The method is based on an orthogonal decomposition of time series regression models. We propose a test for the existence of the residual spectrum and derive its fundamental properties. A numerical study illustrates the finite-sample performance of the proposed method. An application of the method shows that the residual spectrum can effectively detect brain connectivity. Our study reveals a noteworthy contrast in connectivity patterns between schizophrenia patients and healthy individuals. Specifically, we observed that non-linear connectivity in schizophrenia patients surpasses that in healthy individuals, which stands in stark contrast to the established understanding that linear connectivity tends to be higher in healthy individuals. This finding sheds new light on the intricate dynamics of brain connectivity in schizophrenia.

Keywords: Coherence, Spectral density, Time series, Frequency domain, fMRI

MSC classification: 62M15, 62M10, 62G10

1 Introduction

Functional connectivity within the human brain is important for the understanding of brain disorders such as depression, Alzheimer’s disease, autism, dyslexia, dementia, epilepsy, and other mental disorders. The complete map of human brain functional connectivity, or neural pathways, is referred to as the connectome. In describing the topology of the human brain, functional connectivity is commonly calculated by characterizing the co-variation of blood oxygen level-dependent (BOLD) signals, or time series, between two distinct neural populations or regions. To investigate the functional architecture of the brain, time series from resting-state functional magnetic resonance imaging (fMRI), which measures spontaneous low-frequency fluctuations in BOLD signals, are often used.

Various time and spectral domain methods have been used in the analysis of BOLD signals. Among them is the widespread measure of coherence (see, e.g., Müller et al. (2001) and Wang et al. (2015)). Coherence is a time series index used to determine whether two or more brain regions are spectrally connected, that is, whether they exhibit “similar neuronal oscillatory activity” (Bowyer, 2016). See Brockwell and Davis (1991, Section 11, p. 436) for the definition. However, coherence measures only the strength of linear relationships and fails to capture nonlinear relationships, in particular the spectral contributions of quadratic interaction terms. The reasons why nonlinearities should be incorporated into the analysis of brain connections are explained in Zhao et al. (2019, p. 1572) and Cao et al. (2022, Section 5.3).

This drawback is rectified by the so-called maximum residual coherence, which quantifies the contribution of specific interaction terms to the co-variation of brain regions, over and above coherence itself. Its origin can be traced back to the orthogonal decomposition of linear and quadratic functionals in the spectral domain proposed by Kimelfeld (1974). Based on this decomposition, Kedem-Kimelfeld (1975) developed a selection criterion, the lagged coherence, to determine the inclusion of lagged processes or interaction terms in a linear system. This criterion is a function of frequency, so that “we may run into a situation where one lag maximizes the coherence over a certain frequency band while another lag maximizes it over a different band” (Kedem-Kimelfeld, 1975). The maximum residual coherence was developed to resolve this issue and was shown to be effective for both continuous and binary time series (Khan et al., 2014; Kedem, 2016). That criterion is defined via a supremum over frequency, whereas Zhang and Kedem (2021) proposed an alternative criterion based on integration over frequency to choose interaction terms in a more general setting.

Although the method based on the maximum residual coherence is promising, no theoretical framework for its use in testing problems exists so far. Therefore, in this paper, we propose a new spectrum, referred to as the residual spectrum, to capture non-linear association, and fill this gap by proposing, as a special case, a test for the existence of interaction terms in co-variation. We demonstrate that the disparity in test results between healthy individuals and schizophrenia patients is larger for our residual spectrum than for coherence. Additionally, we found that the non-linear connectivity for schizophrenia patients is higher than that for healthy individuals, whereas it is known that the reverse is true for linear connectivity.

In addition to coherence, Mohanty et al. (2020) listed alternative measures that capture co-variation between brain signals in different ways, including wavelet coherence, mutual information, dynamic time warping, and more. Regarding other extensions of coherence, Ombao and Van Bellegem (2008) proposed the local band coherence to deal with non-stationary signals. Fiecas and Ombao (2011) studied brain functional connectivity using partial coherence estimated with a generalized shrinkage estimator, which combines nonparametric and parametric spectral density estimation. Lenart (2011) dealt with the magnitude of coherence for almost periodically correlated time series, and Baruník and Kley (2019) proposed the quantile coherence based on quantile spectra. Euan et al. (2019) proposed a cluster coherence to measure similarity between vector time series and applied it to electroencephalogram (EEG) signals. See also Matsuda (2006) for a test to construct graphical models in the frequency domain and Liu et al. (2021) for the concept of local Granger causality and its associated testing procedures.

However, our goal is to highlight a way to extend and enhance the measure of coherence in the analysis of BOLD signals by incorporating interaction terms. Cao et al., (2022) gave a comprehensive review of recent developments on statistical methods to analyze brain functional connectivity.

For hypothesis testing problems described by spectra, $L_{2}$- and supremum-type statistics have been studied by many authors. As for $L_{2}$-type statistics, Taniguchi et al. (1996) discussed a test based on an integrated function of spectral density matrices (see also Taniguchi and Kakizawa, 2000, Chapter 6), which can be applied to, e.g., tests for the magnitude of linear dependence and discriminant analysis (Kakizawa et al., 1998). A closely related problem was examined by Yajima and Matsuda (2009) under a Gaussian assumption. To include other hypotheses, Eichler (2008) considered testing problems described by the integrated squared (Euclidean) norm of a function of spectral density matrices. In the same spirit as Eichler (2008), tests for the equality of spectra were proposed using the bootstrap method (Dette and Paparoditis, 2009) and the randomization method (Jentsch and Pauly, 2015). On the other hand, supremum-type statistics were considered by Woodroofe and Van Ness (1967), Hannan (1970, Theorem 12, Chapter V.5), Rudzkis (1993), and Wu and Zaffaroni (2018). However, it is in general difficult to obtain the asymptotics of supremum-type statistics for functions of a spectral density matrix, whereas this is feasible for $L_{2}$-type statistics. Therefore, in this article, we propose an $L_{2}$-type statistic.

The paper is organized as follows. Section 2 introduces the residual spectrum, the main object of this paper. Section 3 delves into time series multiple regression models and assumptions, shows that there exists a unique orthogonal representation of the model, derives the residual spectrum, and provides an interpretation of it. In Section 4, we propose a test for the existence of the residual spectrum; the asymptotic null distribution of our test statistic and the consistency of the test are derived, and the consistency of the estimators of the unknown parameters in the model is also addressed. Section 5 provides the finite-sample performance of our method. Section 6 shows the utility of our method by applying the proposed test to brain data. All proofs of the lemma and theorems in the main article are presented in the Appendix.

2 Residual spectrum

In this section, we introduce a new spectrum, termed the residual spectrum of order $j$, for the vector stationary process $(X_{0}(t),X_{1}(t),\ldots,X_{K}(t))$, designed to capture both linear and non-linear contributions of $(X_{1}(t),\ldots,X_{K}(t))$ in explaining $X_{0}(t)$. For brevity of notation, denote the spectral density matrices of $(X_{0}(t),X_{1}(t),\ldots,X_{K}(t))$ and $(X_{1}(t),\ldots,X_{K}(t))$ by $\bm{f}(\lambda):=\left(f_{ij}(\lambda)\right)_{i,j=0,\ldots,K}$ and $\bm{f}_{K}(\lambda):=\left(f_{ij}(\lambda)\right)_{i,j=1,\ldots,K}$, respectively.

Under the regularity conditions derived in the subsequent section, the residual spectrum $f_{G_{j}G_{j}}$ of order $j$ is defined as follows: for $j=1$,

\displaystyle f_{G_{1}G_{1}}(\lambda):=\frac{\left|f_{10}(\lambda)\right|^{2}}{f_{11}(\lambda)}\quad\text{and }f_{G_{j}G_{j}}(\lambda)=\frac{\left|A_{jj}\left(e^{i\lambda}\right)\right|^{2}{\rm det}\left(\bm{f}_{j}(\lambda)\right)}{{\rm det}\left(\bm{f}_{j-1}(\lambda)\right)}\quad\text{for $j=2,\ldots,K$,}

where, for $j\in\{2,\ldots,K\}$,

\displaystyle A_{jj}\left(e^{i\lambda}\right):=\frac{-\sum_{i=1}^{j-1}{\rm det}\left(\overline{\bm{f}_{i,j}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{j-1}(\lambda)\right)f_{j0}(\lambda)}{{\rm det}\left(\bm{f}_{j}(\lambda)\right)}

and

{\bm{f}}_{i,j}^{\ddagger}(\lambda):=\begin{pmatrix}{f}_{11}(\lambda)&\cdots&{f}_{1(i-1)}(\lambda)&{f}_{1j}(\lambda)&{f}_{1(i+1)}(\lambda)&\cdots&{f}_{1(j-1)}(\lambda)\\ \vdots&&\vdots&\vdots&\vdots&&\vdots\\ {f}_{(j-1)1}(\lambda)&\cdots&{f}_{(j-1)(i-1)}(\lambda)&{f}_{(j-1)j}(\lambda)&{f}_{(j-1)(i+1)}(\lambda)&\cdots&{f}_{(j-1)(j-1)}(\lambda)\\ \end{pmatrix}.

The interpretation of our spectrum, as elucidated in the subsequent section, is as follows:

  1. $f_{G_{1}G_{1}}(\lambda)$ quantifies the strength of the linear association between $X_{0}(t)$ and $X_{1}(t)$.

  2. $f_{G_{2}G_{2}}(\lambda)$ measures the strength of the linear relationship between $X_{0}(t)$ and $X_{2}(t)$ after eliminating the linear effect of $X_{1}(t)$.

  3. $f_{G_{3}G_{3}}(\lambda)$ indicates the strength of the linear association between $X_{0}(t)$ and $X_{3}(t)$ after removing the linear effects of $X_{1}(t)$ and $X_{2}(t)$.

  4. Similarly, $f_{G_{j}G_{j}}(\lambda)$ represents the strength of the linear relationship between $X_{0}(t)$ and $X_{j}(t)$ after removing the linear effects of $X_{1}(t),\ldots,X_{j-1}(t)$.

While one might initially presume that $f_{G_{j}G_{j}}(\lambda)$ is incapable of capturing non-linear relationships, it does indeed possess this capability. We elaborate on how non-linear relationships enter through our residual spectrum. Let $X_{0}(t)$ and $X_{1}(t)$ be stationary processes and take $X_{2}(t)$ to be a non-linear function of $X_{1}(t)$, for instance $X_{2}(t)=X_{1}(t)X_{1}(t+u)$ for some predetermined time lag $u$. In this scenario, $f_{G_{2}G_{2}}(\lambda)$ accounts for the contribution of the non-linear term $X_{2}(t)$ that cannot be captured by the linear term $X_{1}(t)$. The quantity $f_{G_{1}G_{1}}(\lambda)+f_{G_{2}G_{2}}(\lambda)$ can be interpreted as the spectral density encompassing not solely the linear association of $X_{1}(t)$ but also the non-linear association of $X_{2}(t)$. This extension can be generalized to higher-order scenarios in a similar fashion. In the following section, we derive our residual spectrum and elucidate the rationale behind the interpretation above.
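To illustrate the idea, the following minimal Python sketch (our own illustration, not code from the paper; all variable names are hypothetical) builds the interaction regressor $X_{2}(t)=X_{1}(t)X_{1}(t-1)$ for a white-noise $X_{1}$ and a response driven purely by that quadratic term: the sample correlation between $X_{0}$ and the linear term $X_{1}$ is negligible, while the correlation with $X_{2}$ is close to one.

```python
import numpy as np

rng = np.random.default_rng(0)
n, u = 5000, 1  # sample size and the predetermined lag u

# Covariate process X1: Gaussian white noise (stationary, all moments finite)
x1 = rng.standard_normal(n)

# Interaction regressor X2(t) = X1(t) * X1(t - u), centered
x2 = np.empty(n)
x2[u:] = x1[u:] * x1[:-u]
x2[:u] = 0.0
x2 -= x2.mean()

# Response driven purely by the quadratic term plus small noise
eps = 0.1 * rng.standard_normal(n)
x0 = x2 + eps

corr_lin = np.corrcoef(x0, x1)[0, 1]   # linear association: near zero
corr_int = np.corrcoef(x0, x2)[0, 1]   # interaction association: near one
print(corr_lin, corr_int)
```

A purely linear measure such as coherence would see essentially nothing here, while a measure that includes the interaction regressor captures the dependence almost entirely.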

3 Orthogonal decomposition

We consider the model

\displaystyle X_{0}(t):=\zeta+\sum_{i=1}^{K}\sum_{k_{i}=-\infty}^{\infty}b_{i}(k_{i})X_{i}({t-k_{i}})+\epsilon(t), (1)

where $X_{0}(t)$ is a response variable, $\zeta$ is an intercept, $\bm{X}(t):=(X_{1}(t),\ldots,X_{K}(t))^{\top}$ is a covariate process with autocovariance matrix $\Gamma_{\bm{X}}(u):={\rm E}\,\bm{X}(t)\bm{X}(t-u)^{\top}$ such that, for any $i=1,\ldots,K$, $X_{i}(t):=X_{i}^{\prime}(t)-{\rm E}X_{i}^{\prime}(t)$, where $(X_{1}^{\prime}(t),\ldots,X_{K}^{\prime}(t))^{\top}$ is an $s$-th order stationary process for any $s\in\mathbb{N}$; $\epsilon(t)$ is an i.i.d. centered disturbance process, independent of $\{\bm{X}(t);t\in\mathbb{Z}\}$, with moments of all orders; and $\sum_{k=-\infty}^{\infty}k^{2}|b_{i}(k)|<\infty$ for all $i=1,\ldots,K$. The continuous-time version of this model was considered by Jenkins and Watts (1968, Section 11.4.1, p. 485).

For any random vector $\bm{W}(t):=(W_{1}(t),\ldots,W_{\ell}(t))^{\top}$ and $t_{1},\ldots,t_{\ell}\in\mathbb{Z}$, the cumulant of order $\ell$ of $(W_{1}(t_{1}),\ldots,W_{\ell}(t_{\ell}))$ is defined as

\displaystyle{\rm cum}(W_{1}(t_{1}),\ldots,W_{\ell}(t_{\ell})):=\sum_{(\nu_{1},\ldots,\nu_{p})}(-1)^{p-1}(p-1)!\left({\rm E}\prod_{j\in\nu_{1}}W_{j}(t_{j})\right)\cdots\left({\rm E}\prod_{j\in\nu_{p}}W_{j}(t_{j})\right),

where the summation $\sum_{(\nu_{1},\ldots,\nu_{p})}$ extends over all partitions $(\nu_{1},\ldots,\nu_{p})$ of $\{1,2,\ldots,\ell\}$ (see Brillinger, 1981, p. 19). In this paper, we make the following assumption.
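To make the partition formula concrete, here is a small Python sketch (our own illustration, not code from the paper) that computes sample cumulants directly from the definition, summing over all set partitions; for order two it reduces to the covariance ${\rm E}W_{1}W_{2}-{\rm E}W_{1}\,{\rm E}W_{2}$.

```python
import numpy as np
from math import factorial

def partitions(elements):
    """Generate all set partitions of a list of indices."""
    if len(elements) == 1:
        yield [elements]
        return
    first, rest = elements[0], elements[1:]
    for part in partitions(rest):
        # put `first` into each existing block, or into a new block of its own
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def sample_cumulant(*series):
    """Sample cumulant cum(W_1, ..., W_l) via the partition formula."""
    ell = len(series)
    total = 0.0
    for part in partitions(list(range(ell))):
        p = len(part)
        prod = 1.0
        for block in part:
            # E[prod of the W_j in this block], estimated by a sample mean
            prod *= np.mean(np.prod([series[j] for j in block], axis=0))
        total += (-1) ** (p - 1) * factorial(p - 1) * prod
    return total

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 3.0])
c2 = sample_cumulant(x, y)  # = mean(xy) - mean(x)mean(y) = 0.75
print(c2)
```

Note that the number of partitions grows as the Bell numbers, so this direct evaluation is only practical for small orders.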

Assumption 3.1.

For any integer $\ell\geq 2$ and $(i_{1},\ldots,i_{\ell})\in\{1,\ldots,K\}^{\ell}$, it holds that

\sum_{s_{2},\ldots,s_{\ell}=-\infty}^{\infty}\left(1+\sum_{j=2}^{\ell}\left\lvert s_{j}\right\rvert^{2}\right)\left\lvert\kappa_{i_{1},\ldots,i_{\ell}}(s_{2},\ldots,s_{\ell})\right\rvert<\infty, (2)

where $\kappa_{i_{1},\ldots,i_{\ell}}(s_{2},\ldots,s_{\ell})={\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(s_{2}),\ldots,X_{i_{\ell}}(s_{\ell})\}$.

In conjunction with $\sum_{k=-\infty}^{\infty}k^{2}|b_{i}(k)|<\infty$ for all $i=1,\ldots,K$ and the existence of moments of all orders for $\epsilon(t)$, Assumption 3.1 implies the cumulant summability (2) for any integer $\ell\geq 2$ and $(i_{1},\ldots,i_{\ell})\in\{0,1,\ldots,K\}^{\ell}$. This summability condition is standard in the literature; see, e.g., Brillinger (1981, Assumption 2.6.2), Taniguchi et al. (1996, Assumption 2), Eichler (2008, Assumption 3.1), Shao (2010, Section 4), Jentsch and Pauly (2015, Assumption 2.1), and Aue and Van Delft (2020, Section 4). Sufficient conditions for cumulant summability have also been discussed. The summability of the fourth-order cumulant under an $\alpha$-mixing condition was shown by Andrews (1991, Lemma 1). For univariate $\alpha$-mixing processes, Neumann (1996, Remark 3.1), Lee and Subba Rao (2017, Remark 3.1), and Bücher et al. (2020, Lemma 2) showed that the summability condition holds for higher-order cases. The essential tool in these proofs is Theorem 3 of Statulevicius and Jakimavicius (1988), which provides an upper bound on higher-order cumulants in terms of $\alpha$-mixing coefficients and moments for univariate processes; the theorem is therefore not applicable to multivariate processes. Besides, Panaretos and Tavakoli (2013a, Proposition 4.1) showed that MA($\infty$) processes meet the summability condition. On the other hand, Kley (2014, Lemmas 4.1 and 4.2) gave an elegant sufficient condition for the summability of cumulants of indicator functions of the processes (see also Kley et al., 2016, Proposition 3.1). Employing this idea, we obtain the following lemma.

Lemma 3.1.

For a geometrically $\alpha$-mixing strictly stationary vector process $\bm{X}(t):=(X_{1}(t),\ldots,X_{K}(t))^{\top}$ with moments of all orders in the sense that

\sup_{s_{2},s_{3},\ldots,s_{\ell}\in\mathbb{Z}}\left\lvert{\rm E}X_{i_{1}}(0)X_{i_{2}}(s_{2})\cdots X_{i_{\ell}}(s_{\ell})\right\rvert<\infty

for any $\ell\in\mathbb{N}$ and any $(i_{1},\ldots,i_{\ell})\in\{1,\ldots,K\}^{\ell}$, and with $\alpha$-mixing coefficient $\alpha(\cdot)$ such that $\alpha(n)\leq C_{\alpha}\rho^{n}$, where

\displaystyle\alpha(n):=\sup_{k\in\mathbb{Z},A\in\mathcal{F}_{-\infty}^{k},\ B\in\mathcal{F}_{k+n}^{\infty}}|{\rm P}(AB)-{\rm P}(A){\rm P}(B)|,

$\mathcal{F}_{a}^{b}$ (for $a\leq b$) is the $\sigma$-field generated by $\{\bm{X}(t):a\leq t\leq b\}$, and $C_{\alpha}\in(1,\infty)$ and $\rho\in(0,1)$ are constants, it holds that, for any $d\in\mathbb{N}$, any $\ell\in\mathbb{N}$, and any $(i_{1},\ldots,i_{\ell})\in\{1,\ldots,K\}^{\ell}$,

\displaystyle\sum_{s_{2},\ldots,s_{\ell}=-\infty}^{\infty}\left(1+\sum_{j=2}^{\ell}\left\lvert s_{j}\right\rvert^{d}\right)\left\lvert{\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(s_{2}),\ldots,X_{i_{\ell}}(s_{\ell})\}\right\rvert<\infty. (3)

Therefore, any geometrically $\alpha$-mixing strictly stationary vector process satisfying the above moment condition fulfills Assumption 3.1. We present two examples.

Example 3.1 (vector ARMA process).

The first example satisfying Assumption 3.1 is the causal vector ARMA($p$,$q$) process $\bm{X}(t):=(X_{1}(t),\ldots,X_{K}(t))^{\top}$, defined as follows: for $p,q\in\mathbb{N}$,

\displaystyle\bm{\Phi}(B)\bm{X}(t)=\bm{\Theta}(B)\bm{\xi}(t), (4)

where $\bm{\Phi}(z):=\bm{I}_{K}-\sum_{k=1}^{p}\bm{\Phi}_{k}z^{k}$, $\bm{\Theta}(z):=\bm{I}_{K}+\sum_{k=1}^{q}\bm{\Theta}_{k}z^{k}$, $\bm{\Phi}_{k}$ and $\bm{\Theta}_{k}$ are $K\times K$ matrices satisfying ${\rm det}(\bm{\Phi}(z))\neq 0$ for all $|z|\leq 1$, $B$ is the backward shift operator, and $\bm{\xi}(t):=(\xi_{1}(t),\ldots,\xi_{K}(t))^{\top}$ is a sequence of i.i.d. random vectors with moments of all orders. Then $\bm{X}(t)$ satisfies Assumption 3.1, since the ARMA process is geometrically $\alpha$-mixing (Mokkadem, 1988) and the innovation process satisfies the moment condition stated in Lemma 3.1.
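As a concrete illustration, the following Python sketch (with a hypothetical coefficient matrix of our own choosing, not from the paper) simulates a bivariate causal VAR(1) special case of (4); for a VAR(1), the condition ${\rm det}(\bm{\Phi}(z))\neq 0$ for $|z|\leq 1$ is equivalent to the coefficient matrix having spectral radius less than one.

```python
import numpy as np

rng = np.random.default_rng(1)
Phi1 = np.array([[0.5, 0.1],
                 [0.2, 0.3]])  # assumed AR(1) coefficient matrix

# Causality check: det(I - Phi1 z) != 0 for |z| <= 1  <=>  spectral radius < 1
spectral_radius = max(abs(np.linalg.eigvals(Phi1)))
assert spectral_radius < 1

# Simulate X(t) = Phi1 X(t-1) + xi(t) with Gaussian innovations
n, burn = 2000, 200
X = np.zeros((n + burn, 2))
for t in range(1, n + burn):
    X[t] = Phi1 @ X[t - 1] + rng.standard_normal(2)
X = X[burn:]  # discard burn-in so the retained sample is close to stationary
print(X.shape, spectral_radius)
```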

Example 3.2 (Volterra series).

The second example is the non-linear time series defined, for a geometrically $\alpha$-mixing strictly stationary $K^{\prime}$-dimensional process $\bm{\upsilon}(t):=(\upsilon_{1}(t),\ldots,\upsilon_{K^{\prime}}(t))^{\top}$ satisfying the moment condition stated in Lemma 3.1 and for $i_{j,d,s}\in\{1,\ldots,K^{\prime}\}$ ($j\in\{1,\ldots,K\}$, $d\in\{1,\ldots,D\}$, $s\in\{1,\ldots,d\}$), as

$\bm{X}(t):=(X_{1}(t),\ldots,X_{K}(t))^{\top}$

and

\displaystyle X_{j}(t):=\sum_{k_{1}=-J}^{J}c_{j,1}(k_{1})\upsilon_{i_{j,1,1}}(t-k_{1})+\sum_{k_{1},k_{2}=-J}^{J}c_{j,2}(k_{1},k_{2})\upsilon_{i_{j,2,1}}(t-k_{1})\upsilon_{i_{j,2,2}}(t-k_{2})
+\sum_{k_{1},k_{2},k_{3}=-J}^{J}c_{j,3}(k_{1},k_{2},k_{3})\upsilon_{i_{j,3,1}}(t-k_{1})\upsilon_{i_{j,3,2}}(t-k_{2})\upsilon_{i_{j,3,3}}(t-k_{3})+\cdots
+\sum_{k_{1},\ldots,k_{j}=-J}^{J}c_{j,j}(k_{1},\ldots,k_{j})\upsilon_{i_{j,j,1}}(t-k_{1})\cdots\upsilon_{i_{j,j,j}}(t-k_{j})
=\sum_{d=1}^{j}\sum_{k_{1},\ldots,k_{d}=-J}^{J}c_{j,d}(k_{1},\ldots,k_{d})\prod_{s=1}^{d}\upsilon_{i_{j,d,s}}(t-k_{s}),

where $J$ is a positive integer and $c_{j,d}(k_{1},\ldots,k_{d})\in\mathbb{R}$. This process is called the Volterra series (Brillinger, 1981, Equation 2.9.14). Assumption 3.1 holds for this process $\bm{X}(t)$ since the mixing property of $\bm{X}(t)$ is inherited from $\bm{\upsilon}(t)$ (see Francq and Zakoian, 2010, p. 349), and therefore the conditions of Lemma 3.1 are fulfilled.
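A minimal second-order instance of such a Volterra series can be generated as follows (a Python sketch with hypothetical kernels $c_{1},c_{2}$ of our own choosing, not from the paper), driving a quadratic series by i.i.d. Gaussian noise, which is trivially geometrically $\alpha$-mixing.

```python
import numpy as np

rng = np.random.default_rng(2)
n, J = 1000, 2
pad = n + 2 * J
v = rng.standard_normal(pad)  # underlying mixing process upsilon(t)

# Hypothetical Volterra kernels: linear c1(k) and quadratic c2(k1, k2)
ks = np.arange(-J, J + 1)
c1 = 0.5 ** np.abs(ks)
c2 = np.outer(0.3 ** np.abs(ks), 0.3 ** np.abs(ks))

x = np.zeros(n)
for t in range(n):
    s = t + J  # offset so that s - k is always a valid index into v
    lin = sum(c1[a] * v[s - k] for a, k in enumerate(ks))
    quad = sum(c2[a, b] * v[s - k1] * v[s - k2]
               for a, k1 in enumerate(ks)
               for b, k2 in enumerate(ks))
    x[t] = lin + quad
print(x[:3])
```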

Under Assumption 3.1, the spectral density matrix $\bm{f}(\lambda):=\left(f_{ij}(\lambda)\right)_{i,j=0,\ldots,K}$ of the process $(X_{0}(t),X_{1}(t),\ldots,X_{K}(t))$ exists and is twice continuously differentiable. For $i\in\{0,\ldots,K\}$, $f_{ii}(\lambda)$ is the auto-spectrum of the process $X_{i}(t)$ and, for $i_{1},i_{2}\in\{0,\ldots,K\}$ with $i_{1}\neq i_{2}$, $f_{i_{1}i_{2}}(\lambda)$ is the cross-spectrum of the processes $X_{i_{1}}(t)$ and $X_{i_{2}}(t)$.

The next theorem shows that model (1) has the following orthogonal representation.

Theorem 3.1.

Suppose that $\bm{f}_{K}(\lambda):=\left(f_{ij}(\lambda)\right)_{i,j=1,\ldots,K}$ is non-singular for all $\lambda\in[-\pi,\pi]$ and that ${\rm det}\left(\bm{f}_{j-1}(\lambda)\right)f_{j0}(\lambda)\neq\sum_{i=1}^{j-1}{\rm det}\left(\overline{\bm{f}_{i,j}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)$ for $j\in\{2,\ldots,K-1\}$ and all $\lambda\in[-\pi,\pi]$, where

{\bm{f}}_{i,j}^{\ddagger}(\lambda):=\left({\bm{f}_{j-1,1}^{\flat}(\lambda)},\ldots,{\bm{f}_{j-1,i-1}^{\flat}(\lambda)},{\bm{f}_{j-1,j}^{\flat}(\lambda)},{\bm{f}_{j-1,i+1}^{\flat}(\lambda)},\ldots,{\bm{f}_{j-1,j-1}^{\flat}(\lambda)}\right)

with $\bm{f}_{a,b}^{\flat}(\lambda):=({f}_{1b}(\lambda),\ldots,{f}_{ab}(\lambda))^{\top}$. Then there uniquely exist processes $\{G_{j}(t)\}$, $j=1,\ldots,K$, of the form

\displaystyle G_{j}(t):=\sum_{d=1}^{j}\sum_{k_{d}=-\infty}^{\infty}a_{jd}(k_{d})X_{d}(t-k_{d}) (5)

such that $X_{0}(t)=\zeta+\sum_{j=1}^{K}G_{j}(t)+\epsilon(t)$; $G_{i}$ and $G_{j}$ are orthogonal for any $i,j\in\{1,\ldots,K\}$ with $j\neq i$, i.e., ${\rm E}G_{i}(t)G_{j}(t^{\prime})=0$ for any $t,t^{\prime}\in\mathbb{Z}$; $\sum_{k=-\infty}^{\infty}|a_{jd}(k)|<\infty$ for any $j\in\{1,\ldots,K\}$ and $d\in\{1,\ldots,j\}$; and the transfer function $A_{jj}\left(e^{-\mathrm{i}\lambda}\right):=\sum_{k=-\infty}^{\infty}a_{jj}(k)e^{-\mathrm{i}k\lambda}$ satisfies $A_{jj}\left(e^{-\mathrm{i}\lambda}\right)\neq 0$ for all $\lambda\in[-\pi,\pi]$ and $j\in\{1,\ldots,K-1\}$. The unique expression of $G_{j}$ is given by (5), where $a_{jd}$ is determined through the transfer functions $A_{jd}(e^{-\mathrm{i}\lambda})$ as

\displaystyle A_{11}\left(e^{i\lambda}\right):=\frac{f_{10}(\lambda)}{f_{11}(\lambda)}\quad\text{for $j=d=1$}, (6)

for $j=d\in\{2,\ldots,K\}$,

\displaystyle A_{jj}\left(e^{i\lambda}\right):=\frac{-\sum_{i=1}^{j-1}{\rm det}\left(\overline{\bm{f}_{i,j}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{j-1}(\lambda)\right)f_{j0}(\lambda)}{{\rm det}\left(\bm{f}_{j}(\lambda)\right)}, (7)

and, for $j\in\{2,\ldots,K\}$ and $d\in\{1,\ldots,j-1\}$,

\displaystyle A_{jd}\left(e^{i\lambda}\right):=-\frac{{\rm det}\left(\bm{f}_{d,j}^{\ddagger}(\lambda)\right)}{{\rm det}\left(\bm{f}_{j-1}(\lambda)\right)}A_{jj}\left(e^{i\lambda}\right). (8)

From Theorem 3.1, the spectral density of $G_{j}$ is given by

\displaystyle f_{G_{1}G_{1}}(\lambda)=\frac{\left|f_{10}(\lambda)\right|^{2}}{f_{11}(\lambda)}\quad\text{and }f_{G_{j}G_{j}}(\lambda)=\frac{\left|A_{jj}\left(e^{i\lambda}\right)\right|^{2}{\rm det}\left(\bm{f}_{j}(\lambda)\right)}{{\rm det}\left(\bm{f}_{j-1}(\lambda)\right)}\quad\text{for $j=2,\ldots,K$.}

By construction, $G_{j}$ can be interpreted as a process obtained after eliminating the effects of the processes $G_{1},\ldots,G_{j-1}$. Thus, we refer to $f_{G_{j}G_{j}}$ as the residual spectral density of order $j$. These spectra are related to several important indices.

When $K=1$, the squared coherence $\mathcal{C}_{1}(\lambda)$ is defined, for $\lambda\in[-\pi,\pi]$, as

\displaystyle\mathcal{C}_{1}(\lambda):=\frac{f_{G_{1}G_{1}}(\lambda)}{f_{00}(\lambda)},

which is the time series extension of the squared correlation and measures the linear dependence between $\{X_{0}(t)\}$ and $\{X_{1}(t)\}$ (see, e.g., Brockwell and Davis, 1991, p. 436).

In the case of $K=2$, $G_{2}$ has the spectral density of the form

\displaystyle f_{G_{2}G_{2}}(\lambda)=\frac{\left|f_{11}(\lambda)f_{02}(\lambda)-f_{12}(\lambda)f_{01}(\lambda)\right|^{2}}{f_{11}(\lambda)\left({f_{11}}(\lambda){f_{22}}(\lambda)-\left|f_{12}(\lambda)\right|^{2}\right)}.

In particular, when $X_{2}(t)$ is defined as the lagged process $X_{2}(t):=X_{1}(t)X_{1}(t-u)$, the lagged spectral coherence $\mathcal{C}_{2}(\lambda,u)$ is defined, for $\lambda\in[-\pi,\pi]$, as

\displaystyle\mathcal{C}_{2}(\lambda,u):=\mathcal{C}_{1}(\lambda)+\frac{f_{G_{2}G_{2}}(\lambda,u)}{f_{00}(\lambda)}.

The lagged coherence $\mathcal{C}_{2}(\lambda,u)$ measures the nonlinear (quadratic) dependence between $\{X_{0}(t)\}$ and $\{X_{1}(t)\}$; see Kedem-Kimelfeld (1975) for details. In this case, we also refer to $f_{G_{2}G_{2}}$ as the lagged spectral density.

As a higher-order extension of the squared coherence, it is natural to define the squared coherence of order $d$ as

\displaystyle\mathcal{C}_{d}(\lambda):=\mathcal{C}_{d-1}(\lambda)+\frac{f_{G_{d}G_{d}}(\lambda)}{f_{00}(\lambda)}=\frac{\sum_{i=1}^{d}f_{G_{i}G_{i}}(\lambda)}{f_{00}(\lambda)}.

Since $f_{00}(\lambda)=\sum_{i=1}^{K}f_{G_{i}G_{i}}(\lambda)+{{\rm Var}(\epsilon(t))}/{(2\pi)}$, we have $\mathcal{C}_{d}(\lambda)\in(0,1)$ for any $\lambda\in[-\pi,\pi]$.
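The bound $\mathcal{C}_{d}(\lambda)\in(0,1)$ parallels the familiar fact that the ordinary magnitude-squared coherence lies in $[0,1]$, which can be checked numerically with, e.g., scipy's Welch-based estimator (a standard off-the-shelf estimator used here only as an illustration; it is not the lag-window estimator of Section 4):

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(3)
n = 4096
x1 = rng.standard_normal(n)
# Response: short linear filter of x1 plus independent noise
x0 = np.convolve(x1, [1.0, 0.5], mode="same") + 0.5 * rng.standard_normal(n)

# Magnitude-squared coherence estimate; always lies in [0, 1]
freqs, c_xy = coherence(x0, x1, nperseg=256)

# Coherence of a series with itself is identically 1
_, c_self = coherence(x1, x1, nperseg=256)
print(c_xy.max(), c_self.min())
```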

The residual spectral density is closely related to the partial cross-spectrum, introduced by Dahlhaus (2000), which is the cross-spectrum after removing the linear effect of a given process (see also Fiecas and Ombao, 2011, Section 2, and Eichler, 2008, p. 969). The partial cross-spectrum of $X_{0}$ and $X_{2}$ given $X_{1}$ is defined by

\displaystyle f_{02|1}(\lambda):=f_{02}(\lambda)-f_{01}(\lambda)f_{11}^{-1}(\lambda)f_{12}(\lambda),

and the relationship between $f_{G_{2}G_{2}}$ and $f_{02|1}$ is

\displaystyle f_{G_{2}G_{2}}(\lambda)=\frac{f_{11}(\lambda)|f_{02|1}(\lambda)|^{2}}{{f_{11}}(\lambda){f_{22}}(\lambda)-\left|f_{12}(\lambda)\right|^{2}}.

Thus, we can interpret $f_{G_{2}G_{2}}$ as a rescaled squared partial cross-spectrum of $X_{0}$ and $X_{2}$ given $X_{1}$.
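The equivalence of this partial-cross-spectrum expression with the earlier determinant formula for $f_{G_{2}G_{2}}$ in the case $K=2$ is a pointwise algebraic identity; the following sketch (our own check, with a randomly generated matrix standing in for $\bm{f}(\lambda)$ at one frequency) verifies it for an arbitrary Hermitian positive-definite spectral matrix.

```python
import numpy as np

rng = np.random.default_rng(4)
# Random Hermitian positive-definite 3x3 matrix playing the role of f(lambda),
# indexed by the processes (X0, X1, X2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
F = A @ A.conj().T

f00, f01, f02 = F[0, 0], F[0, 1], F[0, 2]
f11, f12, f22 = F[1, 1], F[1, 2], F[2, 2]

# Direct determinant formula for the residual spectrum of order 2 (K = 2)
direct = (abs(f11 * f02 - f12 * f01) ** 2
          / (f11 * (f11 * f22 - abs(f12) ** 2))).real

# Via the partial cross-spectrum f_{02|1} = f02 - f01 f11^{-1} f12
f02_1 = f02 - f01 * f12 / f11
via_partial = (f11 * abs(f02_1) ** 2
               / (f11 * f22 - abs(f12) ** 2)).real

print(direct, via_partial)
```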

Remark 3.1.

The coherence and the lagged (spectral) coherence can be derived from the models

X_{0}(t):=\sum_{k=-\infty}^{\infty}b(k)X_{1}(t-k)+\epsilon(t)

and

\displaystyle X_{0}(t):=\sum_{k_{1}=-\infty}^{\infty}b_{1}(k_{1})X_{1}(t-k_{1})+\sum_{k_{2}=-\infty}^{\infty}b_{2}(k_{2})\left(X_{1}(t-k_{2})X_{1}(t-k_{2}-u)-{\rm E}X_{1}(t-k_{2})X_{1}(t-k_{2}-u)\right)+\epsilon(t).

For more details, see Brockwell and Davis (1991, Example 11.6.4) and Kedem-Kimelfeld (1975). In this sense, our residual spectrum serves as a natural extension of these spectra.

4 Test for the existence of residual spectral density

We are interested in the existence of $X_{K}$ in model (1), that is, whether $b_{K}(k)=a_{KK}(k)=0$ for all $k\in\mathbb{Z}$ in the presence of $X_{1},\ldots,X_{K-1}$. To this end, we consider the following hypothesis testing problem: the null hypothesis is

\displaystyle H_{0}:f_{G_{K}G_{K}}(\lambda)=0\quad\text{$\lambda$-a.e. on $[-\pi,\pi]$} (9)

and the alternative hypothesis is $K_{0}$: $H_{0}$ does not hold. The null hypothesis $H_{0}$ holds true if and only if $a_{KK}(k)=0$ for all $k\in\mathbb{Z}$, since $a_{KK}(k):=\int_{-\pi}^{\pi}A_{KK}\left(e^{-\mathrm{i}\lambda}\right)e^{\mathrm{i}k\lambda}\,{\rm d}\lambda/(2\pi)$ and the proof of Theorem 3.1 shows that $A_{KK}\left(e^{-\mathrm{i}\lambda}\right)=0$ if and only if $f_{G_{K}G_{K}}(\lambda)=0$.

Suppose the observed stretch $\{X_{j}(t);t=1,\ldots,n;j=0,\ldots,K\}$ is available. To estimate $\bm{f}(\lambda)$, we define the kernel spectral density estimator $\hat{\bm{f}}(\lambda):=\left(\hat{f}_{ij}(\lambda)\right)_{i,j=0,1,\ldots,K}$ with

\displaystyle\hat{f}_{ij}(\lambda):=\frac{1}{2\pi}\sum_{h=1+u-n}^{n-1-u}\omega\left(\frac{h}{M_{n}}\right)\hat{{\gamma}}_{ij}(h)e^{-\mathrm{i}h\lambda},\quad\lambda\in[-\pi,\pi],

where $M_{n}$ is a bandwidth parameter satisfying Assumption 4.1 (A1), $\omega(\cdot)$ is the lag window function defined as $\omega(x):=\int_{-\infty}^{\infty}W(t)e^{\mathrm{i}xt}\,{\rm d}t$, $W(\cdot)$ is a kernel function satisfying Assumption 4.1 (A2), and, for $h\in\{0,\ldots,n-1-u\}$,

\hat{{\gamma}}_{ij}(h):=\frac{1}{n-u-h}\sum_{t=1}^{n-u-h}(X_{i}{(t+h)}-X_{i}(.))({X}_{j}({t})-{{X}_{j}({.})}),

and, for $h\in\{-n+1+u,\ldots,-1\}$,

\hat{{\gamma}}_{ij}(h):=\frac{1}{n-u+{h}}\sum_{t=-h+1}^{n-u}(X_{i}{(t+h)}-X_{i}(.))({X}_{j}({t})-{{X}_{j}({.})}),

with $X_{i}(.)=\sum_{t=1}^{n-u}X_{i}(t)/(n-u)$ for $i=0,1,\ldots,K$.
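For concreteness, this lag-window estimator can be sketched in Python as follows (a simplified illustration with $u=0$ and a Parzen lag window as one particular choice of $\omega$; not the authors' implementation). Since $\omega(0)=1$, integrating the estimated auto-spectrum over $[-\pi,\pi]$ recovers $\hat{\gamma}_{ii}(0)$, which serves as a sanity check.

```python
import numpy as np

def parzen(x):
    """Parzen lag window omega(x), supported on |x| <= 1."""
    x = np.abs(x)
    w = np.zeros_like(x)
    m1 = x <= 0.5
    m2 = (x > 0.5) & (x <= 1.0)
    w[m1] = 1 - 6 * x[m1] ** 2 + 6 * x[m1] ** 3
    w[m2] = 2 * (1 - x[m2]) ** 3
    return w

def lag_window_spectrum(xi, xj, M, lams):
    """Kernel (lag-window) cross-spectral estimator at frequencies lams."""
    n = len(xi)
    xi = xi - xi.mean()
    xj = xj - xj.mean()
    hs = np.arange(-(n - 1), n)
    # Sample cross-covariances gamma_hat(h) with divisor n - |h|
    gammas = np.array([
        np.mean(xi[h:] * xj[:n - h]) if h >= 0 else np.mean(xi[:n + h] * xj[-h:])
        for h in hs
    ])
    w = parzen(hs / M)
    # f_hat(lam) = (1 / 2pi) sum_h omega(h / M) gamma_hat(h) e^{-i h lam}
    return np.array([
        np.sum(w * gammas * np.exp(-1j * hs * lam)) / (2 * np.pi)
        for lam in lams
    ])

rng = np.random.default_rng(5)
n = 400
x = rng.standard_normal(n)
N = 2 * n
lams = -np.pi + 2 * np.pi * np.arange(N) / N
f_hat = lag_window_spectrum(x, x, M=4 * n ** 0.25, lams=lams)

# Riemann sum of the auto-spectrum recovers gamma_hat(0), the sample variance
integral = np.sum(f_hat.real) * 2 * np.pi / N
print(integral, x.var())
```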

Assumption 4.1.
  (A1) The bandwidth parameter satisfies $n/M_{n}^{9/2}\to 0$ and $n/M_{n}^{3}\to\infty$ as $n\to\infty$.

  (A2) The kernel function $W(\cdot)$ is a bounded, nonnegative, even, real, Lipschitz continuous function such that $\int_{-\infty}^{\infty}W(t)\,{\rm d}t=1$, $\int_{-\infty}^{\infty}t^{2}W(t)\,{\rm d}t<\infty$, $\limsup_{t\to\infty}t^{2}W(t)=0$, $\eta_{\omega,2}:=\int_{-\infty}^{\infty}\omega^{2}(x)\,{\rm d}x<\infty$, $\eta_{\omega,4}:=\int_{-\infty}^{\infty}\omega^{4}(x)\,{\rm d}x<\infty$, and $\sum_{h=-n+1}^{n-1}\omega\left({h}/{M_{n}}\right)=O(M_{n})$.

Condition (A1) is slightly stronger than Eichler (2008, Assumption 3.3 (iii)), which requires $n/M_{n}^{9/2}\to 0$ and $n/M_{n}^{2}\to\infty$ as $n\to\infty$. This is because Eichler dealt with zero-mean processes, whereas we need to estimate ${\rm E}X_{i}^{\prime}$ in the kernel density estimator. For the same reason, we assume the condition $\sum_{h=-n+1}^{n-1}\omega\left({h}/{M_{n}}\right)=O(M_{n})$ in (A2).

Since $A_{KK}(e^{-\mathrm{i}\lambda})=0$ if and only if $f_{G_{K}G_{K}}(\lambda)=0$, together with the non-singularity of $\bm{f}_{K}(\lambda)$ and (7), the hypothesis (9) is equivalent to

H_{0}:\Phi_{K}\left(\bm{f}(\lambda)\right)=0\quad\text{$\lambda$-a.e.\ on $[-\pi,\pi]$},

and, in turn, equivalent to

H_{0}:\int_{-\pi}^{\pi}\left|\Phi_{K}\left(\bm{f}(\lambda)\right)\right|^{2}\mathrm{d}\lambda=0,

where

\Phi_{K}\left(\bm{f}(\lambda)\right):=-\sum_{i=1}^{K-1}\det\left(\overline{\bm{f}_{i,K}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+\det\left(\bm{f}_{K-1}(\lambda)\right)f_{K0}(\lambda).

Our proposed test statistic is defined as

T_{n}:=\frac{n}{\sqrt{M_{n}}}\int_{-\pi}^{\pi}\left|\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)\right|^{2}\mathrm{d}\lambda-\hat{\mu}_{n,K}, \qquad (10)

where

\hat{\mu}_{n,K}:=\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}{\rm tr}\left(\left.\frac{\partial\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\hat{\bm{f}}(\lambda)}\hat{\bm{f}}(\lambda)\overline{\left.\frac{\partial\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)}{\partial\bm{Z}}\right|_{\bm{Z}=\hat{\bm{f}}(\lambda)}}\hat{\bm{f}}(\lambda)\right)\mathrm{d}\lambda,
\hat{\bm{f}}_{i,j}^{\ddagger}(\lambda):=\left(\hat{\bm{f}}_{j-1,1}^{\flat}(\lambda),\ldots,\hat{\bm{f}}_{j-1,i-1}^{\flat}(\lambda),\hat{\bm{f}}_{j-1,j}^{\flat}(\lambda),\hat{\bm{f}}_{j-1,i+1}^{\flat}(\lambda),\ldots,\hat{\bm{f}}_{j-1,j-1}^{\flat}(\lambda)\right),
\hat{\bm{f}}_{a,b}^{\flat}(\lambda):=\left(\hat{f}_{1b}(\lambda),\ldots,\hat{f}_{ab}(\lambda)\right)^{\top},\quad\text{and}\quad\hat{\bm{f}}_{K-1}(\lambda):=\left(\hat{f}_{ij}(\lambda)\right)_{i,j=1,\ldots,K-1}.

The terms $\Phi_{K}(\hat{\bm{f}}(\lambda))$ and $\hat{\mu}_{n,K}$ are the estimated numerator of $A_{KK}$ and a bias correction, respectively. The next theorem gives the asymptotic null distribution of $T_{n}$.

Theorem 4.1.

Suppose that Assumptions 3.1 and 4.1 hold, that $\bm{f}_{K}(\lambda):=(f_{ij}(\lambda))_{i,j=1,\ldots,K}$ is non-singular for all $\lambda\in[-\pi,\pi]$, and that $\det(\bm{f}_{j-1}(\lambda))f_{j0}(\lambda)\neq\sum_{i=1}^{j-1}\det(\overline{\bm{f}_{i,j}^{\ddagger}(\lambda)})f_{i0}(\lambda)$ for all $j\in\{2,\ldots,K-1\}$ and all $\lambda\in[-\pi,\pi]$. Then, under the null $H_{0}$, $T_{n}$ defined in (10) converges in distribution to the centered normal distribution with variance $\sigma_{K}^{2}$ as $n\to\infty$, where

\sigma_{K}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|{\rm tr}\left(\left.\frac{\partial\Phi_{K}\left(\bm{f}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\bm{f}(\lambda)}\bm{f}(\lambda)\overline{\left.\frac{\partial\Phi_{K}\left(\bm{f}(\lambda)\right)}{\partial\bm{Z}}\right|_{\bm{Z}=\bm{f}(\lambda)}}\bm{f}(\lambda)\right)\right|^{2}+\left|{\rm tr}\left(\left.\frac{\partial\Phi_{K}\left(\bm{f}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\bm{f}(\lambda)}\bm{f}(\lambda)\left.\frac{\partial\Phi_{K}\left(\bm{f}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\bm{f}(\lambda)}\bm{f}(\lambda)\right)\right|^{2}\mathrm{d}\lambda.
Remark 4.1.

For $K=1,2$, $\Phi_{K}(\bm{f}(\lambda))$ is given by

\Phi_{1}\left(\bm{f}(\lambda)\right):=f_{10}(\lambda)\quad\text{and}\quad\Phi_{2}\left(\bm{f}(\lambda)\right):=f_{21}(\lambda)f_{10}(\lambda)-f_{11}(\lambda)f_{20}(\lambda),

respectively. Additionally, the concrete forms of the bias and variance terms are given, for $K=1$, by

\mu_{1}:=\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}f_{11}(\lambda)f_{00}(\lambda)\,\mathrm{d}\lambda\quad\text{and}\quad\sigma_{1}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|f_{11}(\lambda)f_{00}(\lambda)\right|^{2}\mathrm{d}\lambda

and, for K=2K=2,

\mu_{2}:=\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}\left(f_{11}(\lambda)f_{00}(\lambda)-\left|f_{10}(\lambda)\right|^{2}\right)\left(f_{11}(\lambda)f_{22}(\lambda)-\left|f_{12}(\lambda)\right|^{2}\right)\mathrm{d}\lambda

and

\sigma_{2}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|\left(f_{11}(\lambda)f_{00}(\lambda)-\left|f_{10}(\lambda)\right|^{2}\right)\left(f_{11}(\lambda)f_{22}(\lambda)-\left|f_{12}(\lambda)\right|^{2}\right)\right|^{2}\mathrm{d}\lambda.

For $K\geq 3$,

\mu_{K}:=\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}(\det\bm{f}_{K-1}(\lambda))^{2}\left(f_{00}(\lambda)-\overline{\bm{f}_{K-1,0}^{\flat\top}(\lambda)}\,\bm{f}_{K-1}^{-1}(\lambda)\bm{f}_{K-1,0}^{\flat}(\lambda)\right)\left(f_{KK}(\lambda)-\overline{\bm{f}_{K-1,K}^{\flat\top}(\lambda)}\,\bm{f}_{K-1}^{-1}(\lambda)\bm{f}_{K-1,K}^{\flat}(\lambda)\right)\mathrm{d}\lambda
and
\sigma_{K}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|(\det\bm{f}_{K-1}(\lambda))^{2}\left(f_{00}(\lambda)-\overline{\bm{f}_{K-1,0}^{\flat\top}(\lambda)}\,\bm{f}_{K-1}^{-1}(\lambda)\bm{f}_{K-1,0}^{\flat}(\lambda)\right)\left(f_{KK}(\lambda)-\overline{\bm{f}_{K-1,K}^{\flat\top}(\lambda)}\,\bm{f}_{K-1}^{-1}(\lambda)\bm{f}_{K-1,K}^{\flat}(\lambda)\right)\right|^{2}\mathrm{d}\lambda.

A consistent estimator $\hat{\sigma}^{2}_{K}$ of $\sigma^{2}_{K}$ can be constructed as

\hat{\sigma}_{K}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|{\rm tr}\left(\left.\frac{\partial\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\hat{\bm{f}}(\lambda)}\hat{\bm{f}}(\lambda)\overline{\left.\frac{\partial\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)}{\partial\bm{Z}}\right|_{\bm{Z}=\hat{\bm{f}}(\lambda)}}\hat{\bm{f}}(\lambda)\right)\right|^{2}+\left|{\rm tr}\left(\left.\frac{\partial\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\hat{\bm{f}}(\lambda)}\hat{\bm{f}}(\lambda)\left.\frac{\partial\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)}{\partial\bm{Z}^{\top}}\right|_{\bm{Z}=\hat{\bm{f}}(\lambda)}\hat{\bm{f}}(\lambda)\right)\right|^{2}\mathrm{d}\lambda

since $\max_{\lambda\in[-\pi,\pi]}\|\hat{\bm{f}}(\lambda)-\bm{f}(\lambda)\|_{2}$, where $\|\cdot\|_{2}$ denotes the Euclidean norm, converges in probability to zero as $n\to\infty$ (see, e.g., Robinson, 1991, Theorem 2.1). From Theorem 4.1, the test that rejects $H_{0}$ whenever $T_{n}/\hat{\sigma}_{K}\geq z_{\alpha}$, where $z_{\alpha}$ denotes the upper $\alpha$-percentile of the standard normal distribution, has asymptotic size $\alpha$. The following theorem shows that the power of the test tends to one as $n\to\infty$.

Theorem 4.2.

Suppose that Assumptions 3.1 and 4.1 hold, that $\bm{f}_{K}(\lambda):=(f_{ij}(\lambda))_{i,j=1,\ldots,K}$ is non-singular for all $\lambda\in[-\pi,\pi]$, and that $\det(\bm{f}_{j-1}(\lambda))f_{j0}(\lambda)\neq\sum_{i=1}^{j-1}\det(\overline{\bm{f}_{i,j}^{\ddagger}(\lambda)})f_{i0}(\lambda)$ for all $j\in\{2,\ldots,K-1\}$ and all $\lambda\in[-\pi,\pi]$. Then the test based on $T_{n}$ is consistent.
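To make the testing procedure concrete for $K=1$ (where $\Phi_{1}(\bm{f})=f_{10}$ and the null is $f_{10}=0$ $\lambda$-a.e.), here is a self-contained Monte Carlo sketch of the standardized statistic $T_{n}/\hat{\sigma}_{1}$. It assumes a Bartlett lag window (for which $\eta_{\omega,2}=2/3$ and $\eta_{\omega,4}=2/5$) and a crude Riemann grid over $[-\pi,\pi)$; the function names and tuning choices are ours, not the paper's implementation.

```python
import numpy as np

def spec_grid(x, y, lams, M):
    # Bartlett lag-window cross-spectral estimator on a frequency grid
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()
    out = np.zeros(len(lams), dtype=complex)
    for h in range(-(M - 1), M):
        g = (np.dot(xc[h:], yc[:n - h]) / (n - h) if h >= 0
             else np.dot(xc[:n + h], yc[-h:]) / (n + h))
        out += (1 - abs(h) / M) * g * np.exp(-1j * h * lams)
    return out / (2 * np.pi)

def k1_standardized_stat(x0, x1, M=10, n_freq=256):
    # standardized statistic T_n / sigma_hat_1 for K = 1, Phi_1(f) = f_10;
    # Bartlett lag window: eta_{omega,2} = 2/3, eta_{omega,4} = 2/5
    n = len(x0)
    lams = np.linspace(-np.pi, np.pi, n_freq, endpoint=False)
    dlam = 2 * np.pi / n_freq
    f10 = spec_grid(x1, x0, lams, M)
    f00 = spec_grid(x0, x0, lams, M).real
    f11 = spec_grid(x1, x1, lams, M).real
    mu_hat = np.sqrt(M) * (2 / 3) * np.sum(f11 * f00) * dlam
    Tn = n / np.sqrt(M) * np.sum(np.abs(f10) ** 2) * dlam - mu_hat
    sig2_hat = 4 * np.pi * (2 / 5) * np.sum((f11 * f00) ** 2) * dlam
    return Tn / np.sqrt(sig2_hat)
```

Under independence of the two series the statistic should fluctuate near zero, while a strong linear link makes it diverge with $n/\sqrt{M_{n}}$, which is what drives the consistency of the test.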

When $K=2$ and we choose the lagged process $X_{2}(t):=X_{1}(t)X_{1}(t-u)$ as a second covariate process, the lag $u$ must be determined by the data analyst. To choose a plausible lag, we propose the following criterion. Denote the spectral density matrix $\bm{f}(\lambda)$ by $\bm{f}(\lambda,u)$ to emphasize its dependence on $u$. For some $L\in\mathbb{N}$, we define

\hat{u}:=\operatorname{argmax}_{u\in\{0,\ldots,L\}}\int_{-\pi}^{\pi}q\left(\hat{\bm{f}}(\lambda,u)\right)\mathrm{d}\lambda,

where $q$ is some function. An intuitive choice of $q$ is $q(\hat{\bm{f}}(\lambda)):=\hat{f}_{G_{2}G_{2}}(\lambda,u)$, where

\hat{f}_{G_{2}G_{2}}(\lambda,u):=\frac{\left|\hat{f}_{11}(\lambda)\hat{f}_{20}(\lambda,u)-\hat{f}_{21}(\lambda,u)\hat{f}_{10}(\lambda)\right|^{2}}{\hat{f}_{11}(\lambda)\left(\hat{f}_{11}(\lambda)\hat{f}_{22}(\lambda,u)-\left|\hat{f}_{21}(\lambda,u)\right|^{2}\right)}.

This choice minimizes the mean squared error ${\rm E}\epsilon_{t}^{2}$ since ${\rm E}\epsilon_{t}^{2}=\int_{-\pi}^{\pi}\left(f_{ZZ}(\lambda)-f_{G_{1}G_{1}}(\lambda)-f_{G_{2}G_{2}}(\lambda,u)\right)\mathrm{d}\lambda$. If the data analyst is instead interested in the maximum peak of $f_{G_{2}G_{2}}(\lambda,u)/f_{00}(\lambda)$, we can choose $q$ as

q\left(\hat{\bm{f}}(\lambda)\right):=\delta\left(\lambda-\operatorname{argmax}_{\omega\in[-\pi,\pi]}\left(\frac{\hat{f}_{G_{2}G_{2}}(\omega,u)}{\hat{f}_{00}(\omega)}\right)\right)\frac{\hat{f}_{G_{2}G_{2}}(\lambda,u)}{\hat{f}_{00}(\lambda)},

where $\delta$ is the Dirac delta function; this corresponds to the choice based on the residual coherence (Khan et al., 2014). Similarly, when $K$ is greater than 2 and the covariate $X_{K}(t)$ depends on lags $u_{1},\ldots,u_{K-1}$, e.g., $\prod_{j=1}^{K-1}X_{j}(t-u_{j})$, we propose the criterion

(\hat{u}_{1},\ldots,\hat{u}_{K-1}):=\operatorname{argmax}_{(u_{1},\ldots,u_{K-1})\in\{0,\ldots,L\}^{K-1}}\int_{-\pi}^{\pi}q\left(\hat{\bm{f}}(\lambda,u_{1},\ldots,u_{K-1})\right)\mathrm{d}\lambda.
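As a sketch of how $\hat{u}$ might be computed in practice for $K=2$ with $q=\hat{f}_{G_{2}G_{2}}$: the snippet below builds $X_{2}(t)=X_{1}(t)X_{1}(t-u)$ for each candidate $u$, estimates the needed spectra with a Bartlett lag window, and maximizes the integrated residual spectrum. Everything here (function names, bandwidth, grid) is our own illustrative choice, not the paper's implementation.

```python
import numpy as np

def spec_grid(x, y, lams, M):
    # Bartlett lag-window cross-spectral estimator on a grid of frequencies
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()
    out = np.zeros(len(lams), dtype=complex)
    for h in range(-(M - 1), M):
        g = (np.dot(xc[h:], yc[:n - h]) / (n - h) if h >= 0
             else np.dot(xc[:n + h], yc[-h:]) / (n + h))
        out += (1 - abs(h) / M) * g * np.exp(-1j * h * lams)
    return out / (2 * np.pi)

def integrated_resid_spec(x0, x1, u, M=10, n_freq=128):
    # int f_hat_{G2 G2}(lambda, u) d lambda with X_2(t) = X_1(t) X_1(t - u)
    n = len(x1)
    x2 = x1[u:] * x1[:n - u]
    a0, a1 = x0[u:], x1[u:]   # align all three series on t = u+1, ..., n
    lams = np.linspace(-np.pi, np.pi, n_freq, endpoint=False)
    dlam = 2 * np.pi / n_freq
    f11 = spec_grid(a1, a1, lams, M).real
    f22 = spec_grid(x2, x2, lams, M).real
    f10 = spec_grid(a1, a0, lams, M)
    f20 = spec_grid(x2, a0, lams, M)
    f21 = spec_grid(x2, a1, lams, M)
    num = np.abs(f11 * f20 - f21 * f10) ** 2
    den = f11 * (f11 * f22 - np.abs(f21) ** 2)
    return np.sum(num / den) * dlam

def select_lag(x0, x1, L=5, **kw):
    # u_hat: the candidate lag maximizing the integrated residual spectrum
    scores = [integrated_resid_spec(x0, x1, u, **kw) for u in range(L + 1)]
    return int(np.argmax(scores))
```

When the response genuinely contains an interaction at one specific lag, that lag's candidate explains the largest orthogonal part of $X_{0}$ and should win the grid search.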
Remark 4.2.

One may be skeptical of treating $\hat{u}_{n}$ as a known constant in the testing procedure. In that case, we can consider the following hypothesis, for candidate lags $u_{1},\ldots,u_{L}\in\mathbb{N}\cup\{0\}$,

\tilde{H}_{0}:f_{G_{K}G_{K}}(\lambda,u)=0\quad\text{$\lambda$-a.e.\ on $[-\pi,\pi]$ for all $u\in\{u_{1},\ldots,u_{L}\}$,} \qquad (11)

and $\tilde{K}_{0}$: $\tilde{H}_{0}$ does not hold, where $L$ is a given constant. The corresponding test statistic is defined by

\tilde{T}_{n}:=\frac{n}{\sqrt{M_{n}}}\int_{-\pi}^{\pi}\left\|\bm{\Phi}\left(\tilde{\bm{f}}(\lambda)\right)\right\|_{2}^{2}\mathrm{d}\lambda-\hat{\tilde{\mu}}_{n,K}, \qquad (12)

where $\|\cdot\|_{2}$ denotes the Euclidean norm, $\tilde{\bm{f}}(\lambda)$ is the spectral density matrix of

$(X_{0}(t),X_{1}(t),\ldots,X_{K-1}(t),X_{K,u_{1}}(t),\ldots,X_{K,u_{L}}(t))$, $X_{K,u_{j}}(t)$ is the $K$-th covariate corresponding to the lag $u_{j}$,

\bm{\Phi}\left(\tilde{\bm{f}}(\lambda)\right):=\left(\Phi_{K,u_{1}}\left(\tilde{\bm{f}}(\lambda)\right),\ldots,\Phi_{K,u_{L}}\left(\tilde{\bm{f}}(\lambda)\right)\right)^{\top},\quad\Phi_{K,u_{j}}\left(\tilde{\bm{f}}(\lambda)\right):=\Phi_{K}\left(\hat{\bm{f}}(\lambda,u_{j})\right),
\hat{\tilde{\mu}}_{n,K}:=\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}{\rm tr}\left\{\Gamma_{\bm{\Phi}}(\lambda)\left(\tilde{\bm{f}}^{\top}(\lambda)\otimes\tilde{\bm{f}}(\lambda)\right)\right\}\mathrm{d}\lambda,

$\hat{\bm{f}}(\lambda,u_{j})$ is the kernel density estimator of the spectral density matrix of the process $(X_{0}(t),X_{1}(t),\ldots,X_{K-1}(t),X_{K,u_{j}}(t))$, and

\Gamma_{\bm{\Phi}}(\lambda):=\sum_{k=1}^{L}\left.{\rm vec}\left(\frac{\overline{\partial\Phi_{K,u_{k}}\left(\bm{Z}\right)}}{\partial\bm{Z}}\right){\rm vec}\left(\frac{\partial\Phi_{K,u_{k}}\left(\bm{Z}\right)}{\partial\bm{Z}}\right)^{\top}\right|_{\bm{Z}=\tilde{\bm{f}}(\lambda)}.

Then, under $\tilde{H}_{0}$, $\tilde{T}_{n}$ converges in distribution to the centered normal distribution with variance $\tilde{\sigma}_{K}^{2}$ as $n\to\infty$, where

\tilde{\sigma}_{K}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}{\rm tr}\left\{\Gamma_{\bm{\Phi}}(\lambda)\left(\tilde{\bm{f}}^{\top}(\lambda)\otimes\tilde{\bm{f}}(\lambda)\right)\left(\Gamma_{\bm{\Phi}}(\lambda)+\Gamma_{\bm{\Phi}}^{\top}(-\lambda)\right)\left(\tilde{\bm{f}}^{\top}(\lambda)\otimes\tilde{\bm{f}}(\lambda)\right)\right\}\mathrm{d}\lambda.

Then, the test that rejects $\tilde{H}_{0}$ whenever $\tilde{T}_{n}/\hat{\tilde{\sigma}}_{K}\geq z_{\alpha}$ has asymptotic size $\alpha$ and is consistent, where $\hat{\tilde{\sigma}}_{K}$ is a consistent estimator of $\tilde{\sigma}_{K}$. The proof is omitted since it is analogous to the proofs of Theorems 4.1 and 4.2.

One may also be interested in the parameters $\{b_{i}(k);k\in\mathbb{Z},i\in\{1,\ldots,K\}\}$. Before closing this section, we construct estimators of these parameters. The estimator of $b_{i}(k)$ can be defined, for any $k\in\mathbb{Z}$, by $\hat{b}_{i}(k):=\sum_{j=i}^{K}\hat{a}_{ji}(k)$, where, for $i$ and $j\leq i$, $\hat{a}_{ij}(k):=\int_{-\pi}^{\pi}\hat{A}_{ij}(e^{-\mathrm{i}\lambda})e^{-\mathrm{i}k\lambda}\,\mathrm{d}\lambda/(2\pi)$ and $\hat{A}_{ij}(e^{-\mathrm{i}\lambda})$ is defined as in (6)–(8) but with the spectral density replaced by the kernel density estimator. Then, we have the following consistency result.

Theorem 4.3.

Suppose that Assumptions 3.1 and 4.1 hold, that $\bm{f}_{K}(\lambda):=(f_{ij}(\lambda))_{i,j=1,\ldots,K}$ is non-singular for all $\lambda\in[-\pi,\pi]$, and that $\det(\bm{f}_{j-1}(\lambda))f_{j0}(\lambda)\neq\sum_{i=1}^{j-1}\det(\overline{\bm{f}_{i,j}^{\ddagger}(\lambda)})f_{i0}(\lambda)$ for all $j\in\{2,\ldots,K-1\}$ and all $\lambda\in[-\pi,\pi]$. For any $k\in\mathbb{Z}$, the estimators $\hat{b}_{i}(k)$ for $i\in\{1,\ldots,K\}$ converge in probability to $b_{i}(k)$ as $n\to\infty$.
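The inverse-Fourier step behind $\hat{a}_{ij}(k)$ can be mimicked numerically. A toy sketch (ours), assuming the transfer function expands as $A(e^{-\mathrm{i}\lambda})=\sum_{k}a(k)e^{\mathrm{i}k\lambda}$ so that the displayed integral returns $a(k)$; a left Riemann sum on a uniform grid is exact for trigonometric polynomials of low degree:

```python
import numpy as np

def fourier_coef(A_vals, lams, k):
    # a(k) = (1/(2*pi)) int A(e^{-i lam}) e^{-i k lam} d lam,
    # approximated by a left Riemann sum on a uniform grid over [-pi, pi)
    dlam = lams[1] - lams[0]
    return np.sum(A_vals * np.exp(-1j * k * lams)) * dlam / (2 * np.pi)

# toy transfer function with coefficients a(0) = 1, a(1) = 0.5, a(3) = -0.2
lams = np.linspace(-np.pi, np.pi, 1024, endpoint=False)
A_vals = 1.0 + 0.5 * np.exp(1j * lams) - 0.2 * np.exp(3j * lams)
```

In the actual procedure one would evaluate $\hat{A}_{ij}(e^{-\mathrm{i}\lambda})$ on such a grid from the kernel spectral estimates and integrate in the same way.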

5 Numerical study

This section presents the finite sample performance of the proposed test. Let $\{\epsilon_{j}(t);t\in\mathbb{Z}\}$, $j=0,\ldots,4$, be i.i.d. standard normal random variables, with each sequence independent of the others, let $\{X_{j}(t);t\in\mathbb{Z}\}$ for $j=1,2,3$ be AR(1) processes defined as $X_{j}(t)=0.4X_{j}(t-1)+\epsilon_{j}(t)$, and let $\{X_{4}(t);t\in\mathbb{Z}\}$ be the process given by $X_{4}(t)=X_{2}(t)+\epsilon_{4}(t)$. We consider the 14 cases presented in Table 1. Cases 1, 4, and 7 are associated with the null hypothesis, while the other cases are related to the alternative hypothesis. Cases 1–3, 4–6, and 7–10 involve linear terms only and correspond to tests with $K=1$, $K=2$, and $K=3$, respectively. In contrast, Cases 11–14 involve non-linear (interaction) terms and correspond to tests with $K=2$. It is worth noting that Case 10 is close to the null since our test is designed to remove the effect, in the sense of orthogonality, of the processes that are already included in the model.

Table 1: The models and tests for numerical simulations.
Case | Model | Test
1 | $X_{0}(t)=\epsilon_{0}(t)$ | the existence of $X_{1}$
2 | $X_{0}(t)=0.05X_{1}(t)+\epsilon_{0}(t)$ | the existence of $X_{1}$
3 | $X_{0}(t)=0.1X_{1}(t)+\epsilon_{0}(t)$ | the existence of $X_{1}$
4 | $X_{0}(t)=X_{1}(t)+\epsilon_{0}(t)$ | the existence of $X_{2}$ under the presence of $X_{1}$
5 | $X_{0}(t)=X_{1}(t)+0.05X_{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{2}$ under the presence of $X_{1}$
6 | $X_{0}(t)=X_{1}(t)+0.1X_{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{2}$ under the presence of $X_{1}$
7 | $X_{0}(t)=X_{1}(t)+X_{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{3}$ under the presence of $X_{1},X_{2}$
8 | $X_{0}(t)=X_{1}(t)+X_{2}(t)+0.05X_{3}(t)+\epsilon_{0}(t)$ | the existence of $X_{3}$ under the presence of $X_{1},X_{2}$
9 | $X_{0}(t)=X_{1}(t)+X_{2}(t)+0.1X_{3}(t)+\epsilon_{0}(t)$ | the existence of $X_{3}$ under the presence of $X_{1},X_{2}$
10 | $X_{0}(t)=X_{1}(t)+X_{2}(t)+0.05X_{4}(t)+\epsilon_{0}(t)$ | the existence of $X_{4}$ under the presence of $X_{1},X_{2}$
11 | $X_{0}(t)=X_{1}(t)+0.05X_{1}^{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{1}(t)^{2}$ under the presence of $X_{1}$
12 | $X_{0}(t)=X_{1}(t)+0.05X_{1}^{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{1}(t)X_{1}(t-1)$ under the presence of $X_{1}$
13 | $X_{0}(t)=X_{1}(t)+0.05X_{1}^{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{1}(t)X_{1}(t-2)$ under the presence of $X_{1}$
14 | $X_{0}(t)=X_{1}(t)+0.05X_{1}^{2}(t)+\epsilon_{0}(t)$ | the existence of $X_{1}(t)X_{1}(t-3)$ under the presence of $X_{1}$

The sample size and significance level are set to $n=250,500,1000,2000$ and $0.05$, respectively. For each case, we generate a time series of length $n$ and apply our test. We repeat this procedure 1000 times and calculate the empirical size or power of the test.

The results are given in Table 2. The empirical size (corresponding to Cases 1, 4, and 7) is acceptable. The empirical power for Cases 2, 3, 5, 6, 8, 9, and 11 increases as the sample size gets larger. For Case 10, the empirical power is small, as expected. For Cases 12–14, which test for the existence of $X_{1}(t)X_{1}(t-1)$, $X_{1}(t)X_{1}(t-2)$, and $X_{1}(t)X_{1}(t-3)$ in the model, respectively, the power decreases as the lag of the lagged process increases, which is reasonable. Overall, the proposed test shows good performance.
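The data-generating processes of the study are easy to reproduce. A sketch (our code) that simulates $X_{1}$ and the Case 3 response, with basic sanity checks on the implied moments:

```python
import numpy as np

rng = np.random.default_rng(0)
n, burn = 100_000, 500

# X_1(t) = 0.4 X_1(t-1) + eps_1(t), as in the simulation design
eps1 = rng.standard_normal(n + burn)
x1 = np.zeros(n + burn)
for t in range(1, n + burn):
    x1[t] = 0.4 * x1[t - 1] + eps1[t]
x1 = x1[burn:]

# Case 3: X_0(t) = 0.1 X_1(t) + eps_0(t)
x0 = 0.1 * x1 + rng.standard_normal(n)

rho1 = np.corrcoef(x1[1:], x1[:-1])[0, 1]   # lag-1 autocorrelation, near 0.4
cross = np.corrcoef(x0, x1)[0, 1]           # weak linear dependence, near 0.11
```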

For Case 12, $X_{0}(t)=X_{1}(t)+0.05X_{1}^{2}(t)+\epsilon_{0}(t)$ can be decomposed into the three components

  1. (i) $X_{1}(t)+0.05{\rm P}_{\overline{\rm sp}\{X_{1}(t);t\in\mathbb{Z}\}}X_{1}^{2}(t)$,

  2. (ii) $0.05{\rm P}_{\overline{\rm sp}\{X_{1}(t)X_{1}(t-1);t\in\mathbb{Z}\}}\left(X_{1}^{2}(t)-{\rm P}_{\overline{\rm sp}\{X_{1}(t);t\in\mathbb{Z}\}}X_{1}^{2}(t)\right)$,

  3. (iii) $\epsilon_{0}(t)+0.05\left(X_{1}^{2}(t)-{\rm P}_{\overline{\rm sp}\{X_{1}(t);t\in\mathbb{Z}\}}X_{1}^{2}(t)-{\rm P}_{\overline{\rm sp}\{X_{1}(t)X_{1}(t-1);t\in\mathbb{Z}\}}\left(X_{1}^{2}(t)-{\rm P}_{\overline{\rm sp}\{X_{1}(t);t\in\mathbb{Z}\}}X_{1}^{2}(t)\right)\right)$,

where ${\rm P}_{A}$ and $\overline{\rm sp}\,A$ denote the projection operator onto the set $A$ and the closed span of $A$, respectively. This decomposition tells us that the rejection probability of the test is not $0.05$ when the model includes a second input (in our case, $X_{1}(t)^{2}$) other than the one we wish to test ($X_{1}(t)X_{1}(t-1)$): unless the two are orthogonal, the second input partially encompasses the input that we intend to test (the term (ii)), and orthogonality fails for $X_{1}(t)^{2}$ and $X_{1}(t)X_{1}(t-1)$. Therefore, Case 12 corresponds to the alternative. Cases 13 and 14 are similar. One may notice that (iii) is no longer an i.i.d. disturbance process. Since the independence assumption can be relaxed to orthogonality between the error term and the covariates, and the error may itself be a time series, our test can still be applied.
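The non-orthogonality invoked above is easy to check numerically for a Gaussian AR(1) input: $X_{1}(t)^{2}$ is uncorrelated with every $X_{1}(s)$ (odd Gaussian moments vanish), yet clearly correlated with $X_{1}(t)X_{1}(t-1)$. A small sketch (ours):

```python
import numpy as np

rng = np.random.default_rng(3)
n, burn = 50_000, 500
e = rng.standard_normal(n + burn)
x = np.zeros(n + burn)
for t in range(1, n + burn):
    x[t] = 0.4 * x[t - 1] + e[t]
x = x[burn:]

sq = x[1:] ** 2            # X_1(t)^2
lagprod = x[1:] * x[:-1]   # X_1(t) X_1(t-1)
lin = x[1:]                # X_1(t) itself

corr_sq_lin = np.corrcoef(sq, lin)[0, 1]          # ~ 0 for Gaussian input
corr_sq_lagprod = np.corrcoef(sq, lagprod)[0, 1]  # clearly nonzero
```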

Table 2: Empirical size and power of the proposed test for each case.
Case $\backslash$ $n$ 250 500 1000 2000
1 0.068 0.085 0.066 0.068
2 0.156 0.224 0.333 0.532
3 0.369 0.631 0.865 0.990
4 0.069 0.075 0.069 0.059
5 0.148 0.214 0.284 0.430
6 0.353 0.550 0.768 0.964
7 0.069 0.074 0.061 0.062
8 0.140 0.186 0.291 0.420
9 0.306 0.511 0.736 0.936
10 0.114 0.130 0.173 0.214
11 0.213 0.277 0.465 0.720
12 0.148 0.212 0.315 0.510
13 0.075 0.108 0.145 0.201
14 0.081 0.075 0.077 0.107

6 Empirical study

Our data consist of fMRI time series from 61 healthy individuals (controls) and 49 schizophrenia patients (cases); for both groups, the brain was divided into 246 regions based on the Brainnetome atlas (see Fan et al., 2016), each giving a region-level time series of length 148. The data are described in Culbreth et al. (2021). Figure 1 shows the time series of region 211 for the 61 healthy individuals and the 49 schizophrenia patients, respectively.

Figure 1: Plots of time series for 61 healthy individuals (the left panel) and 49 schizophrenia patients (the right panel) for the region 211.

The regions 211–246 correspond to subcortical nuclei. Much research has been conducted on the association between schizophrenia and subcortical abnormalities; see, for example, Fan et al. (2019). Thus, it is of particular interest to detect structural differences within these regions between healthy people and patients.

We investigate whether the time series from region $i\in\{211,\ldots,246\}$ can be explained by the time series from another region $j\in\{211,\ldots,246\}\backslash\{i\}$ through two scenarios. In the first scenario (i), we apply our test based on $T_{n}$ defined in (10) for the hypothesis (9) with $K=1$ to time series $X_{0}$ and $X_{1}$ observed from regions $i\in\{211,\ldots,246\}$ and $j\in\{211,\ldots,246\}\backslash\{i\}$, for healthy individuals and patients, respectively. In the second scenario (ii), we investigate the hypothesis (9) and apply our test based on $T_{n}$ with $K=2$ and $X_{2}(t)=X_{1}(t)X_{1}(t-\hat{u})$, where

\hat{u}:=\operatorname{argmax}_{u\in\{0,\ldots,5\}}\int_{-\pi}^{\pi}\hat{f}_{G_{2}G_{2}}(\lambda,u)\,\mathrm{d}\lambda

(see the discussion below Theorem 4.2), to time series $X_{0}$ and $X_{1}$ observed from regions $i\in\{211,\ldots,246\}$ and $j\in\{211,\ldots,246\}\backslash\{i\}$, for healthy individuals and patients, respectively. It is assumed that the linear term $\{X_{1}(t)\}$ has already been incorporated into the model. Note that the scenarios (i) and (ii) correspond to tests for $f_{G_{1}G_{1}}(\lambda)=0$ and $f_{G_{2}G_{2}}(\lambda)=0$ for $\lambda$ a.e. on $[-\pi,\pi]$, respectively.

Figure 2 illustrates the rejection probabilities (across individuals) of the tests described in scenario (i) for healthy individuals and patients, respectively. The regions can be grouped into gyrus regions (see Table 3), which are emphasized by blue lines in Figure 2. The regions belonging to the Amygdala and the Hippocampus are well connected, as are the regions belonging to the Thalamus. Additionally, the regions belonging to the Basal Ganglia are well connected, with the exception of regions 227 and 228. The rejection probabilities from regions 227 and 228 in the Basal Ganglia to the other regions are relatively low. Similarly, those from regions 242–244 in the Thalamus to the other regions in the Amygdala, Hippocampus, and Basal Ganglia are relatively low. These regions are emphasized by yellow dotted lines in Figure 2.

Gyrus Region | Region Numbers
Amygdala | 211–214
Hippocampus | 215–218
Basal Ganglia | 219–230
Thalamus | 231–246
Table 3: Corresponding Gyrus Regions of Our Dataset

Figure 3 presents the corresponding rejection probabilities for scenario (ii). The rejection probability for each pixel is explicitly displayed. In each panel, the left and bottom margins display the region numbers $i\in\{211,\ldots,246\}$ corresponding to time series $X_{0}$ and $X_{1}$ from brain region $i$, respectively. Additionally, the rejection probabilities between gyrus regions for patients tend to be higher than those for healthy individuals.

Table 4 provides a concise summary comparing the rejection probabilities of healthy individuals and patients. As expected, Figure 2 and Table 4 indicate that the rejection probabilities for patients tend to be lower than those for healthy individuals, consistent with the findings in Liu et al. (2018), with several pixels yielding a rejection probability of one. Conversely, Figure 3 and Table 4 demonstrate that the rejection probabilities for patients tend to be higher than those for healthy individuals.

Figure 2: The left and right panels depict the rejection probabilities of the test results outlined in scenario (i) for healthy individuals and patients, respectively. In each panel, the left and bottom margins show the region numbers $i\in\{211,\ldots,246\}$ associated with time series $X_{0}$ and $X_{1}$ from the brain region $i$, respectively.
Figure 3: The left and right panels depict the rejection probabilities of the test results outlined in scenario (ii) for healthy individuals and patients, respectively. In each panel, the left and bottom margins show the region numbers $i\in\{211,\ldots,246\}$ associated with time series $X_{0}$ and $X_{1}$ from the brain region $i$, respectively.
Table 4: Comparison of Rejection Probabilities between healthy individuals and patients
Outcome | test for $f_{G_{1}G_{1}}$ | test for $f_{G_{2}G_{2}}$
Patients > Healthy | 30.00% | 62.14%
Patients < Healthy | 48.25% | 37.86%
Patients = Healthy | 21.75% | 0.00%

In summary, our findings suggest that there are statistically significant linear relationships between fMRI data from different regions of the brain, consistent with prior research. Furthermore, beyond these linear relationships, our analysis reveals the presence of statistically significant nonlinear associations, which contribute to the explanation of fMRI data beyond the scope of linear relationships alone. Interestingly, our investigation indicates that non-linear functional connectivity tends to be higher among schizophrenia patients compared to healthy subjects. This contrasts with the findings in the linear case, where patients tend to exhibit lower functional connectivity than healthy subjects when considering only linear measures.

Appendix A Proofs

A.1 Proof of Lemma 2.1

First, we derive the upper bound for the quantity

{\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(s_{2}),\ldots,X_{i_{\ell}}(s_{\ell})\}.

Its derivation is analogous to Kley et al. (2016, Proposition 3.1). In the case $s_{2}=\dots=s_{\ell}=0$, there exists, by the moment assumption, a constant $C_{\ell}$ such that

\sup_{i_{1},\ldots,i_{\ell}\in\{0,1,\ldots,K\}}\left\lvert{\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(0),\ldots,X_{i_{\ell}}(0)\}\right\rvert\leq C_{\ell}.

We shall consider the case where there exists $j\in\{2,\ldots,\ell\}$ such that $s_{j}\neq 0$. Let $(s_{(1)},\ldots,s_{(\ell)})$ be the order statistic of $(0,s_{2},\ldots,s_{\ell})$ and let $j^{*}$ be the index corresponding to the maximum gap of the order statistic, i.e., $j^{*}:=\operatorname{argmax}_{j\in\{1,\ldots,\ell-1\}}(s_{(j+1)}-s_{(j)})$. Recall the following inequalities:

  1. (i)

for $x_{1},\ldots,x_{p},y_{1},\ldots,y_{p}\in\mathbb{R}$ whose absolute values are bounded by $C_{\rm ineq}$,

\left\lvert\prod_{j=1}^{p}x_{j}-\prod_{j=1}^{p}y_{j}\right\rvert\leq C_{\rm ineq}^{p-1}\sum_{j=1}^{p}\left\lvert x_{j}-y_{j}\right\rvert

(see, e.g., Panaretos and Tavakoli (2013b), Lemma F.5),

  2. (ii)

for any $\delta>1$, $s_{1},s_{2}\in\mathbb{Z}$, and alpha-mixing stationary processes $W_{1}(s_{1})$ and $W_{2}(s_{2})$ with finite $\lceil\frac{2\delta}{\delta-1}\rceil$-th moment,

\left\lvert{\rm Cov}(W_{1}(s_{1}),W_{2}(s_{2}))\right\rvert\leq 8\alpha^{\frac{1}{\delta}}(|s_{2}-s_{1}|)\left({\rm E}W_{1}^{\frac{2\delta}{\delta-1}}(s_{1})\right)^{\frac{\delta-1}{2\delta}}\left({\rm E}W_{2}^{\frac{2\delta}{\delta-1}}(s_{2})\right)^{\frac{\delta-1}{2\delta}},

where $\alpha(\cdot)$ is the alpha-mixing coefficient; see, e.g., Doukhan (1994, Theorem 3 (i), p. 9).

From the above inequalities (i) and (ii), it follows that, for appropriate indexes $i_{(1)},\ldots,i_{(\ell)}\in\{0,1,\ldots,K\}$ and the sets $\upsilon:=\{j\in\{1,\ldots,\ell\}:s_{(j)}\leq s_{(j^{*})}\}$ and $\upsilon^{\mathsf{c}}:=\{j\in\{1,\ldots,\ell\}:s_{(j)}>s_{(j^{*})}\}$,

|cum{Xi1(0),Xi2(s2),,Xi(s)}|\displaystyle\left\lvert{\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(s_{2}),\ldots,X_{i_{\ell}}(s_{\ell})\}\right\rvert
=\displaystyle= |cum{Xi(1)(s(1)),Xi(2)(s(2)),,Xi()(s())}|\displaystyle\left\lvert{\rm cum}\{X_{i_{(1)}}(s_{(1)}),X_{i_{(2)}}(s_{(2)}),\ldots,X_{i_{(\ell)}}(s_{(\ell)})\}\right\rvert
|cum{Xi(1)(s(1)),Xi(2)(s(2)),,Xi(j)(s(j)),Xi(j+1)(s(j+1)),,Xi()(s())}|\displaystyle-\left\lvert{\rm cum}\{X_{i_{(1)}}(s_{(1)}),X_{i_{(2)}}(s_{(2)}),\ldots,X_{i_{(j^{*})}}(s_{(j^{*})}),X^{*}_{i_{(j^{*}+1)}}(s_{(j^{*}+1)}),\ldots,X^{*}_{i_{(\ell)}}(s_{(\ell)})\}\right\rvert
\displaystyle\leq |(ν1,,νp)(1)p1(p1)!{(Ejν1Xi(j)(s(j)))(EjνpXi(j)(s(j)))\displaystyle\Bigg{|}\sum_{(\nu_{1},\ldots,\nu_{p})}(-1)^{p-1}(p-1)!\Bigg{\{}\left({\rm E}\prod_{j\in\nu_{1}}X_{i_{(j)}}(s_{(j)})\right)\ldots\left({\rm E}\prod_{j\in\nu_{p}}X_{i_{(j)}}(s_{(j)})\right)
(Ejν1υXi(j)(s(j)))(Ejν1υ𝖼Xi(j)(s(j)))(EjνpυXi(j)(s(j)))\displaystyle-\left({\rm E}\prod_{j\in\nu_{1}\cap\upsilon}X_{i_{(j)}}(s_{(j)})\right)\left({\rm E}\prod_{j\in\nu_{1}\cap\upsilon^{\mathsf{c}}}X_{i_{(j)}}^{*}(s_{(j)})\right)\ldots\left({\rm E}\prod_{j\in\nu_{p}\cap\upsilon}X_{i_{(j)}}(s_{(j)})\right)
×(Ejνpυ𝖼Xi(j)(s(j)))}|\displaystyle\quad\times\left({\rm E}\prod_{j\in\nu_{p}\cap\upsilon^{\mathsf{c}}}X_{i_{(j)}}^{*}(s_{(j)})\right)\Bigg{\}}\Bigg{|}
\displaystyle\leq (ν1,,νp)(p1)!Cineq,p1\displaystyle\sum_{(\nu_{1},\ldots,\nu_{p})}(p-1)!C_{\rm ineq,\ell}^{p-1}
\displaystyle\quad\times\sum_{a=1}^{p}\left\lvert\left({\rm E}\prod_{j\in\nu_{a}}X_{i_{(j)}}(s_{(j)})\right)-\left({\rm E}\prod_{j\in\nu_{a}\cap\upsilon}X_{i_{(j)}}(s_{(j)})\right)\left({\rm E}\prod_{j\in\nu_{a}\cap\upsilon^{\mathsf{c}}}X_{i_{(j)}}^{*}(s_{(j)})\right)\right\rvert
\displaystyle\leq 8α12(s(j+1)s(j))(ν1,,νp)(p1)!Cineq,p1\displaystyle 8\alpha^{\frac{1}{2}}(s_{(j^{*}+1)}-s_{(j^{*})})\sum_{(\nu_{1},\ldots,\nu_{p})}(p-1)!C_{\rm ineq,\ell}^{p-1}
×a=1p(EjνaυXi(j)4(s(j)))14(Ejνaυ𝖼Xi(j)4(s(j)))14\displaystyle\quad\times\sum_{a=1}^{p}\left({\rm E}\prod_{j\in\nu_{a}\cap\upsilon}X^{4}_{i_{(j)}}(s_{(j)})\right)^{\frac{1}{4}}\left({\rm E}\prod_{j\in\nu_{a}\cap\upsilon^{\mathsf{c}}}X^{4}_{i_{(j)}}(s_{(j)})\right)^{\frac{1}{4}}
\displaystyle\leq Cρs(j+1)s(j)2\displaystyle C^{\prime}_{\ell}\rho^{\frac{s_{(j^{*}+1)}-s_{(j^{*})}}{2}}
\displaystyle\leq Cρj=11(s(j+1)s(j))2(1)\displaystyle C^{\prime}_{\ell}\rho^{\frac{\sum_{j=1}^{\ell-1}(s_{(j+1)}-s_{(j)})}{2(\ell-1)}}

where Xi(s)X_{i}^{*}(s) denotes an independent copy of Xi(s)X_{i}(s), the summation (ν1,,νp)\sum_{(\nu_{1},\ldots,\nu_{p})} extends over all partitions (ν1,,νp)(\nu_{1},\ldots,\nu_{p}) of {1,2,,}\{1,2,\cdots,\ell\},

\displaystyle C_{\rm ineq,\ell}:=\sup_{\begin{subarray}{c}j\in\{1,\ldots,\ell\}\\ i_{1},\ldots,i_{j}\in\{1,\ldots,K\}\\ (s_{2},\ldots,s_{j})\in\mathbb{Z}^{j-1}\end{subarray}}\left\lvert{\rm E}X_{i_{1}}(0)X_{i_{2}}(s_{2})\cdots X_{i_{j}}(s_{j})\right\rvert,

and

\displaystyle C^{\prime}_{\ell}:=8C_{\alpha}\sup_{\begin{subarray}{c}i_{1},\ldots,i_{\ell}\in\{1,\ldots,K\}\\ (s_{2},\ldots,s_{\ell})\in\mathbb{Z}^{\ell-1}\end{subarray}}\Bigg{|}\sum_{(\nu_{1},\ldots,\nu_{p})}(p-1)!C_{\rm ineq,\ell}^{p-1}
×a=1p(EjνaυXi(j)4(s(j)))14(Ejνaυ𝖼Xi(j)4(s(j)))14|\displaystyle\times\sum_{a=1}^{p}\left({\rm E}\prod_{j\in\nu_{a}\cap\upsilon}X^{4}_{i_{(j)}}(s_{(j)})\right)^{\frac{1}{4}}\left({\rm E}\prod_{j\in\nu_{a}\cap\upsilon^{\mathsf{c}}}X^{4}_{i_{(j)}}(s_{(j)})\right)^{\frac{1}{4}}\Bigg{|}

with the convention that E(jXi(j)(s(j)))=1{\rm E}\left(\prod_{j\in\emptyset}X_{i_{(j)}}(s_{(j)})\right)=1. Therefore, for each \ell\in\mathbb{N}, there exist constants C′′1C_{\ell}^{\prime\prime}\geq 1 and ρ(0,1)\rho^{\prime}\in(0,1) such that

|cum{Xi1(0),Xi2(s2),,Xi(s)}|C′′(ρ)j=11(s(j+1)s(j))\displaystyle\left\lvert{\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(s_{2}),\ldots,X_{i_{\ell}}(s_{\ell})\}\right\rvert\leq C_{\ell}^{\prime\prime}\left(\rho^{\prime}\right)^{\sum_{j=1}^{\ell-1}(s_{(j+1)}-s_{(j)})} (13)

for any s2,,ss_{2},\ldots,s_{\ell}\in\mathbb{Z} and any i1,,i{1,,K}i_{1},\ldots,i_{\ell}\in\{1,\ldots,K\}.

Next, we show the summability condition for the cumulant (3) by using (13). Simple algebra gives

s2,,s=(1+j=2|sj|d)|cum{Xi1(0),Xi2(s2),,Xi(s)}|\displaystyle\sum_{s_{2},\ldots,s_{\ell}=-\infty}^{\infty}\left(1+\sum_{j=2}^{\ell}\left\lvert s_{j}\right\rvert^{d}\right)\left\lvert{\rm cum}\{X_{i_{1}}(0),X_{i_{2}}(s_{2}),\ldots,X_{i_{\ell}}(s_{\ell})\}\right\rvert
\displaystyle\leq C′′s2,,s=(1+j=2|sj|d)(ρ)j=11(s(j+1)s(j))\displaystyle C_{\ell}^{\prime\prime}\sum_{s_{2},\ldots,s_{\ell}=-\infty}^{\infty}\left(1+\sum_{j=2}^{\ell}\left\lvert s_{j}\right\rvert^{d}\right)\left(\rho^{\prime}\right)^{\sum_{j=1}^{\ell-1}(s_{(j+1)}-s_{(j)})}
\displaystyle\leq C′′(1)!I(1+j=2|sj|d)(ρ)j=11(s(j+1)s(j))\displaystyle C_{\ell}^{\prime\prime}(\ell-1)!\sum_{I}\left(1+\sum_{j=2}^{\ell}\left\lvert s_{j}\right\rvert^{d}\right)\left(\rho^{\prime}\right)^{\sum_{j=1}^{\ell-1}(s_{(j+1)}-s_{(j)})}
\displaystyle\leq C′′(1)!(s=0s1=s0s2=s10s3=s40s2=s30(j=2|sj|d)(ρ)s\displaystyle C_{\ell}^{\prime\prime}(\ell-1)!\Bigg{(}\sum_{s_{\ell}=-\infty}^{0}\sum_{s_{\ell-1}=s_{\ell}}^{0}\sum_{s_{\ell-2}=s_{\ell-1}}^{0}\dots\sum_{s_{3}=s_{4}}^{0}\sum_{s_{2}=s_{3}}^{0}\left(\sum_{j=2}^{\ell}|s_{j}|^{d}\right)\left(\rho^{\prime}\right)^{-s_{\ell}}
+τ=3s=0s1=s0s2=s10sτ+1=sτ+20sτ=sτ+10s2=0s3=0s2s4=0s3sτ2=0sτ3sτ1=0sτ2\displaystyle+\sum_{\tau=3}^{\ell}\sum_{s_{\ell}=-\infty}^{0}\sum_{s_{\ell-1}=s_{\ell}}^{0}\sum_{s_{\ell-2}=s_{\ell-1}}^{0}\dots\sum_{s_{\tau+1}=s_{\tau+2}}^{0}\sum_{s_{\tau}=s_{\tau+1}}^{0}\sum_{s_{2}=0}^{\infty}\sum_{s_{3}=0}^{s_{2}}\sum_{s_{4}=0}^{s_{3}}\dots\sum_{s_{\tau-2}=0}^{s_{\tau-3}}\sum_{s_{\tau-1}=0}^{s_{\tau-2}}
(j=2|sj|d)(ρ)s2s\displaystyle\left(\sum_{j=2}^{\ell}|s_{j}|^{d}\right)\left(\rho^{\prime}\right)^{s_{2}-s_{\ell}}
+s2=0s3=0s2s4=0s3s1=0s2s=0s1(j=2|sj|d)(ρ)s2)\displaystyle+\sum_{s_{2}=0}^{\infty}\sum_{s_{3}=0}^{s_{2}}\sum_{s_{4}=0}^{s_{3}}\dots\sum_{s_{\ell-1}=0}^{s_{\ell-2}}\sum_{s_{\ell}=0}^{s_{\ell-1}}\left(\sum_{j=2}^{\ell}|s_{j}|^{d}\right)\left(\rho^{\prime}\right)^{s_{2}}\Bigg{)}
=\displaystyle= C′′(1)!(L1+L2+L3)\displaystyle C_{\ell}^{\prime\prime}(\ell-1)!(L_{1}+L_{2}+L_{3})

where I\sum_{I} runs over all integers s2,,ss_{2},\ldots,s_{\ell} such that <ss1s2<-\infty<s_{\ell}\leq s_{\ell-1}\leq\ldots\leq s_{2}<\infty,

L1:=\displaystyle L_{1}:= s=0s1=s0s2=s10s3=s40s2=s30(j=2|sj|d)(ρ)s,\displaystyle\sum_{s_{\ell}=-\infty}^{0}\sum_{s_{\ell-1}=s_{\ell}}^{0}\sum_{s_{\ell-2}=s_{\ell-1}}^{0}\dots\sum_{s_{3}=s_{4}}^{0}\sum_{s_{2}=s_{3}}^{0}\left(\sum_{j=2}^{\ell}|s_{j}|^{d}\right)\left(\rho^{\prime}\right)^{-s_{\ell}},
L2:=\displaystyle L_{2}:= τ=3s=0s1=s0s2=s10sτ+1=sτ+20sτ=sτ+10s2=0s3=0s2s4=0s3sτ2=0sτ3sτ1=0sτ2\displaystyle\sum_{\tau=3}^{\ell}\sum_{s_{\ell}=-\infty}^{0}\sum_{s_{\ell-1}=s_{\ell}}^{0}\sum_{s_{\ell-2}=s_{\ell-1}}^{0}\dots\sum_{s_{\tau+1}=s_{\tau+2}}^{0}\sum_{s_{\tau}=s_{\tau+1}}^{0}\sum_{s_{2}=0}^{\infty}\sum_{s_{3}=0}^{s_{2}}\sum_{s_{4}=0}^{s_{3}}\dots\sum_{s_{\tau-2}=0}^{s_{\tau-3}}\sum_{s_{\tau-1}=0}^{s_{\tau-2}}
(j=2|sj|d)(ρ)s2s,\displaystyle\left(\sum_{j=2}^{\ell}|s_{j}|^{d}\right)\left(\rho^{\prime}\right)^{s_{2}-s_{\ell}},
and L3:=\displaystyle\text{and }L_{3}:= s2=0s3=0s2s4=0s3s1=0s2s=0s1(j=2|sj|d)(ρ)s2.\displaystyle\sum_{s_{2}=0}^{\infty}\sum_{s_{3}=0}^{s_{2}}\sum_{s_{4}=0}^{s_{3}}\dots\sum_{s_{\ell-1}=0}^{s_{\ell-2}}\sum_{s_{\ell}=0}^{s_{\ell-1}}\left(\sum_{j=2}^{\ell}|s_{j}|^{d}\right)\left(\rho^{\prime}\right)^{s_{2}}.

We only prove $L_{3}<\infty$, as the proofs of $L_{1},L_{2}<\infty$ are similar. It can be seen that

L3\displaystyle L_{3}\leq (1)s2=0|s2|d(ρ)s2s3=0s2s4=0s3s1=0s2s=0s11\displaystyle(\ell-1)\sum_{s_{2}=0}^{\infty}|s_{2}|^{d}\left(\rho^{\prime}\right)^{s_{2}}\sum_{s_{3}=0}^{s_{2}}\sum_{s_{4}=0}^{s_{3}}\dots\sum_{s_{\ell-1}=0}^{s_{\ell-2}}\sum_{s_{\ell}=0}^{s_{\ell-1}}1
\displaystyle\leq (1)s2=0|s2|d(ρ)s2s3=0s2s4=0s2s1=0s2s=0s21\displaystyle(\ell-1)\sum_{s_{2}=0}^{\infty}|s_{2}|^{d}\left(\rho^{\prime}\right)^{s_{2}}\sum_{s_{3}=0}^{s_{2}}\sum_{s_{4}=0}^{s_{2}}\dots\sum_{s_{\ell-1}=0}^{s_{2}}\sum_{s_{\ell}=0}^{s_{2}}1
\displaystyle\leq (1)s2=0|s2|d+2(ρ)s2\displaystyle(\ell-1)\sum_{s_{2}=0}^{\infty}|s_{2}|^{d+\ell-2}\left(\rho^{\prime}\right)^{s_{2}}

which is finite. ∎
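As a quick numerical sanity check (not part of the proof), the polynomially weighted geometric series that bounds $L_{3}$ indeed converges; the values of $d$, $\ell$, and $\rho'$ below are arbitrary illustrative choices.

```python
import numpy as np

# Illustration of the final bound for L3: the polynomially weighted
# geometric series sum_{s>=0} s^(d+ell-2) * rho^s has finite partial sums.
# d, ell, rho are arbitrary choices standing in for the constants above.
d, ell, rho = 2, 4, 0.8
power = d + ell - 2

s = np.arange(0, 2000)
terms = s.astype(float) ** power * rho ** s
partial = np.cumsum(terms)

# The partial sums stabilize quickly: the tail beyond s = 1000 is negligible.
print(partial[-1], partial[-1] - partial[1000])
```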

A.2 Proof of Theorem 2.1

First, we show the uniqueness of the processes $G_{1},\ldots,G_{K}$. Suppose that there exist processes $\{G_{d}(t)\}$ for $d=1,\ldots,K$ of the form

Gd(t):=j=1dkj=adj(kj)Xj(tkj)\displaystyle G_{d}(t):=\sum_{j=1}^{d}\sum_{k_{j}=-\infty}^{\infty}a_{dj}(k_{j})X_{j}(t-k_{j})

such that X0(t)=ζ+d=1KGd(t)+ϵ(t)X_{0}(t)=\zeta+\sum_{d=1}^{K}G_{d}(t)+\epsilon(t), GiG_{i} and GjG_{j} are orthogonal for any i,j(i){1,,K}i,j(\neq i)\in\{1,\ldots,K\}, i.e., EGi(t)Gj(t)=0{\rm E}G_{i}(t)G_{j}(t^{\prime})=0 for any t,tt,t^{\prime}\in\mathbb{Z}, k=|adj(k)|<\sum_{k=-\infty}^{\infty}|a_{dj}(k)|<\infty for any d{1,,K}d\in\{1,\ldots,K\} and j{1,,d}j\in\{1,\ldots,d\}, and the transfer function Add(eiλ):=k=add(k)eikλA_{dd}\left(e^{-\mathrm{i}\lambda}\right):=\sum_{k=-\infty}^{\infty}a_{dd}(k)e^{-\mathrm{i}k\lambda} satisfies Add(eiλ)0A_{dd}\left(e^{-\mathrm{i}\lambda}\right)\neq 0 for all λ[π,π]\lambda\in[-\pi,\pi] and d{1,,K}d\in\{1,\ldots,K\}. From the orthogonality EGd(t)Gd(t)=0{\rm E}G_{d}(t)G_{d^{\prime}}(t^{\prime})=0 for d{2,,K}d\in\{2,\ldots,K\}, d{1,,d1}d^{\prime}\in\{1,\ldots,d-1\}, and all t,tt,t^{\prime}\in\mathbb{Z}, it holds that

fGdGd(λ)=0\displaystyle f_{G_{d}G_{d^{\prime}}}(\lambda)=0
\displaystyle\Leftrightarrow j=1dj=1dAdj(eiλ)Adj(eiλ)fjj(λ)=0\displaystyle\sum_{j=1}^{d}\sum_{j^{\prime}=1}^{d^{\prime}}A_{dj}\left(e^{-i\lambda}\right)A_{d^{\prime}j^{\prime}}\left(e^{i\lambda}\right)f_{jj^{\prime}}(\lambda)=0
\displaystyle\Leftrightarrow Add(eiλ)j=1dAdj(eiλ)fjd(λ)=0\displaystyle A_{d^{\prime}d^{\prime}}\left(e^{i\lambda}\right)\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd^{\prime}}(\lambda)=0
\displaystyle\Leftrightarrow j=1dAdj(eiλ)fjd(λ)=0,\displaystyle\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd^{\prime}}(\lambda)=0,\quad (14)

and thus

𝒇d1(λ)𝑨d(eiλ)=Add(eiλ)𝒇d1,d(λ)¯,\displaystyle\bm{f}_{d-1}^{\top}(\lambda){\bm{A}_{d}^{\flat}}\left(e^{-i\lambda}\right)=-A_{dd}\left(e^{-i\lambda}\right)\overline{\bm{f}_{d-1,d}^{\flat}(\lambda)},

where

𝑨d(eiλ):=(Ad1(eiλ),,Ad(d1)(eiλ))and 𝒇a,b:=\displaystyle{\bm{A}_{d}^{\flat}}\left(e^{-i\lambda}\right):=\left({A}_{d1}\left(e^{-i\lambda}\right),\ldots,{A}_{d(d-1)}\left(e^{-i\lambda}\right)\right)^{\top}\quad\text{and }\bm{f}_{a,b}^{\flat}:= (f1b(λ),,fab(λ)).\displaystyle({f}_{1b}(\lambda),\ldots,{f}_{ab}(\lambda))^{\top}.

Since $\bm{f}_{K}(\lambda)$ is positive definite if and only if all of its principal sub-matrices are positive definite, we obtain

𝑨d(eiλ)=Add(eiλ)𝒇d11(λ)𝒇d1,d(λ)¯,\displaystyle{\bm{A}_{d}^{\flat}}\left(e^{-i\lambda}\right)=-A_{dd}\left(e^{-i\lambda}\right){\bm{f}_{d-1}^{\top}}^{-1}(\lambda)\overline{\bm{f}_{d-1,d}^{\flat}(\lambda)}, (15)

or equivalently by Cramer’s rule,

Adj(eiλ)=Add(eiλ)det(𝒇j,d(λ)¯)det(𝒇d1(λ))for j=1,,d1,\displaystyle A_{dj}\left(e^{-i\lambda}\right)=-A_{dd}\left(e^{-i\lambda}\right)\frac{{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)}{{\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)}\quad\text{for $j=1,\ldots,d-1$}, (16)

where

𝒇j,d(λ):=(𝒇d1,1(λ),,𝒇d1,j1(λ),𝒇d1,d(λ),𝒇d1,j+1(λ),,𝒇d1,d1(λ)).\displaystyle{\bm{f}}_{j,d}^{\ddagger}(\lambda):=\left({\bm{f}_{d-1,1}^{\flat}(\lambda)},\ldots,{\bm{f}_{d-1,j-1}^{\flat}(\lambda)},{\bm{f}_{d-1,d}^{\flat}(\lambda)},{\bm{f}_{d-1,j+1}^{\flat}(\lambda)},\ldots,{\bm{f}_{d-1,d-1}^{\flat}(\lambda)}\right).

From (15), (16), and

𝒇d(λ)=(𝒇d1(λ)𝒇d1,d(λ)𝒇d1,d(λ)fdd(λ)),\displaystyle\bm{f}_{d}(\lambda)=\begin{pmatrix}\bm{f}_{d-1}(\lambda)&\bm{f}_{d-1,d}^{\flat}(\lambda)\\ {\bm{f}_{d-1,d}^{\flat}}^{*}(\lambda)&{f}_{dd}(\lambda)\\ \end{pmatrix},

simple algebra yields

det(𝒇d(λ))=\displaystyle{\rm det}\left(\bm{f}_{d}(\lambda)\right)= det(𝒇d1(λ))det(fdd(λ)𝒇d1,d(λ)𝒇d11(λ)𝒇d1,d(λ))\displaystyle{\rm det}\left(\bm{f}_{d-1}(\lambda)\right){\rm det}\left({f}_{dd}(\lambda)-{\bm{f}_{d-1,d}^{\flat}}^{*}(\lambda){\bm{f}_{d-1}^{-1}}(\lambda)\bm{f}_{d-1,d}^{\flat}(\lambda)\right)
=\displaystyle= det(𝒇d1(λ))det(fdd(λ)𝒇d1,d(λ)𝒇d11(λ)𝒇d1,d(λ)¯)\displaystyle{\rm det}\left(\bm{f}_{d-1}(\lambda)\right){\rm det}\left({f}_{dd}(\lambda)-{\bm{f}_{d-1,d}^{\flat}}^{\top}(\lambda){{\bm{f}_{d-1}^{\top}}^{-1}}(\lambda)\overline{\bm{f}_{d-1,d}^{\flat}(\lambda)}\right)
=\displaystyle= det(𝒇d1(λ))det(fdd(λ)+(Add(eiλ))1𝒇d1,d(λ)𝑨d(eiλ))\displaystyle{\rm det}\left(\bm{f}_{d-1}(\lambda)\right){\rm det}\left({f}_{dd}(\lambda)+\left(A_{dd}\left(e^{-i\lambda}\right)\right)^{-1}{\bm{f}_{d-1,d}^{\flat}}^{\top}(\lambda){\bm{A}_{d}^{\flat}}\left(e^{-i\lambda}\right)\right)
=\displaystyle= det(𝒇d1(λ))det(fdd(λ)+(Add(eiλ))1j=1d1fjd(λ)Adj(eiλ))\displaystyle{\rm det}\left(\bm{f}_{d-1}(\lambda)\right){\rm det}\left({f}_{dd}(\lambda)+\left(A_{dd}\left(e^{-i\lambda}\right)\right)^{-1}\sum_{j=1}^{d-1}{f}_{jd}(\lambda){A}_{dj}\left(e^{-i\lambda}\right)\right)
=\displaystyle= det(𝒇d1(λ))det(fdd(λ)j=1d1fjd(λ)det(𝒇j,d(λ)¯)det(𝒇d1(λ)))\displaystyle{\rm det}\left(\bm{f}_{d-1}(\lambda)\right){\rm det}\left({f}_{dd}(\lambda)-\sum_{j=1}^{d-1}{f}_{jd}(\lambda)\frac{{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)}{{\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)}\right)
=\displaystyle= fdd(λ)det(𝒇d1(λ))j=1d1fjd(λ)det(𝒇j,d(λ)¯).\displaystyle{f}_{dd}(\lambda){\rm det}\left(\bm{f}_{d-1}(\lambda)\right)-\sum_{j=1}^{d-1}{f}_{jd}(\lambda){\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right). (17)
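The cofactor identity (17) is easy to check numerically for a generic Hermitian positive definite matrix standing in for $\bm{f}_{d}(\lambda)$ at a fixed frequency. The following sketch is illustrative only; the names `F`, `F_sub`, and `F_sharp` are ad hoc.

```python
import numpy as np

# Numerical check of identity (17):
#   det(f_d) = f_dd det(f_{d-1}) - sum_j f_{jd} det(conj(f^{ddagger}_{j,d})),
# where f^{ddagger}_{j,d} is f_{d-1} with its j-th column replaced by the
# column (f_{1d}, ..., f_{(d-1)d})^T.  F below plays the role of f_d(lambda).
rng = np.random.default_rng(0)
d = 5
M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
F = M @ M.conj().T + d * np.eye(d)      # Hermitian positive definite
F_sub = F[: d - 1, : d - 1]             # f_{d-1}
b = F[: d - 1, d - 1]                   # (f_{1d}, ..., f_{(d-1)d})^T

rhs = F[d - 1, d - 1] * np.linalg.det(F_sub)
for j in range(d - 1):
    F_sharp = F_sub.copy()
    F_sharp[:, j] = b                   # j-th column replaced
    rhs -= b[j] * np.linalg.det(F_sharp.conj())

print(np.linalg.det(F), rhs)           # the two values agree
```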

By the orthogonality ${\rm E}G_{d}(t)\overline{X_{0}(t^{\prime})}={\rm E}G_{d}(t)\overline{G_{d}(t^{\prime})}$ for $d\in\{2,\ldots,K\}$ and all $t,t^{\prime}\in\mathbb{Z}$, together with (14), (16), and (17), we obtain

fGd0(λ)=fGdGd(λ)\displaystyle f_{G_{d}0}(\lambda)=f_{G_{d}G_{d}}(\lambda)
\displaystyle\Leftrightarrow j=1dAdj(eiλ)fj0(λ)=j,j=1dAdj(eiλ)Adj(eiλ)fjj(λ)\displaystyle\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{j0}(\lambda)=\sum_{j,j^{\prime}=1}^{d}A_{dj}\left(e^{-i\lambda}\right)A_{dj^{\prime}}\left(e^{i\lambda}\right)f_{jj^{\prime}}(\lambda)
\displaystyle\Leftrightarrow j=1dAdj(eiλ)fj0(λ)=j=1dAdj(eiλ)j=1dAdj(eiλ)fjj(λ)\displaystyle\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{j0}(\lambda)=\sum_{j^{\prime}=1}^{d}A_{dj^{\prime}}\left(e^{i\lambda}\right)\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jj^{\prime}}(\lambda)
\displaystyle\Leftrightarrow j=1dAdj(eiλ)fj0(λ)=Add(eiλ)j=1dAdj(eiλ)fjd(λ)\displaystyle\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{j0}(\lambda)=A_{dd}\left(e^{i\lambda}\right)\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd}(\lambda)
\displaystyle\Leftrightarrow j=1d1Adj(eiλ)fj0(λ)+Add(eiλ)fd0(λ)\displaystyle\sum_{j=1}^{d-1}A_{dj}\left(e^{-i\lambda}\right)f_{j0}(\lambda)+A_{dd}\left(e^{-i\lambda}\right)f_{d0}(\lambda)
=Add(eiλ)(j=1d1Adj(eiλ)fjd(λ)+Add(eiλ)fdd(λ))\displaystyle=A_{dd}\left(e^{i\lambda}\right)\left(\sum_{j=1}^{d-1}A_{dj}\left(e^{-i\lambda}\right)f_{jd}(\lambda)+A_{dd}\left(e^{-i\lambda}\right)f_{dd}(\lambda)\right)
\displaystyle\Leftrightarrow j=1d1det(𝒇j,d(λ)¯)fj0(λ)+fd0(λ)det(𝒇d1(λ))\displaystyle-\sum_{j=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)f_{j0}(\lambda)+f_{d0}(\lambda){\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)
=Add(eiλ)(j=1d1det(𝒇j,d(λ)¯)fjd(λ)+fdd(λ)det(𝒇d1(λ)))\displaystyle=A_{dd}\left(e^{i\lambda}\right)\left(-\sum_{j=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)f_{jd}(\lambda)+f_{dd}(\lambda){\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)\right)
\displaystyle\Leftrightarrow Add(eiλ)=j=1d1det(𝒇j,d(λ)¯)fj0(λ)+fd0(λ)det(𝒇d1(λ))j=1d1det(𝒇j,d(λ)¯)fjd(λ)+fdd(λ)det(𝒇d1(λ))\displaystyle A_{dd}\left(e^{i\lambda}\right)=\frac{-\sum_{j=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)f_{j0}(\lambda)+f_{d0}(\lambda){\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)}{-\sum_{j=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)f_{jd}(\lambda)+f_{dd}(\lambda){\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)}
\displaystyle\Leftrightarrow Add(eiλ)=j=1d1det(𝒇j,d(λ)¯)fj0(λ)+fd0(λ)det(𝒇d1(λ))det(𝒇d(λ))\displaystyle A_{dd}\left(e^{i\lambda}\right)=\frac{-\sum_{j=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)f_{j0}(\lambda)+f_{d0}(\lambda){\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)}{{\rm det}\left(\bm{f}_{d}(\lambda)\right)}

which also gives, for d=2,,Kd=2,\ldots,K and d{1,,d1}d^{\prime}\in\{1,\ldots,d-1\},

Add(eiλ)=\displaystyle A_{dd^{\prime}}\left(e^{i\lambda}\right)= Add(eiλ)det(𝒇d,d(λ))det(𝒇d1(λ))\displaystyle-A_{dd}\left(e^{i\lambda}\right)\frac{{\rm det}\left(\bm{f}_{d^{\prime},d}^{\ddagger}(\lambda)\right)}{{\rm det}\left(\bm{f}_{d-1}(\lambda)\right)}
and fGdGd(λ)=\displaystyle\text{ and }f_{G_{d}G_{d}}(\lambda)= Add(eiλ)j=1dAdj(eiλ)fjd(λ)\displaystyle A_{dd}\left(e^{i\lambda}\right)\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd}(\lambda)
=\displaystyle= Add(eiλ)(j=1d1Adj(eiλ)fjd(λ)+Add(eiλ)fdd(λ))\displaystyle A_{dd}\left(e^{i\lambda}\right)\left(\sum_{j=1}^{d-1}A_{dj}\left(e^{-i\lambda}\right)f_{jd}(\lambda)+A_{dd}\left(e^{-i\lambda}\right)f_{dd}(\lambda)\right)
=\displaystyle= |Add(eiλ)|2det(𝒇d1(λ))(j=1d1det(𝒇j,d(λ)¯)fjd(λ)+fdd(λ)det(𝒇d1(λ)))\displaystyle\frac{\left|A_{dd}\left(e^{i\lambda}\right)\right|^{2}}{{\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)}\left(-\sum_{j=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{j,d}^{\ddagger}(\lambda)}\right)f_{jd}(\lambda)+f_{dd}(\lambda){\rm det}\left(\bm{f}_{d-1}^{\top}(\lambda)\right)\right)
=\displaystyle= |Add(eiλ)|2det(𝒇d1(λ))det(𝒇d(λ)).\displaystyle\frac{\left|A_{dd}\left(e^{i\lambda}\right)\right|^{2}}{{\rm det}\left(\bm{f}_{d-1}(\lambda)\right)}{\rm det}\left(\bm{f}_{d}(\lambda)\right).

For the case d=1d=1, we use the orthogonality EG1(t)X0(t)¯=EG1(t)G1(t)¯{\rm E}G_{1}(t)\overline{X_{0}(t^{\prime})}={\rm E}G_{1}(t)\overline{G_{1}(t^{\prime})} for all t,tt,t^{\prime}\in\mathbb{Z} and have

fG10(λ)=fG1G1(λ)\displaystyle f_{G_{1}0}(\lambda)=f_{G_{1}G_{1}}(\lambda)\Leftrightarrow A11(eiλ)f10(λ)=|A11(eiλ)|2f11(λ)\displaystyle A_{11}\left(e^{-i\lambda}\right)f_{10}(\lambda)=\left|A_{11}\left(e^{-i\lambda}\right)\right|^{2}f_{11}(\lambda)
\displaystyle\Leftrightarrow A11(eiλ)=f10(λ)f11(λ)\displaystyle A_{11}\left(e^{i\lambda}\right)=\frac{f_{10}(\lambda)}{f_{11}(\lambda)}

and

fG1G1(λ)=|f10(λ)|2f11(λ).\displaystyle f_{G_{1}G_{1}}(\lambda)=\frac{\left|f_{10}(\lambda)\right|^{2}}{f_{11}(\lambda)}.
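For $d=1$ this is the numerator of the squared coherence, and positive semi-definiteness of the $2\times 2$ spectral matrix guarantees $f_{G_{1}G_{1}}(\lambda)\leq f_{00}(\lambda)$. A minimal numerical sanity check of this Cauchy–Schwarz bound, with random Hermitian positive semi-definite matrices standing in for the spectral matrix at a fixed $\lambda$ (all choices below are illustrative):

```python
import numpy as np

# For any Hermitian PSD 2x2 matrix [[f00, f01], [f10, f11]] we must have
# |f10|^2 <= f00 * f11, i.e. the explained spectrum |f10|^2 / f11 never
# exceeds f00 and the squared coherence is at most 1.
rng = np.random.default_rng(1)
ratios = []
for _ in range(1000):
    M = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    S = M @ M.conj().T                  # Hermitian PSD, plays (f_ij(lambda))
    f00, f11, f10 = S[0, 0].real, S[1, 1].real, S[1, 0]
    ratios.append(abs(f10) ** 2 / (f00 * f11))

print(max(ratios))                      # stays at or below 1 (up to rounding)
```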

To complete the proof, we need to show that $A_{KK}\left(e^{-\mathrm{i}\lambda}\right)=0$ for all $\lambda\in[-\pi,\pi]$ if and only if

i=1K1det(𝒇i,K(λ)¯)fi0(λ)+det(𝒇K1(λ))fK0(λ)det(𝒇K(λ))=0for all λ[π,π].\displaystyle\frac{-\sum_{i=1}^{K-1}{\rm det}\left(\overline{\bm{f}_{i,K}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{K-1}(\lambda)\right)f_{K0}(\lambda)}{{\rm det}\left(\bm{f}_{K}(\lambda)\right)}=0\quad\text{for all $\lambda\in[-\pi,\pi]$}. (18)

First, we prove that AKK(eiλ)=0A_{KK}\left(e^{-\mathrm{i}\lambda}\right)=0 for all λ[π,π]\lambda\in[-\pi,\pi] is necessary for

i=1K1det(𝒇i,K(λ)¯)fi0(λ)+det(𝒇K1(λ))fK0(λ)det(𝒇K(λ))=0for all λ[π,π]\frac{-\sum_{i=1}^{K-1}{\rm det}\left(\overline{\bm{f}_{i,K}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{K-1}(\lambda)\right)f_{K0}(\lambda)}{{\rm det}\left(\bm{f}_{K}(\lambda)\right)}=0\quad\text{for all $\lambda\in[-\pi,\pi]$}

by contraposition. Suppose AKK(eiλ)0A_{KK}\left(e^{-\mathrm{i}\lambda}\right)\neq 0 for some λ[π,π]\lambda\in[-\pi,\pi], then from the above discussion we have

AKK(eiλ)=i=1K1det(𝒇i,K(λ)¯)fi0(λ)+det(𝒇K1(λ))fK0(λ)det(𝒇K(λ))A_{KK}\left(e^{-\mathrm{i}\lambda}\right)=\frac{-\sum_{i=1}^{K-1}{\rm det}\left(\overline{\bm{f}_{i,K}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{K-1}(\lambda)\right)f_{K0}(\lambda)}{{\rm det}\left(\bm{f}_{K}(\lambda)\right)}

which is not zero for some λ[π,π]\lambda\in[-\pi,\pi]. Next, we show AKK(eiλ)=0A_{KK}\left(e^{-\mathrm{i}\lambda}\right)=0 for all λ[π,π]\lambda\in[-\pi,\pi] is sufficient for

i=1K1det(𝒇i,K(λ)¯)fi0(λ)+det(𝒇K1(λ))fK0(λ)det(𝒇K(λ))=0for all λ[π,π].\frac{-\sum_{i=1}^{K-1}{\rm det}\left(\overline{\bm{f}_{i,K}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{K-1}(\lambda)\right)f_{K0}(\lambda)}{{\rm det}\left(\bm{f}_{K}(\lambda)\right)}=0\quad\text{for all $\lambda\in[-\pi,\pi]$}.

The condition $A_{KK}\left(e^{-\mathrm{i}\lambda}\right)=0$ for all $\lambda\in[-\pi,\pi]$ implies that $A_{K,j}(e^{-\mathrm{i}\lambda})=0$ for $j=1,\ldots,K-1$ and hence $G_{K}(t)=0$ for all $t\in\mathbb{Z}$. Further, setting $d=K$, for $j=1,\dots,d$ and any $t,t^{\prime}\in\mathbb{Z}$, ${\rm E}X_{0}(t)\overline{X_{j}(t^{\prime})}=\sum_{i=1}^{d-1}{\rm E}G_{i}(t)\overline{X_{j}(t^{\prime})}$ yields

f0,j(λ)=i=1d1k=id1Ak,i(eiλ)fi,j(λ),f_{0,j}(\lambda)=\sum_{i=1}^{d-1}\sum_{k=i}^{d-1}A_{k,i}(e^{-i\lambda})f_{i,j}(\lambda),

which gives that

j=1d1det(𝒇j,d(λ))f0,j(λ)+det(𝒇d1(λ))f0,d(λ)\displaystyle-\sum_{j=1}^{d-1}\det(\bm{f}_{j,d}^{{\ddagger}}(\lambda))f_{0,j}(\lambda)+\det(\bm{f}_{d-1}(\lambda))f_{0,d}(\lambda)
=\displaystyle= j=1d1det(𝒇j,d(λ))i=1d1k=id1Ak,i(eiλ)fi,j(λ)+det(𝒇d1(λ))i=1d1k=id1Ak,i(eiλ)fi,d(λ)\displaystyle-\sum_{j=1}^{d-1}\det(\bm{f}_{j,d}^{{\ddagger}}(\lambda))\sum_{i=1}^{d-1}\sum_{k=i}^{d-1}A_{k,i}(e^{-i\lambda})f_{i,j}(\lambda)+\det(\bm{f}_{d-1}(\lambda))\sum_{i=1}^{d-1}\sum_{k=i}^{d-1}A_{k,i}(e^{-i\lambda})f_{i,d}(\lambda)
=\displaystyle= i=1d1k=id1Ak,i(eiλ)(j=1d1det(𝒇j,d(λ))fi,j(λ)+det(𝒇d1(λ))fi,d(λ)).\displaystyle\sum_{i=1}^{d-1}\sum_{k=i}^{d-1}A_{k,i}(e^{-i\lambda})\left(-\sum_{j=1}^{d-1}\det(\bm{f}_{j,d}^{{\ddagger}}(\lambda))f_{i,j}(\lambda)+\det(\bm{f}_{d-1}(\lambda))f_{i,d}(\lambda)\right).

Therefore, it is sufficient to show

j=1d1det(𝒇j,d(λ))fi,j(λ)+det(𝒇d1(λ))fi,d(λ)=0-\sum_{j=1}^{d-1}\det(\bm{f}_{j,d}^{{\ddagger}}(\lambda))f_{i,j}(\lambda)+\det(\bm{f}_{d-1}(\lambda))f_{i,d}(\lambda)=0

for $i=1,\dots,d-1$. Denote by $\bm{f}^{(i,j)}$ the minor of a matrix $\bm{f}$ with the $i$th row and $j$th column eliminated, and note that $\det(\bm{f}_{j,d}^{{\ddagger}(i,j)}(\lambda))=\det(\bm{f}_{d-1}^{(i,j)}(\lambda))$. Then,

j=1d1det(𝒇j,d(λ))fi,j(λ)+det(𝒇d1(λ))fi,d(λ)\displaystyle-\sum_{j=1}^{d-1}\det(\bm{f}_{j,d}^{{\ddagger}}(\lambda))f_{i,j}(\lambda)+\det(\bm{f}_{d-1}(\lambda))f_{i,d}(\lambda)
=\displaystyle= j=1d1(h=1d1(1)j+hdet(𝒇j,d(h,j)(λ))fh,d(λ))fi,j(λ)\displaystyle-\sum_{j=1}^{d-1}\left(\sum_{h=1}^{d-1}(-1)^{j+h}\det(\bm{f}_{j,d}^{{\ddagger}(h,j)}(\lambda))f_{h,d}(\lambda)\right)f_{i,j}(\lambda)
+(j=1d1(1)j+idet(𝒇d1(i,j)(λ))fi,j(λ))fi,d(λ)\displaystyle+\left(\sum_{j=1}^{d-1}(-1)^{j+i}\det(\bm{f}_{d-1}^{(i,j)}(\lambda))f_{i,j}(\lambda)\right)f_{i,d}(\lambda)
=\displaystyle= h=1d1(j=1d1(1)j+hdet(𝒇j,d(h,j)(λ))fi,j(λ))fh,d(λ)\displaystyle-\sum_{h=1}^{d-1}\left(\sum_{j=1}^{d-1}(-1)^{j+h}\det(\bm{f}_{j,d}^{{\ddagger}(h,j)}(\lambda))f_{i,j}(\lambda)\right)f_{h,d}(\lambda)
+(j=1d1(1)j+idet(𝒇d1(i,j)(λ))fi,j(λ))fi,d(λ)\displaystyle+\left(\sum_{j=1}^{d-1}(-1)^{j+i}\det(\bm{f}_{d-1}^{(i,j)}(\lambda))f_{i,j}(\lambda)\right)f_{i,d}(\lambda)
=\displaystyle= h=1,hid1(j=1d1(1)j+hdet(𝒇j,d(h,j)(λ))fi,j(λ))fh,d(λ)\displaystyle-\sum_{h=1,h\neq i}^{d-1}\left(\sum_{j=1}^{d-1}(-1)^{j+h}\det(\bm{f}_{j,d}^{{\ddagger}(h,j)}(\lambda))f_{i,j}(\lambda)\right)f_{h,d}(\lambda)
=\displaystyle= h=1,hid1(j=1d1(1)j+hdet(𝒇d1(h,j)(λ))fi,j(λ))fh,d(λ).\displaystyle-\sum_{h=1,h\neq i}^{d-1}\left(\sum_{j=1}^{d-1}(-1)^{j+h}\det(\bm{f}_{d-1}^{(h,j)}(\lambda))f_{i,j}(\lambda)\right)f_{h,d}(\lambda). (19)

The determinant of the matrix

(f11f1(d1)f(h1)1f(h1)(d1)fi1fi(d1)f(h+1)1f(h+1)(d1)f(d1)1f(d1)(d1))\displaystyle\begin{pmatrix}f_{11}&\cdots&f_{1(d-1)}\\ \vdots&&\\ f_{(h-1)1}&\cdots&f_{(h-1)(d-1)}\\ f_{i1}&\cdots&f_{i(d-1)}\\ f_{(h+1)1}&\cdots&f_{(h+1)(d-1)}\\ \vdots&&\\ f_{(d-1)1}&\cdots&f_{(d-1)(d-1)}\\ \end{pmatrix}

can be expanded as j=1d1(1)j+hdet(𝒇d1(h,j)(λ))fi,j(λ)\sum_{j=1}^{d-1}(-1)^{j+h}\det(\bm{f}_{d-1}^{(h,j)}(\lambda))f_{i,j}(\lambda) but is zero since the ii-th row and hh-th row of the matrix are the same, which yields that (19) is zero.
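This vanishing-cofactor step can be verified numerically: the alternating sum below is the cofactor expansion, along its $h$th row, of a matrix whose $h$th row duplicates its $i$th row, so it must be zero. The matrix and the indices $i$, $h$ are arbitrary illustrative choices.

```python
import numpy as np

# Check: sum_j (-1)^{j+h} det(minor(h, j)) * F[i, j], with i != h, is the
# cofactor expansion along row h of a matrix whose row h equals row i,
# hence it vanishes (this is the step that makes (19) zero).
rng = np.random.default_rng(2)
m = 5
F = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
i, h = 1, 3                              # arbitrary distinct row indices

expansion = 0.0
for j in range(m):
    minor = np.delete(np.delete(F, h, axis=0), j, axis=1)
    expansion += (-1) ** (j + h) * np.linalg.det(minor) * F[i, j]

G = F.copy()
G[h, :] = F[i, :]                        # duplicate row i into row h
print(abs(expansion), abs(np.linalg.det(G)))
```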

Second, we show the existence of the processes {Gd(t)}\{G_{d}(t)\}. To this end, we shall confirm that {Gd(t)}\{G_{d}(t)\} defined by (5) with adja_{dj} whose transfer function is given by (6) for d=j=1d=j=1, (7) for d=j{2,,K}d=j\in\{2,\ldots,K\}, (8) for d{2,,K}d\in\{2,\ldots,K\} and j{2,,d1}j\in\{2,\ldots,d-1\} satisfies (i) GiG_{i} and GjG_{j} are orthogonal for any i,j(i){1,,K}i,j(\neq i)\in\{1,\ldots,K\}, (ii) X0(t)=ζ+j=1KGj(t)+ϵ(t)X_{0}(t)=\zeta+\sum_{j=1}^{K}G_{j}(t)+\epsilon(t), (iii) k=|adj(k)|<\sum_{k=-\infty}^{\infty}|a_{dj}(k)|<\infty for any d{1,,K}d\in\{1,\ldots,K\} and j{1,,d}j\in\{1,\ldots,d\}, and (iv) Add(eiλ)0A_{dd}\left(e^{-\mathrm{i}\lambda}\right)\neq 0 for all λ[π,π]\lambda\in[-\pi,\pi] and d{1,,K}d\in\{1,\ldots,K\}.

For dKd\leq K and jd1j\leq d-1, Cramer’s rule tells us that AdjA_{dj} satisfies

𝒇d1(λ)𝑨d(eiλ)=Add(eiλ)𝒇d1,d(λ)¯,\displaystyle\bm{f}_{d-1}^{\top}(\lambda){\bm{A}_{d}^{\flat}}\left(e^{-i\lambda}\right)=-A_{dd}\left(e^{-i\lambda}\right)\overline{\bm{f}_{d-1,d}^{\flat}(\lambda)},

and thus we have, for d=1,,d1d^{\prime}=1,\ldots,d-1,

j=1dAdj(eiλ)fjd(λ)=0\displaystyle\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd^{\prime}}(\lambda)=0
\displaystyle\Leftrightarrow Add(eiλ)j=1dAdj(eiλ)fjd(λ)=0\displaystyle A_{d^{\prime}d^{\prime}}\left(e^{i\lambda}\right)\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd^{\prime}}(\lambda)=0
\displaystyle\Leftrightarrow j=1dj=1dAdj(eiλ)Adj(eiλ)fjj(λ)=0\displaystyle\sum_{j=1}^{d}\sum_{j^{\prime}=1}^{d^{\prime}}A_{dj}\left(e^{-i\lambda}\right)A_{d^{\prime}j^{\prime}}\left(e^{i\lambda}\right)f_{jj^{\prime}}(\lambda)=0
\displaystyle\Leftrightarrow fGdGd(λ)=0,\displaystyle f_{G_{d}G_{d^{\prime}}}(\lambda)=0,

which shows (i).

In order to verify (ii), we show Bj(eiλ)=k=jdAk,j(eiλ)B_{j}\left(e^{-i\lambda}\right)=\sum_{k=j}^{d}A_{k,j}\left(e^{-i\lambda}\right). The orthogonality of GdG_{d}’s gives j=1dAdj(eiλ)fjd(λ)=0,\sum_{j=1}^{d}A_{dj}\left(e^{-i\lambda}\right)f_{jd^{\prime}}(\lambda)=0, which is equivalent to EGd(t)Y¯d(t)=0{\rm E}G_{d}(t)\bar{Y}_{d^{\prime}}(t^{\prime})=0 for any t,tt,t^{\prime}\in\mathbb{Z}, dKd\leq K, and d=1,,d1d^{\prime}=1,\ldots,d-1. From EY0(t)G¯d(t)=EGd(t)G¯d(t){\rm E}Y_{0}(t)\bar{G}_{d}(t^{\prime})={\rm E}G_{d}(t)\bar{G}_{d}(t^{\prime}), it holds that

j=1dAdj(eiλ)f0j(λ)=\displaystyle\sum_{j=1}^{d}A_{dj}\left(e^{i\lambda}\right)f_{0j}(\lambda)= Bd(eiλ)j=1dAdj(eiλ)fdj(λ).\displaystyle B_{d}\left(e^{-i\lambda}\right)\sum_{j=1}^{d}A_{dj}(e^{i\lambda})f_{dj}(\lambda). (20)

By noting that det(𝒇d(λ))=i=1d1det(𝒇i,d(λ)¯)fid(λ)+det(𝒇d1(λ))fdd(λ){\rm det}\left(\bm{f}_{d}(\lambda)\right)=-\sum_{i=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{i,d}^{\ddagger}(\lambda)}\right)f_{id}(\lambda)+{\rm det}\left(\bm{f}_{d-1}(\lambda)\right)f_{dd}(\lambda), (20) yields that

Bd(eiλ):=\displaystyle B_{d}\left(e^{-i\lambda}\right):= j=1dAdj(eiλ)f0j(λ)j=1dAdj(eiλ)fdj(λ)\displaystyle\frac{\sum_{j=1}^{d}A_{dj}\left(e^{i\lambda}\right)f_{0j}(\lambda)}{\sum_{j=1}^{d}A_{dj}(e^{i\lambda})f_{dj}(\lambda)}
=\displaystyle= i=1d1det(𝒇i,d(λ)¯)fi0(λ)+det(𝒇d1(λ))fd0(λ)i=1d1det(𝒇i,d(λ)¯)fid(λ)+det(𝒇d1(λ))fdd(λ)\displaystyle\frac{-\sum_{i=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{i,d}^{\ddagger}(\lambda)}\right)f_{i0}(\lambda)+{\rm det}\left(\bm{f}_{d-1}(\lambda)\right)f_{d0}(\lambda)}{-\sum_{i=1}^{d-1}{\rm det}\left(\overline{\bm{f}_{i,d}^{\ddagger}(\lambda)}\right)f_{id}(\lambda)+{\rm det}\left(\bm{f}_{d-1}(\lambda)\right)f_{dd}(\lambda)}
=\displaystyle= Add(eiλ).\displaystyle A_{dd}\left(e^{-i\lambda}\right).

Consider EY0(t)G¯d1(t)=EGd1(t)G¯d1(t){\rm E}Y_{0}(t)\bar{G}_{d-1}(t^{\prime})={\rm E}G_{d-1}(t)\bar{G}_{d-1}(t^{\prime}), then we obtain

j=1d1A(d1)j(eiλ)f0j(λ)\displaystyle\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{0j}(\lambda)
=\displaystyle= Bd1(eiλ)j=1d1A(d1)j(eiλ)fd1j(λ)+Add(eiλ)j=1d1A(d1)j(eiλ)fdj(λ).\displaystyle B_{d-1}\left(e^{-i\lambda}\right)\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{d-1j}(\lambda)+A_{dd}\left(e^{-i\lambda}\right)\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{dj}(\lambda).

Somewhat lengthy calculations give

Bd1(eiλ)\displaystyle B_{d-1}\left(e^{-i\lambda}\right)
=\displaystyle= j=1d1A(d1)j(eiλ)f0j(λ)j=1d1A(d1)j(eiλ)f(d1)j(λ)Add(eiλ)j=1d1A(d1)j(eiλ)fdj(λ)j=1d1A(d1)j(eiλ)f(d1)j(λ)\displaystyle\frac{\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{0j}(\lambda)}{\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{(d-1)j}(\lambda)}-A_{dd}\left(e^{-i\lambda}\right)\frac{\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{dj}(\lambda)}{\sum_{j=1}^{d-1}A_{(d-1)j}\left(e^{i\lambda}\right)f_{(d-1)j}(\lambda)}
=\displaystyle= Ad1,d1(eiλ)+Ad,d1(eiλ).\displaystyle A_{d-1,d-1}\left(e^{-i\lambda}\right)+A_{d,d-1}\left(e^{-i\lambda}\right).

In the same manner, E(Y0(t)Gd(t))G¯d1(t){\rm E}(Y_{0}(t)-G_{d}(t))\bar{G}_{d-1}(t^{\prime}) and E(Y0(t)Gd(t))G¯d2(t){\rm E}(Y_{0}(t)-G_{d}(t))\bar{G}_{d-2}(t^{\prime}) imply

Bd1(eiλ)=\displaystyle B_{d-1}^{*}\left(e^{-i\lambda}\right)= A(d1)(d1)(eiλ)\displaystyle A_{(d-1)(d-1)}\left(e^{-i\lambda}\right)
Bd2(eiλ)=\displaystyle B_{d-2}^{*}\left(e^{-i\lambda}\right)= A(d2)(d2)(eiλ)+A(d1)(d2)(eiλ),\displaystyle A_{(d-2)(d-2)}\left(e^{-i\lambda}\right)+A_{(d-1)(d-2)}\left(e^{-i\lambda}\right),

respectively, where Bj(eiλ):=Bj(eiλ)Adj(eiλ)B_{j}^{*}\left(e^{-i\lambda}\right):=B_{j}\left(e^{-i\lambda}\right)-A_{dj}\left(e^{-i\lambda}\right). Moreover, E(Y0(t)Gd(t)Gd1(t))G¯d2(t){\rm E}(Y_{0}(t)-G_{d}(t)-G_{d-1}(t))\bar{G}_{d-2}(t^{\prime}) and E(Y0(t)Gd(t)Gd1(t))G¯d3(t){\rm E}(Y_{0}(t)-G_{d}(t)-G_{d-1}(t))\bar{G}_{d-3}(t^{\prime}) imply

Bd1(eiλ)=\displaystyle B_{d-1}^{**}\left(e^{-i\lambda}\right)= A(d2)(d2)(eiλ)\displaystyle A_{(d-2)(d-2)}\left(e^{-i\lambda}\right)
Bd2(eiλ)=\displaystyle B_{d-2}^{**}\left(e^{-i\lambda}\right)= A(d3)(d3)(eiλ)+A(d2)(d3)(eiλ),\displaystyle A_{(d-3)(d-3)}\left(e^{-i\lambda}\right)+A_{(d-2)(d-3)}\left(e^{-i\lambda}\right),

respectively, where Bj(eiλ):=Bj(eiλ)A(d1)j(eiλ)B_{j}^{**}\left(e^{-i\lambda}\right):=B_{j}^{*}\left(e^{-i\lambda}\right)-A_{(d-1)j}\left(e^{-i\lambda}\right). By repeating the argument, we obtain Bj(eiλ)=k=jdAk,j(eiλ)B_{j}\left(e^{-i\lambda}\right)=\sum_{k=j}^{d}A_{k,j}\left(e^{-i\lambda}\right).

The summability of the linear filters in (iii) follows from the fact that each $A_{ij}$ is twice continuously differentiable (see Katznelson, 2004, Section 4.4, p. 25).

The condition (iv) follows directly from the assumption of Theorem 2.1. ∎

A.3 Proof of Theorem 3.1

Since $T_{n}$ and $\hat{\mu}_{n,K}$ are functionals of $\hat{\bm{f}}(\lambda)$, we write them as $T\left(\hat{\bm{f}}(\lambda)\right)$ and $\mu\left(\hat{\bm{f}}(\lambda)\right)$, respectively. First, we show that the effect of estimating ${\rm E}X_{i}$ in the kernel spectral density estimator is asymptotically negligible, that is,

\displaystyle T\left(\hat{\bm{f}}(\lambda)\right)-T\left(\hat{\hat{\bm{f}}}(\lambda)\right)=o_{p}(1)\quad\text{as } n\to\infty, (21)

where

f^^ij(λ):=12πh=1nn1ω(hMn)γ^^ij(h)eihλ\displaystyle\hat{\hat{f}}_{ij}(\lambda):=\frac{1}{2\pi}\sum_{h=1-n}^{n-1}\omega\left(\frac{h}{M_{n}}\right)\hat{\hat{{\gamma}}}_{ij}(h)e^{-\mathrm{i}h\lambda}

with, for h{0,,n1}h\in\{0,\ldots,n-1\},

γ^^ij(h):=1nht=1nh(Xi(t+h)EXi(t+h))(Xj(t)EXj(t)),\hat{\hat{\gamma}}_{ij}(h):=\frac{1}{n-h}\sum_{t=1}^{n-h}(X_{i}{(t+h)}-{\rm E}X_{i}{(t+h)})({X}_{j}({t})-{\rm E}X_{j}{(t)}),

for h{n+1,,1}h\in\{-n+1,\ldots,-1\},

γ^^ij(h):=1n+ht=h+1n(Xi(t+h)EXi(t+h))(Xj(t)EXj(t)).\hat{\hat{\gamma}}_{ij}(h):=\frac{1}{n+{h}}\sum_{t=-h+1}^{n}(X_{i}{(t+h)}-{\rm E}X_{i}{(t+h)})({X}_{j}({t})-{\rm E}X_{j}{(t)}).

From Assumption 3.1 (A2) and the fact that |γ^^ij(h)γ^ij(h)|=Op(1/n)\left|\hat{\hat{\gamma}}_{ij}(h)-{\hat{\gamma}}_{ij}(h)\right|=O_{p}(1/n) uniformly in h{n+1,,n1}h\in\{-n+1,\ldots,n-1\} (see, e.g., Politis, 2011, Remark 3.1), we obtain

supλ[π,π]|f^ij(λ)f^^ij(λ)|\displaystyle\sup_{\lambda\in[-\pi,\pi]}\left|{\hat{f}}_{ij}(\lambda)-\hat{\hat{f}}_{ij}(\lambda)\right|\leq 12πh=1nn1ω(hMn)|γ^^ij(h)γ^ij(h)|\displaystyle\frac{1}{2\pi}\sum_{h=1-n}^{n-1}\omega\left(\frac{h}{M_{n}}\right)\left|\hat{\hat{\gamma}}_{ij}(h)-{\hat{\gamma}}_{ij}(h)\right|
=\displaystyle= Op(Mnn).\displaystyle O_{p}\left(\frac{M_{n}}{n}\right). (22)
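The $O_{p}(1/n)$ effect of centering by the sample mean instead of the true mean can be illustrated numerically; the AR(1) model, sample size, and lag range below are arbitrary choices for illustration.

```python
import numpy as np

# Illustration of |hat-hat-gamma(h) - hat-gamma(h)| = O_p(1/n): replacing
# the known mean (here zero) by the sample mean perturbs every lagged
# autocovariance by a term of order 1/n, far smaller than the estimates
# themselves.  AR(1) data; all tuning choices are ad hoc.
rng = np.random.default_rng(3)
n, phi = 4000, 0.5
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

xbar = x.mean()
diffs = []
for h in range(50):
    a, b = x[h:], x[: n - h]
    g_hat = np.mean((a - xbar) * (b - xbar))   # sample-mean centred
    g_hathat = np.mean(a * b)                  # true-mean (zero) centred
    diffs.append(abs(g_hat - g_hathat))

print(max(diffs))                              # roughly of size 1/n
```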

Then, Lipschitz continuity of μ\mu gives

P(Mn|μ(𝒇^(λ))μ(𝒇^^(λ))|>ϵ)\displaystyle{\rm P}\left(M_{n}\left|\mu\left(\hat{\bm{f}}(\lambda)\right)-\mu\left(\hat{\hat{\bm{f}}}{(\lambda)}\right)\right|>\epsilon\right)
\displaystyle\leq P(MnLμππ𝒇^(λ)𝒇^^(λ)2dλ>ϵ,supλ[π,π]𝒇^(λ)𝒇(λ)2δ)\displaystyle{\rm P}\left(M_{n}L_{\mu}\int_{-\pi}^{\pi}\left\|\hat{\bm{f}}(\lambda)-\hat{\hat{\bm{f}}}(\lambda)\right\|_{2}{\rm d}\lambda>\epsilon,\sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}\leq\delta\right)
+P(supλ[π,π]𝒇^(λ)𝒇(λ)2>δ)\displaystyle+{\rm P}\left(\sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}>\delta\right)
\displaystyle\leq P(MnLμππ𝒇^(λ)𝒇^^(λ)2dλ>ϵ)+P(supλ[π,π]𝒇^(λ)𝒇(λ)2>δ),\displaystyle{\rm P}\left(M_{n}L_{\mu}\int_{-\pi}^{\pi}\left\|\hat{\bm{f}}(\lambda)-\hat{\hat{\bm{f}}}(\lambda)\right\|_{2}{\rm d}\lambda>\epsilon\right)+{\rm P}\left(\sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}>\delta\right),

where $L_{\mu}$ is the Lipschitz constant of $\mu$. Robinson (1991, Theorem 2.1) and (22) yield that

Mn(μ(𝒇^(λ))μ(𝒇^^(λ)))=op(1)as n.\displaystyle M_{n}\left(\mu\left(\hat{\bm{f}}(\lambda)\right)-\mu\left(\hat{\hat{\bm{f}}}{(\lambda)}\right)\right)=o_{p}(1)\quad\text{as $n\to\infty$.} (23)

In addition, Theorems 5.9.1 and 7.4.1–7.4.4 of Brillinger (1981) yield that

f^ij(λ)fˇij(λ)=\displaystyle{\hat{f}}_{ij}(\lambda)-{\check{f}}_{ij}(\lambda)= Op(Mn2n+Mnnlogn)uniformly in λ[π,π]\displaystyle O_{p}\left(\frac{M_{n}^{2}}{n}+\frac{M_{n}}{n}\log n\right)\quad\text{uniformly in $\lambda\in[-\pi,\pi]$}
and fˇij(λ)fij(λ)=\displaystyle\text{and }{\check{f}}_{ij}(\lambda)-{f}_{ij}(\lambda)= Op(Mnn+1Mn2)uniformly in λ[π,π],\displaystyle O_{p}\left(\sqrt{\frac{M_{n}}{n}}+\frac{1}{M_{n}^{2}}\right)\quad\text{uniformly in $\lambda\in[-\pi,\pi]$},

where fˇij(λ){\check{f}}_{ij}(\lambda) is the discretized version of the smoothed periodogram defined as

fˇij(λ):=2πnus=1nu1W(n)(λ2πsnu)Iij(2πsnu)\displaystyle{\check{f}}_{ij}(\lambda):=\frac{2\pi}{n-u}\sum_{s=1}^{n-u-1}W^{(n)}\left(\lambda-\frac{2\pi s}{n-u}\right)I_{ij}\left(\frac{2\pi s}{n-u}\right)

with W(n)(λ):=Mnj=W(Mn(λ+2πj))W^{(n)}(\lambda):=M_{n}\sum_{j=-\infty}^{\infty}W\left(M_{n}(\lambda+2\pi j)\right), and

Iij(λ):=12πn(t=1nYi(t)eiλt)(t=1nYj(t)eiλt).I_{ij}(\lambda):=\frac{1}{2\pi n}\left(\sum_{t=1}^{n}Y_{i}(t)e^{-\mathrm{i}\lambda t}\right)\left(\sum_{t=1}^{n}Y_{j}(t)e^{\mathrm{i}\lambda t}\right).
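At the Fourier frequencies $2\pi s/n$, the cross-periodogram $I_{ij}$ can be evaluated with the FFT; the phase factor caused by summing over $t=1,\ldots,n$ rather than $t=0,\ldots,n-1$ cancels in the product $d_i(\lambda)\overline{d_j(\lambda)}$. A minimal sketch (illustration only, real-valued series assumed):

```python
import numpy as np

def cross_periodogram(Yi, Yj):
    """Cross-periodogram I_ij at the Fourier frequencies 2*pi*s/n.

    For real series, d_i(lam) = sum_t Y_i(t) e^{-i lam t} is (up to a
    phase that cancels in the product) the FFT of the series, and
    I_ij(lam) = d_i(lam) * conj(d_j(lam)) / (2 pi n).
    """
    n = len(Yi)
    di = np.fft.fft(Yi)   # values at 2*pi*s/n, s = 0, ..., n-1
    dj = np.fft.fft(Yj)
    return di * np.conj(dj) / (2 * np.pi * n)
```

The Hermitian property $I_{ij}(\lambda)=\overline{I_{ji}(\lambda)}$ and the nonnegativity of $I_{ii}$ follow immediately from this product form.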

Hence, it holds that

f^ij(λ)fij(λ)=Op(Mnn)uniformly in λ[π,π].\displaystyle{\hat{f}}_{ij}(\lambda)-{f}_{ij}(\lambda)=O_{p}\left(\sqrt{\frac{M_{n}}{n}}\right)\quad\text{uniformly in $\lambda\in[-\pi,\pi]$}. (24)

From (22), (24), and Eichler (2008, Lemma 3.4), it can be seen that

\begin{aligned}
&\frac{n}{\sqrt{M_{n}}}\int_{-\pi}^{\pi}\left|\Phi_{K}\left(\hat{\bm{f}}(\lambda)\right)\right|^{2}{\rm d}\lambda-\frac{n}{\sqrt{M_{n}}}\int_{-\pi}^{\pi}\left|\Phi_{K}\left(\hat{\hat{\bm{f}}}(\lambda)\right)\right|^{2}{\rm d}\lambda\\
=\;&\frac{n}{\sqrt{M_{n}}}\int_{-\pi}^{\pi}{\rm vec}\left(\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right)^{*}\Gamma_{\Phi_{K}}(\lambda){\rm vec}\left(\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right)\\
&-{\rm vec}\left(\hat{\hat{\bm{f}}}(\lambda)-{\bm{f}}(\lambda)\right)^{*}\Gamma_{\Phi_{K}}(\lambda){\rm vec}\left(\hat{\hat{\bm{f}}}(\lambda)-{\bm{f}}(\lambda)\right){\rm d}\lambda+o_{p}(1)\\
=\;&\frac{n}{\sqrt{M_{n}}}\int_{-\pi}^{\pi}-{\rm vec}\left(\hat{\bm{f}}(\lambda)-\hat{\hat{\bm{f}}}(\lambda)\right)^{*}\Gamma_{\Phi_{K}}(\lambda){\rm vec}\left(\hat{\bm{f}}(\lambda)-\hat{\hat{\bm{f}}}(\lambda)\right)\\
&+{\rm vec}\left(\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right)^{*}\Gamma_{\Phi_{K}}(\lambda){\rm vec}\left(\hat{\bm{f}}(\lambda)-\hat{\hat{\bm{f}}}(\lambda)\right)\\
&+{\rm vec}\left(\hat{\bm{f}}(\lambda)-\hat{\hat{\bm{f}}}(\lambda)\right)^{*}\Gamma_{\Phi_{K}}(\lambda){\rm vec}\left(\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right){\rm d}\lambda+o_{p}(1)\\
=\;&O_{p}\left(\frac{M_{n}^{3/2}}{n}+\frac{M_{n}}{n^{1/2}}\right)+o_{p}(1),
\end{aligned} \quad (25)

where

ΓΦK(λ):=vec(ΦK(𝒁)¯𝒁)vec(ΦK(𝒁)𝒁)|𝒁=𝒇(λ).\displaystyle\Gamma_{\Phi_{K}}(\lambda):={\rm vec}\left.\left(\frac{\overline{\partial\Phi_{K}\left({\bm{Z}}\right)}}{\partial{\bm{Z}}}\right){\rm vec}\left(\frac{{\partial\Phi_{K}\left({\bm{Z}}\right)}}{\partial{\bm{Z}}}\right)^{\top}\right|_{{\bm{Z}}={\bm{f}}(\lambda)}.

Thus (21) follows from (23) and (25).

Second, we show the asymptotic normality of $T\left(\hat{\hat{\bm{f}}}(\lambda)\right)$. To this end, we check conditions (i)–(iv) of Eichler (2008, Assumption 3.2), that is, for the function $\Phi_{K}\left(\cdot\right)$ from the open set $\bm{D}:=\{{\bm{Z}}=(z_{ij})_{i,j=0,\ldots,K}\in\mathbb{C}^{(K+1)\times(K+1)};\|{\bm{Z}}\|<2L_{D}\}$ to $\mathbb{C}$ such that

ΦK(𝒁):=\displaystyle\Phi_{K}\left(\bm{Z}\right):= i=1K1det(𝒁i,K¯)zi0+det(𝒁K1)zK0,\displaystyle-\sum_{i=1}^{K-1}{\rm det}\left(\overline{{\bm{Z}}_{i,K}^{\ddagger}}\right)z_{i0}+{\rm det}\left({\bm{Z}}_{K-1}\right)z_{K0},

where 𝒁K1:=(zij)i,j=1,,(K1){\bm{Z}}_{K-1}:=(z_{ij})_{i,j=1,\ldots,(K-1)} and

𝒁i,K:=(z11z1(i1)z1Kz1(i+1)z1(K1)z(K1)1z(K1)(i1)z(K1)Kz(K1)(i+1)z(K1)(K1)),\displaystyle{\bm{Z}}_{i,K}^{\ddagger}:=\begin{pmatrix}z_{11}&\cdots&z_{1(i-1)}&z_{1K}&z_{1(i+1)}&\cdots&z_{1(K-1)}\\ \vdots&\vdots&\vdots&\vdots&\vdots&\vdots&\vdots\\ z_{(K-1)1}&\cdots&z_{(K-1)(i-1)}&z_{(K-1)K}&z_{(K-1)(i+1)}&\cdots&z_{(K-1)(K-1)}\\ \end{pmatrix},
(i) The function $\Phi_{K}$ is holomorphic.

(ii) The functions $\Phi_{K}(\bm{f}(\lambda))$ and $\left.{\rm vec}\left(\frac{\partial\Phi_{K}\left({\bm{Z}}\right)}{\partial{\bm{Z}}}\right)\right|_{{\bm{Z}}={\bm{f}}(\lambda)}$ are Lipschitz continuous with respect to $\lambda\in[-\pi,\pi]$.

(iii) It holds that, for $B_{L_{D},\lambda}:=\{{\bm{Z}}\in\mathbb{C}^{(K+1)\times(K+1)};\|{\bm{Z}}-{\bm{f}}(\lambda)\|<L_{D}\}$,

$$\sup_{\lambda\in[-\pi,\pi]}\sup_{\bm{Z}\in B_{L_{D},\lambda}}\left\|\Phi_{K}\left({\bm{Z}}\right)\right\|<\infty.$$
(iv) The following holds true:

$$\int_{-\pi}^{\pi}\left\|\left.{\rm vec}\left(\frac{\partial\Phi_{K}\left({\bm{Z}}\right)}{\partial{\bm{Z}}}\right)\right|_{{\bm{Z}}={\bm{f}}(\lambda)}\right\|{\rm d}\lambda>0.$$
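To make the determinant structure of $\Phi_{K}$ concrete, the following sketch (illustration only; 0-based indexing) evaluates $\Phi_{K}(\bm{Z})$ directly from its definition, replacing the $i$-th column of $\bm{Z}_{K-1}$ by the corresponding entries of the $K$-th column:

```python
import numpy as np

def phi_K(Z):
    """Evaluate Phi_K(Z) = -sum_{i=1}^{K-1} det(conj(Z_{i,K}^#)) z_{i0}
                           + det(Z_{K-1}) z_{K0}
    for a complex (K+1) x (K+1) matrix Z with rows/columns indexed 0..K.
    """
    K = Z.shape[0] - 1
    ZK1 = Z[1:K, 1:K]                    # Z_{K-1} = (z_ij), i,j = 1..K-1
    val = np.linalg.det(ZK1) * Z[K, 0]   # det(Z_{K-1}) z_{K0}; det of the
                                         # empty matrix (K=1) is 1
    for i in range(1, K):
        Zi = ZK1.copy()
        Zi[:, i - 1] = Z[1:K, K]         # column i replaced by (z_{1K},...,z_{(K-1)K})
        val -= np.linalg.det(np.conj(Zi)) * Z[i, 0]
    return val
```

For $K=1$ this reduces to $\Phi_{1}(\bm{Z})=z_{10}$, and for $K=2$ to $-\overline{z_{12}}\,z_{10}+z_{11}z_{20}$, which can be checked directly.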

Conditions (i) and (ii) follow from the definition of $\Phi_{K}$. Condition (iii) can be shown as follows:

supλ[π,π]sup𝒁BLD,λΦK(𝒁)\displaystyle\sup_{\lambda\in[-\pi,\pi]}\sup_{\bm{Z}\in B_{L_{D},\lambda}}\left\|\Phi_{K}\left({\bm{Z}}\right)\right\|
\displaystyle\leq supλ[π,π]sup𝒁BLD,λΦK(𝒁)ΦK(𝒇(λ))+supλ[π,π]ΦK(𝒇(λ))\displaystyle\sup_{\lambda\in[-\pi,\pi]}\sup_{\bm{Z}\in B_{L_{D},\lambda}}\left\|\Phi_{K}\left({\bm{Z}}\right)-\Phi_{K}\left({\bm{f}}(\lambda)\right)\right\|+\sup_{\lambda\in[-\pi,\pi]}\left\|\Phi_{K}\left({\bm{f}}(\lambda)\right)\right\|
\displaystyle\leq supλ[π,π]sup𝒁BLD,λLΦK𝒁𝒇(λ)+supλ[π,π]ΦK(𝒇(λ))\displaystyle\sup_{\lambda\in[-\pi,\pi]}\sup_{\bm{Z}\in B_{L_{D},\lambda}}L_{\Phi_{K}}\left\|{{\bm{Z}}}-{{\bm{f}}(\lambda)}\right\|+\sup_{\lambda\in[-\pi,\pi]}\left\|\Phi_{K}\left({\bm{f}}(\lambda)\right)\right\|
\displaystyle\leq LΦKLD+supλ[π,π]ΦK(𝒇(λ))\displaystyle L_{\Phi_{K}}L_{D}+\sup_{\lambda\in[-\pi,\pi]}\left\|\Phi_{K}\left({\bm{f}}(\lambda)\right)\right\|
<\displaystyle< .\displaystyle\infty.

Since

(ΦK(𝒁)zK0)|𝒁=𝒇(λ)=det(𝒇K1(λ))0,\left.\left(\frac{{\partial\Phi_{K}\left({\bm{Z}}\right)}}{\partial{z_{K0}}}\right)\right|_{{\bm{Z}}={\bm{f}}(\lambda)}={\rm det}\left({\bm{f}}_{K-1}(\lambda)\right)\neq 0,

(iv) holds. Eichler (2008, Corollary 3.6) yields the asymptotic normality of $T\left(\hat{\hat{\bm{f}}}(\lambda)\right)$. Lengthy algebra gives the bias $\mu_{K}$ and the asymptotic variance $\sigma_{K}^{2}$ as, for $K=1$,

$$\mu_{1}:=\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}f_{11}(\lambda)f_{00}(\lambda)\,{\rm d}\lambda\quad\text{and}\quad\sigma_{1}^{2}:=4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|f_{11}(\lambda)f_{00}(\lambda)\right|^{2}{\rm d}\lambda$$

and, for K2K\geq 2,

\begin{aligned}
\mu_{K}:=\;&\sqrt{M_{n}}\,\eta_{\omega,2}\int_{-\pi}^{\pi}(\det{\bm{f}_{K-1}(\lambda)})^{2}\left(f_{00}(\lambda)-\overline{{\bm{f}}_{K-1,0}^{\flat\top}}(\lambda)\bm{f}_{K-1}^{-1}(\lambda){{\bm{f}}_{K-1,0}^{\flat}}(\lambda)\right)\\
&\qquad\qquad\times\left(f_{KK}(\lambda)-\overline{{\bm{f}}_{K-1,K}^{\flat\top}}(\lambda)\bm{f}_{K-1}^{-1}(\lambda){\bm{f}}_{K-1,K}^{\flat}(\lambda)\right){\rm d}\lambda\\
\text{and }\sigma_{K}^{2}:=\;&4\pi\eta_{\omega,4}\int_{-\pi}^{\pi}\left|(\det{\bm{f}_{K-1}(\lambda)})^{2}\left(f_{00}(\lambda)-\overline{{\bm{f}}_{K-1,0}^{\flat\top}}(\lambda)\bm{f}_{K-1}^{-1}(\lambda){{\bm{f}}_{K-1,0}^{\flat}}(\lambda)\right)\right.\\
&\qquad\qquad\times\left.\left(f_{KK}(\lambda)-\overline{{\bm{f}}_{K-1,K}^{\flat\top}}(\lambda)\bm{f}_{K-1}^{-1}(\lambda){\bm{f}}_{K-1,K}^{\flat}(\lambda)\right)\right|^{2}{\rm d}\lambda.
\end{aligned}
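The integrals defining $\mu_{1}$ and $\sigma_{1}^{2}$ can be evaluated by numerical quadrature. The sketch below is purely illustrative: the AR(1) spectral densities standing in for $f_{00}$ and $f_{11}$, and the values of $M_n$ and the kernel constants $\eta_{\omega,2}$, $\eta_{\omega,4}$, are all hypothetical placeholders.

```python
import numpy as np

def f_ar1(lam, phi, sig2=1.0):
    # Spectral density of an AR(1) process with coefficient phi
    return sig2 / (2 * np.pi * np.abs(1 - phi * np.exp(-1j * lam)) ** 2)

lam = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
dlam = lam[1] - lam[0]
f00 = f_ar1(lam, 0.5)     # hypothetical f_00
f11 = f_ar1(lam, -0.3)    # hypothetical f_11

eta2, eta4, Mn = 1.0, 1.0, 25.0   # placeholder constants (assumptions)
mu_1 = np.sqrt(Mn) * eta2 * np.sum(f11 * f00) * dlam
sigma2_1 = 4 * np.pi * eta4 * np.sum((f11 * f00) ** 2) * dlam
```

A sanity check on the quadrature: for an AR(1) process, $\int_{-\pi}^{\pi}f(\lambda)\,{\rm d}\lambda=\sigma^2/(1-\phi^2)$, the lag-zero autocovariance.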

A.4 Proof of Theorem 3.2

Eichler (2008, Theorem 5.1) yields, under the alternative $K_{0}$,

MnnTnππ|ΦK(𝒇(λ))|2dλ=op(1)as n.\displaystyle\frac{\sqrt{M_{n}}}{n}{T_{n}}-\int_{-\pi}^{\pi}\left|\Phi_{K}\left({\bm{f}}(\lambda)\right)\right|^{2}{\rm d}\lambda=o_{p}(1)\quad\text{as $n\to\infty$.}

Then, we observe, under the alternative K0K_{0},

\begin{aligned}
{\rm P}\left(\frac{T_{n}}{\hat{\sigma}_{K}}\geq z_{\alpha}\right)=\;&{\rm P}\left(\frac{\sqrt{M_{n}}}{n}\frac{T_{n}}{\hat{\sigma}_{K}}\geq\frac{\sqrt{M_{n}}}{n}z_{\alpha}\right)\\
=\;&{\rm P}\left(\frac{1}{\sigma_{K}}\int_{-\pi}^{\pi}\left|\Phi_{K}\left({\bm{f}}(\lambda)\right)\right|^{2}{\rm d}\lambda\geq 0\right)+o(1)=1+o(1)\quad\text{as $n\to\infty$,}
\end{aligned}

which shows the consistency of our test.∎
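The one-sided decision rule used in the proof, reject when $T_n/\hat{\sigma}_K\geq z_\alpha$, can be sketched as follows, with $z_\alpha$ the upper-$\alpha$ standard normal quantile (the numerical inputs are illustrative):

```python
from statistics import NormalDist

def reject(T_n, sigma_hat_K, alpha=0.05):
    """One-sided level-alpha test: reject iff T_n / sigma_hat_K >= z_alpha,
    where z_alpha is the upper-alpha quantile of the standard normal."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return T_n / sigma_hat_K >= z_alpha
```

Under the alternative, $T_n/\hat{\sigma}_K$ diverges at rate $n/\sqrt{M_n}$, so the rule rejects with probability tending to one, as shown above.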

A.5 Proof of Theorem 3.3

Since $\hat{A}_{ij}$ and $A_{ij}$, for $i$ and $j\ (\leq i)$, are functions of $\hat{\bm{f}}(\lambda)$ and ${\bm{f}}(\lambda)$, respectively, we write them as $\mathcal{A}_{ij}\left(\hat{\bm{f}}(\lambda)\right)$ and $\mathcal{A}_{ij}\left({\bm{f}}(\lambda)\right)$. Note that $\mathcal{A}_{ij}$ is holomorphic, and thus Lipschitz continuous on the closed set $\left\{{\bm{Z}}\in\mathbb{C}^{(K+1)\times(K+1)}:\left\|{\bm{Z}}-{\bm{f}}(\lambda)\right\|_{2}\leq\delta\right\}$ for some $\delta>0$. Therefore, we can show, for any $\epsilon>0$ and the Lipschitz constant $L$,

P(|a^ij(k)aij(k)|>ϵ)\displaystyle{\rm P}\left(\left|\hat{a}_{ij}(k)-a_{ij}(k)\right|>\epsilon\right)
=\displaystyle= P(|12πππ(𝒜ij(𝒇^(λ))𝒜ij(𝒇(λ)))eikλdλ|>ϵ)\displaystyle{\rm P}\left(\left|\frac{1}{2\pi}\int_{-\pi}^{\pi}\left(\mathcal{A}_{ij}\left(\hat{\bm{f}}(\lambda)\right)-\mathcal{A}_{ij}\left({\bm{f}}(\lambda)\right)\right)e^{-\mathrm{i}k\lambda}{\rm d}\lambda\right|>\epsilon\right)
$\displaystyle\leq\ {\rm P}\left(L\sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}>\epsilon,\ \sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}\leq\delta\right)$
+P(supλ[π,π]𝒇^(λ)𝒇(λ)2>δ)\displaystyle+{\rm P}\left(\sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}>\delta\right)

which, in conjunction with $\sup_{\lambda\in[-\pi,\pi]}\left\|\hat{\bm{f}}(\lambda)-{\bm{f}}(\lambda)\right\|_{2}=o_{p}(1)$ (see, e.g., Robinson, 1991, Theorem 2.1), tends to zero as $n\to\infty$. ∎
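The Fourier coefficients $a_{ij}(k)=\frac{1}{2\pi}\int_{-\pi}^{\pi}\mathcal{A}_{ij}\left({\bm{f}}(\lambda)\right)e^{-\mathrm{i}k\lambda}{\rm d}\lambda$ appearing in the proof can be recovered by quadrature. The sketch below (illustration only) uses a hypothetical function with known coefficients to show the recovery:

```python
import numpy as np

def fourier_coef(A_vals, lam, k):
    """Approximate a(k) = (1/(2*pi)) * integral of A(lam) e^{-i k lam}
    over [-pi, pi), via a Riemann sum on a uniform grid (exact for
    trigonometric polynomials of low enough degree)."""
    dlam = lam[1] - lam[0]
    return np.sum(A_vals * np.exp(-1j * k * lam)) * dlam / (2 * np.pi)

# Hypothetical function A(lam) = 0.7 e^{i lam} + 0.2 e^{2 i lam},
# whose coefficients a(1) = 0.7 and a(2) = 0.2 are known.
lam = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
A = 0.7 * np.exp(1j * lam) + 0.2 * np.exp(2j * lam)
```

On a uniform periodic grid the Riemann sum integrates complex exponentials exactly, so the coefficients are recovered to machine precision.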

Acknowledgments

We thank the Brin Mathematics Research Center for supporting the first author's (Y.G.) research stay at the University of Maryland, during which part of this research was conducted. We also thank Ms. Tong Lu for important assistance with data retrieval.

Funding

This research was supported by JSPS Grant-in-Aid for Early-Career Scientists JP23K16851 (Y.G.) and NIH grant DP1 DA048968 (S.C.).

References

  • Andrews, (1991) Andrews, D. W. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica, 59(3):817–858.
  • Aue and Van Delft, (2020) Aue, A. and Van Delft, A. (2020). Testing for stationarity of functional time series in the frequency domain. Ann. Statist., 48(5):2505–2547.
  • Baruník and Kley, (2019) Baruník, J. and Kley, T. (2019). Quantile coherency: A general measure for dependence between cyclical economic variables. Econom. J., 22(2):131–152.
  • Bowyer, (2016) Bowyer, S. M. (2016). Coherence a measure of the brain networks: past and present. Neuropsychiatr. Electrophysiol., 2(1):1–12.
  • Brillinger, (1981) Brillinger, D. R. (1981). Time Series: Data Analysis and Theory. San Francisco: Holden-Day, expanded edition.
  • Brockwell and Davis, (1991) Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. New York: Springer-Verlag.
  • Bücher et al., (2020) Bücher, A., Dette, H., and Heinrichs, F. (2020). Detecting deviations from second-order stationarity in locally stationary functional time series. Ann. Inst. Statist. Math., 72(4):1055–1094.
  • Cao et al., (2022) Cao, J., Zhao, Y., Shan, X., Wei, H.-l., Guo, Y., Chen, L., Erkoyuncu, J. A., and Sarrigiannis, P. G. (2022). Brain functional and effective connectivity based on electroencephalography recordings: A review. Hum. Brain Mapp., 43(2):860–879.
  • Culbreth et al., (2021) Culbreth, A. J., Wu, Q., Chen, S., Adhikari, B. M., Hong, L. E., Gold, J. M., and Waltz, J. A. (2021). Temporal-thalamic and cingulo-opercular connectivity in people with schizophrenia. NeuroImage Clin., 29:102531.
  • Dahlhaus, (2000) Dahlhaus, R. (2000). Graphical interaction models for multivariate time series. Metrika, 51(2):157–172.
  • Dette and Paparoditis, (2009) Dette, H. and Paparoditis, E. (2009). Bootstrapping frequency domain tests in multivariate time series with an application to comparing spectral densities. J. R. Stat. Soc., B: Stat. Methodol., 71(4):831–857.
  • Doukhan, (1994) Doukhan, P. (1994). Mixing. In Mixing: Properties and Examples. Lecture Notes in Statistics, vol 85. Springer, New York.
  • Eichler, (2008) Eichler, M. (2008). Testing nonparametric and semiparametric hypotheses in vector stationary processes. J. Multivar. Anal., 99(5):968–1009.
  • Euan et al., (2019) Euan, C., Sun, Y., and Ombao, H. (2019). Coherence-based time series clustering for statistical inference and visualization of brain connectivity. Ann. Appl. Stat., 13(2):990–1015.
  • Fan et al., (2019) Fan, F., Xiang, H., Tan, S., Yang, F., Fan, H., Guo, H., Kochunov, P., Wang, Z., Hong, L. E., and Tan, Y. (2019). Subcortical structures and cognitive dysfunction in first episode schizophrenia. Psychiatry Res. Neuroimaging, 286:69–75.
  • Fan et al., (2016) Fan, L., Li, H., Zhuo, J., Zhang, Y., Wang, J., Chen, L., Yang, Z., Chu, C., Xie, S., Laird, A. R., et al. (2016). The human brainnetome atlas: a new brain atlas based on connectional architecture. Cereb. Cortex, 26(8):3508–3526.
  • Fiecas and Ombao, (2011) Fiecas, M. and Ombao, H. (2011). The generalized shrinkage estimator for the analysis of functional connectivity of brain signals. Ann. Appl. Stat., 5(2A):1102–1125.
  • Francq and Zakoian, (2010) Francq, C. and Zakoian, J.-M. (2010). GARCH models: structure, statistical inference and financial applications. John Wiley & Sons.
  • Hannan, (1970) Hannan, E. J. (1970). Multiple Time Series, volume 38. John Wiley & Sons.
  • Jenkins and Watts, (1968) Jenkins, G. M. and Watts, D. G. (1968). Spectral analysis and its applications. Holden-Day, San Francisco.
  • Jentsch and Pauly, (2015) Jentsch, C. and Pauly, M. (2015). Testing equality of spectral densities using randomization techniques. Bernoulli, 21(2):697–739.
  • Kakizawa et al., (1998) Kakizawa, Y., Shumway, R. H., and Taniguchi, M. (1998). Discrimination and clustering for multivariate time series. J. Amer. Statist. Assoc., 93(441):328–340.
  • Katznelson, (2004) Katznelson, Y. (2004). An introduction to harmonic analysis. Cambridge University Press.
  • Kedem, (2016) Kedem, B. (2016). Coherence consideration in binary time series analysis. Handbook of Discrete-Valued Time Series, 311.
  • Kedem-Kimelfeld, (1975) Kedem-Kimelfeld, B. (1975). Estimating the lags of lag processes. J. Amer. Statist. Assoc., 70(351a):603–605.
  • Khan et al., (2014) Khan, D., Katzoff, M., and Kedem, B. (2014). Coherence structure and its application in mortality forecasting. J. Stat. Theory Pract., 8(4):578–590.
  • Kimelfeld, (1974) Kimelfeld, B. (1974). Estimating the kernels of nonlinear orthogonal polynomial functionals. Ann. Statist., 2(1):353–358.
  • Kley, (2014) Kley, T. (2014). Quantile-Based Spectral Analysis: Asymptotic Theory and Computation. PhD thesis, Ruhr-Universität Bochum.
  • Kley et al., (2016) Kley, T., Volgushev, S., Dette, H., Hallin, M., et al. (2016). Quantile spectral processes: Asymptotic analysis and inference. Bernoulli, 22(3):1770–1807.
  • Lee and Subba Rao, (2017) Lee, J. and Subba Rao, S. (2017). A note on general quadratic forms of nonstationary stochastic processes. Statistics, 51(5):949–968.
  • Lenart, (2011) Lenart, Ł. (2011). Asymptotic distributions and subsampling in spectral analysis for almost periodically correlated time series. Bernoulli, 17(1):290–319.
  • Liu et al., (2021) Liu, Y., Taniguchi, M., and Ombao, H. (2021). Statistical inference for local Granger causality. ArXiv E-prints. Available at arXiv:2103.00209.
  • Liu et al., (2018) Liu, Y., Zhang, Y., Lv, L., Wu, R., Zhao, J., and Guo, W. (2018). Abnormal neural activity as a potential biomarker for drug-naive first-episode adolescent-onset schizophrenia with coherence regional homogeneity and support vector machine analyses. Schizophr. Res., 192:408–415.
  • Matsuda, (2006) Matsuda, Y. (2006). A test statistic for graphical modelling of multivariate time series. Biometrika, 93(2):399–409.
  • Mohanty et al., (2020) Mohanty, R., Sethares, W. A., Nair, V. A., and Prabhakaran, V. (2020). Rethinking measures of functional connectivity via feature extraction. Sci. Rep., 10(1):1–17.
  • Mokkadem, (1988) Mokkadem, A. (1988). Mixing properties of ARMA processes. Stochastic Process. Appl., 29(2):309–315.
  • Müller et al., (2001) Müller, K., Lohmann, G., Bosch, V., and Von Cramon, D. Y. (2001). On multivariate spectral analysis of fMRI time series. NeuroImage, 14(2):347–356.
  • Neumann, (1996) Neumann, M. H. (1996). Spectral density estimation via nonlinear wavelet methods for stationary non-Gaussian time series. J. Time Ser. Anal., 17(6):601–633.
  • Ombao and Van Bellegem, (2008) Ombao, H. and Van Bellegem, S. (2008). Evolutionary coherence of nonstationary signals. IEEE Trans. Signal Process., 56(6):2259–2266.
  • Panaretos and Tavakoli, (2013a) Panaretos, V. M. and Tavakoli, S. (2013a). Fourier analysis of stationary time series in function space. Ann. Statist., 41(2):568–603.
  • Panaretos and Tavakoli, (2013b) Panaretos, V. M. and Tavakoli, S. (2013b). Supplement to “Fourier analysis of stationary time series in function space”.
  • Politis, (2011) Politis, D. N. (2011). Higher-order accurate, positive semidefinite estimation of large-sample covariance and spectral density matrices. Econ. Theory, 27(4):703–744.
  • Robinson, (1991) Robinson, P. M. (1991). Automatic frequency domain inference on semiparametric and nonparametric models. Econometrica, 59:1329–1363.
  • Rudzkis, (1993) Rudzkis, R. (1993). On the distribution of supremum-type functionals of nonparametric estimates of probability and spectral densities. Theory Probab. Appl., 37(2):236–249.
  • Shao, (2010) Shao, X. (2010). The dependent wild bootstrap. J. Amer. Statist. Assoc., 105(489):218–235.
  • Statulevicius and Jakimavicius, (1988) Statulevicius, V. and Jakimavicius, D. (1988). Estimates of semiinvariants and centered moments of stochastic processes with mixing. I. Lith. Math. J., 28(1):67–80.
  • Taniguchi and Kakizawa, (2000) Taniguchi, M. and Kakizawa, Y. (2000). Asymptotic Theory of Statistical Inference for Time Series. New York: Springer-Verlag.
  • Taniguchi et al., (1996) Taniguchi, M., Puri, M. L., and Kondo, M. (1996). Nonparametric approach for non-Gaussian vector stationary processes. J. Multivar. Anal., 56(2):259–283.
  • Wang et al., (2015) Wang, R., Wang, J., Yu, H., Wei, X., Yang, C., and Deng, B. (2015). Power spectral density and coherence analysis of Alzheimer’s EEG. Cogn. Neurodyn., 9(3):291–304.
  • Woodroofe and Van Ness, (1967) Woodroofe, M. B. and Van Ness, J. W. (1967). The maximum deviation of sample spectral densities. Ann. Math. Statist., 38(5):1558–1569.
  • Wu and Zaffaroni, (2018) Wu, W. B. and Zaffaroni, P. (2018). Asymptotic theory for spectral density estimates of general multivariate time series. Econ. Theory, 34(1):1–22.
  • Yajima and Matsuda, (2009) Yajima, Y. and Matsuda, Y. (2009). On nonparametric and semiparametric testing for multivariate linear time series. Ann. Statist., 37(6A):3529–3554.
  • Zhang and Kedem, (2021) Zhang, X. and Kedem, B. (2021). Extended residual coherence with a financial application. Stat. Transit., 22(2):1–14.
  • Zhao et al., (2019) Zhao, Y., Zhao, Y., Durongbhan, P., Chen, L., Liu, J., Billings, S., Zis, P., Unwin, Z. C., De Marco, M., Venneri, A., et al. (2019). Imaging of nonlinear and dynamic functional brain connectivity based on EEG recordings with the application on the diagnosis of Alzheimer’s disease. IEEE Trans. Med. Imaging, 39(5):1571–1581.