
Wavelet Based Periodic Autoregressive Moving Average Models

Rhea Davis$^{a}$ and N. Balakrishna$^{a,b}$

$^{a}$Department of Statistics, Cochin University of Science and Technology, Kochi, India; $^{b}$Department of Mathematics and Statistics, Indian Institute of Technology, Tirupati, India

CONTACT: Rhea Davis. Email: rheadavisc@gmail.com
Abstract

This paper proposes a wavelet-based method for analysing periodic autoregressive moving average (PARMA) time series. Even though Fourier analysis provides an effective method for analysing periodic time series, it requires the estimation of a large number of Fourier parameters when the PARMA parameters do not vary smoothly. The wavelet-based analysis helps us to obtain a parsimonious model with a reduced number of parameters. We have illustrated this with simulated and actual data sets.

keywords:
Fourier methods; hypothesis testing; parsimony; periodic ARMA; wavelets

1 Introduction

Time series with seasonal or periodic patterns occur in many fields of application, such as economics, finance, demography, astronomy and meteorology. The commonly used seasonal autoregressive integrated moving average (SARIMA) models may not capture some of the important local features of such time series (see [14]). An alternative approach is to model these data by periodically stationary time series. A process $\{\tilde{Y}_{t}\}$ is said to be periodically (weakly) stationary if the mean function $\mu_{t}=E(\tilde{Y}_{t})$ and the autocovariance function $\gamma_{t}(h)=\mathrm{Cov}(\tilde{Y}_{t},\tilde{Y}_{t+h})$, $h\in\mathbb{Z}$, are periodic functions with the same period $\nu$ (see [20]). A particular class of periodically stationary models is the periodic autoregressive moving average model of period $\nu$ with autoregressive order $p$ and moving average order $q$ [$\text{PARMA}_{\nu}(p,q)$], given in [6] as

$Y_{t}-\sum_{j=1}^{p}\phi_{t}(j)Y_{t-j}=\varepsilon_{t}-\sum_{j=1}^{q}\theta_{t}(j)\varepsilon_{t-j},\quad t=1,2,\dotsc,$ (1)

where (i) $Y_{t}=\tilde{Y}_{t}-\mu_{t}$, $\mu_{t}=\mu_{t+k\nu}$, $k\in\mathbb{Z}$, (ii) $\phi_{t}=\phi_{t+k\nu}$, $k\in\mathbb{Z}$, (iii) $\theta_{t}=\theta_{t+k\nu}$, $k\in\mathbb{Z}$, and (iv) $\sigma^{2}_{t}=\sigma^{2}_{t+k\nu}$, $k\in\mathbb{Z}$, where $\sigma^{2}_{t}$ is the variance of the mean-zero random sequence $\{\varepsilon_{t}\}$. Thus, $\{\delta_{t}\}=\{\sigma_{t}^{-1}\varepsilon_{t}\}$ is an i.i.d. sequence of random variables with mean 0 and variance 1. Note that PARMA reduces to ARMA when $\nu=1$. PARMA models have been applied in fields as diverse as economics ([21] and [9]), climatology ([13], [7] and [4]), signal processing [10] and hydrology ([25], [24], [6], [1] and [5]). However, PARMA models contain a large number of parameters. For example, for a monthly $\text{PARMA}_{12}(1,1)$ series, with a mean, an autoregressive coefficient, a moving average coefficient and an innovation variance for each month, the total number of parameters to be estimated is $12\times 4=48$. Often, the inclusion of a large number of parameters leads to overfitting. Hence, it is important to develop methods that result in parsimonious models with fewer parameters.

This article is organised in the following manner: In Section 2, estimators of the PARMA model parameters and several of their properties are reviewed. The Fourier-PARMA model is presented in Section 3. Section 4 gives a brief introduction to wavelet analysis, with a special focus on the discrete wavelet transform. In Section 5, our proposed wavelet-PARMA model is expounded. Section 6 contains a simulation study of the proposed wavelet-PARMA model. The applicability of the proposed model is demonstrated using real data in Section 7. Section 8 contains the conclusion of our study. We have utilised results from various articles and, for easy reference, we have included them in the Appendices.

2 Estimation of PARMA parameters

It is assumed hereafter that model (1) admits a causal representation (see [4])

$Y_{t}=\sum_{j=0}^{\infty}\psi_{t}(j)\varepsilon_{t-j},$

where $\psi_{t}(0)=1$ and $\sum_{j=0}^{\infty}|\psi_{t}(j)|<\infty$ for all $t$. Note that $\psi_{t}(j)=\psi_{t+k\nu}(j)$ for all $j$. Also, $Y_{t}=\tilde{Y}_{t}-\mu_{t}$ and $\varepsilon_{t}=\sigma_{t}\delta_{t}$, where $\{\delta_{t}\}$ is i.i.d.

Given data $\tilde{Y}_{0},\tilde{Y}_{1},\dotsc,\tilde{Y}_{N\nu-1}$, where $N$ is the number of cycles, the vector $\boldsymbol{\mu}=\{\mu_{0},\dotsc,\mu_{\nu-1}\}^{\prime}$ is estimated using

$\hat{\mu}_{i}=\frac{1}{N}\sum_{j=0}^{N-1}\tilde{Y}_{j\nu+i},\quad i=0,1,\dotsc,\nu-1.$ (2)

The sample autocovariance for lag $m$ and season $i$ is estimated using

$\hat{\gamma}_{i}(m)=\frac{1}{N}\sum_{j=0}^{N-1}(\tilde{Y}_{j\nu+i}-\hat{\mu}_{i})(\tilde{Y}_{j\nu+i+m}-\hat{\mu}_{i+m}),\quad i=0,1,\dotsc,\nu-1,$ (3)

where $\hat{\mu}_{i}$ is estimated using (2) and, whenever the index $j\nu+i+m$ exceeds $N\nu-1$, the corresponding term is set to zero. Let

$\hat{Y}_{i+n}=\sum_{j=1}^{n}\theta_{n,j}^{(i)}(Y_{i+n-j}-\hat{Y}_{i+n-j}),\quad n\geq 1,$

denote the one-step predictor of $Y_{i+n}$ that minimises the mean square error $v_{n,i}=E(Y_{i+n}-\hat{Y}_{i+n})^{2}$. The estimates of $\theta_{n,j}^{(i)}$ and $v_{n,i}$ can be obtained by substituting the estimates of the autocovariance function (3) in the recursive relations of the innovations algorithm for periodically stationary processes given in Theorem A1. Then, Theorems A2 and A3 show that $\hat{\theta}_{n,j}^{(\langle i-n\rangle)}$ and $\hat{v}_{n,\langle i-n\rangle}$ are consistent estimators of $\psi_{i}(j)$, for all $j$, and of $\sigma_{i}^{2}$, respectively. Here, $\langle b\rangle=b-\nu\lfloor b/\nu\rfloor$ for $b=0,1,\dotsc$ and $\langle b\rangle=\nu+b-\nu\lfloor b/\nu+1\rfloor$ for $b=-1,-2,\dotsc$, where $\lfloor\cdot\rfloor$ is the greatest integer function.
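To make the recursions concrete, here is a minimal Python sketch of the estimation steps (2)-(3) and (41)-(42); the helper names sample_acvf and innovations are ours, and the code is an illustration under the stated conventions rather than the authors' implementation. Note that $\langle b\rangle$ is simply b % nu in Python.

```python
import numpy as np

def sample_acvf(Y, nu):
    """Seasonal sample means (2) and autocovariances (3).
    Returns mu_hat and a function gamma(i, m); the periodic convention
    gamma_{i+nu}(m) = gamma_i(m) is handled via i % nu."""
    Y = np.asarray(Y, dtype=float)
    N = len(Y) // nu
    Y = Y[:N * nu]
    mu = Y.reshape(N, nu).mean(axis=0)
    Yc = (Y.reshape(N, nu) - mu).ravel()
    def gamma(i, m):
        total = 0.0
        for j in range(N):
            t = j * nu + (i % nu)
            if t + m <= N * nu - 1:   # terms running past the sample are set to zero
                total += Yc[t] * Yc[t + m]
        return total / N
    return mu, gamma

def innovations(gamma, nu, n_iter):
    """Innovations algorithm recursions (41)-(42) for each season i.
    Returns theta[i][n][j] = theta_{n,j}^{(i)} and v[i][n] = v_{n,i}."""
    theta = [[None] + [np.zeros(n + 1) for n in range(1, n_iter + 1)]
             for _ in range(nu)]
    v = [np.zeros(n_iter + 1) for _ in range(nu)]
    for i in range(nu):
        v[i][0] = gamma(i, 0)
        for n in range(1, n_iter + 1):
            for k in range(n):
                acc = gamma(i + k, n - k)
                for j in range(k):
                    acc -= theta[i][k][k - j] * theta[i][n][n - j] * v[i][j]
                theta[i][n][n - k] = acc / v[i][k]
            v[i][n] = gamma(i + n, 0) - sum(theta[i][n][n - j] ** 2 * v[i][j]
                                            for j in range(n))
    return theta, v
```

Theorems A2 and A3 then suggest reading off the estimates as $\hat{\psi}_{i}(j)=$ theta[(i - n) % nu][n][j] and $\hat{\sigma}_{i}^{2}=$ v[(i - n) % nu][n].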

To demonstrate the proposed ideas, we focus on two classes of PARMA models, described in the following subsections.

2.1 Periodic autoregressive model of order 1 (PAR(1))

Consider the $\text{PAR}_{\nu}(1)$ model given by

$Y_{t}=\phi_{t}Y_{t-1}+\varepsilon_{t},$ (4)

where (i) $Y_{t}=\tilde{Y}_{t}-\mu_{t}$, $\mu_{t}=\mu_{t+k\nu}$, $k\in\mathbb{Z}$, (ii) $\phi_{t}=\phi_{t+k\nu}$, $k\in\mathbb{Z}$, and (iii) $\sigma_{t}^{2}=\sigma_{t+k\nu}^{2}$, $k\in\mathbb{Z}$, where $\sigma_{t}^{2}$ is the variance of the mean-zero normal random sequence $\{\varepsilon_{t}\}$. For model (4), Theorem A4 tells us that $\{\psi_{0}(1),\psi_{1}(1),\dotsc,\psi_{\nu-1}(1)\}^{\prime}=\{\phi_{0},\phi_{1},\dotsc,\phi_{\nu-1}\}^{\prime}$. Thus, $\hat{\boldsymbol{\phi}}=\{\hat{\phi}_{0},\hat{\phi}_{1},\dotsc,\hat{\phi}_{\nu-1}\}^{\prime}$ can be obtained from the consistent estimates of $\psi_{i}(1)$, $i=0,\dotsc,\nu-1$. Moreover, Theorem A5 tells us that the estimator of $\boldsymbol{\phi}$ is asymptotically normal. Also, $\hat{\boldsymbol{\mu}}$ and $\hat{\boldsymbol{\sigma}}^{2}=\{\hat{v}_{n,\langle 0-n\rangle},\hat{v}_{n,\langle 1-n\rangle},\dotsc,\hat{v}_{n,\langle\nu-1-n\rangle}\}^{\prime}$ are asymptotically normal, as stated by Theorems A6 and A7, respectively.
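Continuing the hypothetical helpers sketched above, the PAR(1) estimates can be read off as follows (par1_estimates is our illustrative name):

```python
import numpy as np

def par1_estimates(theta, v, nu, n):
    """PAR(1): phi_i = psi_i(1) (Theorem A4), estimated by theta_{n,1}^{(<i-n>)}
    (Theorem A2); sigma_i^2 is estimated by v_{n,<i-n>} (Theorem A3)."""
    phi_hat = np.array([theta[(i - n) % nu][n][1] for i in range(nu)])
    sigma2_hat = np.array([v[(i - n) % nu][n] for i in range(nu)])
    return phi_hat, sigma2_hat
```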

2.2 PARMA(1,1)

Consider the $\text{PARMA}_{\nu}(1,1)$ model given by

$\tilde{Y}_{t}=\phi_{t}\tilde{Y}_{t-1}+\varepsilon_{t}-\theta_{t}\varepsilon_{t-1},$ (5)

where (i) $\phi_{t}=\phi_{t+k\nu}$, $k\in\mathbb{Z}$, (ii) $\theta_{t}=\theta_{t+k\nu}$, $k\in\mathbb{Z}$, and $\{\varepsilon_{t}\}$ is a sequence of normal random variables with mean 0 and variance 1. For model (5), Theorem A8 tells us that

$\phi_{t}=\frac{\psi_{t}(2)}{\psi_{t-1}(1)},$ (6)
$\theta_{t}=\phi_{t}-\psi_{t}(1).$ (7)

Thus, $\hat{\phi}_{t}$ and $\hat{\theta}_{t}$ are obtained by replacing each $\psi_{t}(\cdot)$ with its consistent estimate. The estimator of $\boldsymbol{\phi}$ is asymptotically normal, as given in Theorem A9. Also, Theorem A10 tells us that $\hat{\boldsymbol{\theta}}=\{\hat{\theta}_{0},\hat{\theta}_{1},\dotsc,\hat{\theta}_{\nu-1}\}^{\prime}$ is asymptotically normal.
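Similarly, a sketch of the PARMA(1,1) estimates via (6)-(7), again building on the hypothetical innovations output above:

```python
import numpy as np

def parma11_estimates(theta, nu, n):
    """PARMA(1,1): phi_t = psi_t(2)/psi_{t-1}(1) and theta_t = phi_t - psi_t(1),
    with each psi replaced by its consistent estimate (Theorem A2)."""
    psi1 = np.array([theta[(i - n) % nu][n][1] for i in range(nu)])
    psi2 = np.array([theta[(i - n) % nu][n][2] for i in range(nu)])
    phi_hat = psi2 / np.roll(psi1, 1)   # np.roll aligns psi_{t-1}(1) with season t
    return phi_hat, phi_hat - psi1
```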

After estimating the PARMA parameters, our objective is to reduce the number of parameters so that the final model still captures the dependency structure adequately. Since the parameter functions are periodic, it is natural to consider Fourier techniques to meet this objective.

3 Fourier-PARMA model

Often, representing a function in terms of basis functions yields new insights about the function that are otherwise obscure. The choice of the set of basis functions depends on the kind of information we are interested in. The analysis of periodic functions usually involves the identification of prominent frequencies present in the signal. For this, the function is represented as an infinite series of sinusoids, known as the Fourier series, which is possible for almost all functions encountered in practice. The magnitude of the coefficients associated with the sines and cosines indicates the strength of the corresponding frequency component present in the process.

From [4], a periodic vector $\mathbf{X}=[X_{t}:0\leq t\leq\nu-1]$ has the representation

$X_{t}=c_{0}+\sum_{r=1}^{\ell}\left\{c_{r}\cos\left(\frac{2\pi rt}{\nu}\right)+s_{r}\sin\left(\frac{2\pi rt}{\nu}\right)\right\},$ (8)

where crc_{r} and srs_{r} are Fourier coefficients and

$\ell=\begin{cases}\nu/2,&\text{if }\nu\text{ even}\\(\nu-1)/2,&\text{if }\nu\text{ odd}\end{cases}.$

For convenience, (8) is written as,

$X_{t}=c_{0}+\sum_{r=1}^{\ell}\left\{c_{r}\cos^{*}(r)+s_{r}\sin^{*}(r)\right\},$ (9)

where $\cos^{*}(r)=\cos\left(\frac{2\pi rt}{\nu}\right)$ and $\sin^{*}(r)=\sin\left(\frac{2\pi rt}{\nu}\right)$.

Let $\textbf{f}=\begin{cases}[c_{0},c_{1},s_{1},\dotsc,c_{(\nu-1)/2},s_{(\nu-1)/2}]^{\prime}&(\nu\ \text{odd})\\ [c_{0},c_{1},s_{1},\dotsc,s_{\nu/2-1},c_{\nu/2}]^{\prime}&(\nu\ \text{even})\end{cases}.$

Then, we have, from [4],

$\textbf{f}=\mathcal{L}\mathcal{P}\mathcal{U}\mathbf{X},$ (10)

where $\mathcal{L},\mathcal{P}$ and $\mathcal{U}$ are given in equations (60), (62) and (61) of Theorem A11, respectively; $\mathcal{U}$ and $\mathcal{P}$ are unitary matrices. Using (10), we have $\hat{\textbf{f}}=\mathcal{L}\mathcal{P}\mathcal{U}\hat{\mathbf{X}}$. Note that $\textbf{X}$ can be retrieved from $\textbf{f}$ by

$\textbf{X}=\tilde{\mathcal{U}}^{\prime}\tilde{\mathcal{P}}^{\prime}\mathcal{L}^{-1}\textbf{f},$

where $\tilde{\mathcal{A}}^{\prime}$ denotes the conjugate transpose of a matrix $\mathcal{A}$. Thus, $\textbf{X}$ and $\textbf{f}$ are representations of the same entity. Hence, to obtain a parsimonious PARMA model it is enough to represent the model using a reduced number of Fourier coefficients. Since PARMA parameters are often observed to vary smoothly over time, [6], [5], [23] and [4] identified significant Fourier coefficients via a hypothesis test based on the asymptotic distributions of the PARMA estimators. The significant Fourier coefficients are retained as they are, while the insignificant coefficients are set to zero, resulting in a parsimonious Fourier-PARMA model.
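For intuition, the coefficients in (8) can also be computed directly from the defining sums, which is equivalent to applying (10); the following is a sketch (fourier_coeffs and fourier_recon are our illustrative names; for even $\nu$ the Nyquist cosine carries the factor $1/\nu$):

```python
import numpy as np

def fourier_coeffs(x):
    """Fourier coefficients c_r, s_r of a periodic vector, matching (8)."""
    nu = len(x)
    t = np.arange(nu)
    ell = nu // 2 if nu % 2 == 0 else (nu - 1) // 2
    c, s = np.zeros(ell + 1), np.zeros(ell + 1)
    c[0] = x.mean()
    for r in range(1, ell + 1):
        scale = 1.0 / nu if (nu % 2 == 0 and r == nu // 2) else 2.0 / nu
        c[r] = scale * np.sum(x * np.cos(2 * np.pi * r * t / nu))
        s[r] = scale * np.sum(x * np.sin(2 * np.pi * r * t / nu))
    return c, s

def fourier_recon(c, s, nu):
    """Reconstruct X_t from the representation (8)."""
    t = np.arange(nu)
    x = np.full(nu, c[0])
    for r in range(1, len(c)):
        x += c[r] * np.cos(2 * np.pi * r * t / nu) \
             + s[r] * np.sin(2 * np.pi * r * t / nu)
    return x
```

For any real vector x, running c, s = fourier_coeffs(x) followed by fourier_recon(c, s, len(x)) recovers x, mirroring the invertibility of (10).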

Although Fourier techniques have found some success, they often fail when bursts and other transient events take place, which is common in fields like economics and climatology. This is because, in the presence of such transient events, all the Fourier coefficients are unduly affected, as every observation is involved in the computation of each coefficient.

4 Wavelet Analysis

Wavelet techniques are an excellent alternative to Fourier methods. Some applications of wavelets in statistics include the estimation of functions such as densities, regression functions and spectra, the analysis of long memory processes, data compression, decorrelation, and the detection of structural breaks. A "wavelet" means a small wave (see [22], [18] and [11]). This essentially means that a wavelet oscillates like a wave but lasts only for a short duration, unlike a sine wave, which extends infinitely on both sides of the time axis. Wavelets are constructed so that they possess certain desirable properties. Many different wavelet functions and transforms are available in the literature. However, here we consider only the orthogonal discrete wavelet transform (DWT), since approximation based on an orthogonal set is best in terms of mean square error, as indicated by the projection theorem (see [8]).

Fourier methods suffice when the frequencies of the signal do not change over time. However, Fourier techniques fail when different frequencies occur at different parts of the time axis. Hence, it is desirable to find techniques that have both time resolution and frequency resolution. But, by Heisenberg's uncertainty principle, it is impossible to identify time and frequency simultaneously with arbitrary precision (see [15] and [26]), as there is a lower bound on the error incurred. Wavelets bypass this difficulty by using short time windows to capture high frequencies, so that we have good time resolution, and wider time windows for low frequencies, so that we have good frequency resolution. Thus, wavelets can pick out characteristics that are local in time and frequency. Wavelets are functions of scale instead of frequency; a scale can be loosely interpreted as a quantity inversely proportional to frequency. This makes the wavelet transform an exceptional tool for studying non-stationary signals. For recent work on the applications of wavelets to time series, one can refer to [16], [27], and [17].

Let $\mathbf{X}$ be a vector of size $N=2^{J}$, $J\in\mathbb{Z}^{+}$, and let $\mathcal{W}$ be the $N\times N$ orthogonal matrix used for performing the discrete wavelet transform (DWT). Then the DWT of $\mathbf{X}$ is given by

$\mathbf{W}=\mathcal{W}\mathbf{X}=\begin{bmatrix}V_{J}\\ \mathbf{W}_{1}\\ \vdots\\ \mathbf{W}_{J}\end{bmatrix}=\begin{bmatrix}W_{0}\\ W_{1}\\ \vdots\\ W_{N-1}\end{bmatrix}$ (11)

Here, $V_{J}=W_{0}$ is the scaling coefficient, which is proportional to the average of all the data, i.e.,

$V_{J}=W_{0}=k_{1}\cdot\frac{1}{N}\sum_{i=1}^{N}X_{i},$ (12)

where the proportionality constant $k_{1}$ depends on the particular wavelet employed. $\mathbf{W}_{j}$, $j\in\{1,2,\dotsc,J\}$, contains $2^{j-1}$ wavelet coefficients associated with changes on a scale $N/2^{j-1}$. For example, the Haar DWT of $\mathbf{X}=[X_{0},\dotsc,X_{15}]^{\prime}$ is given by

$\begin{bmatrix}\frac{1}{4}(X_{15}+\dotsb+X_{0})\\ \frac{1}{4}(-X_{15}-\dotsb-X_{8}+X_{7}+\dotsb+X_{0})\\ \frac{1}{\sqrt{8}}(-X_{7}-\dotsb-X_{4}+X_{3}+\dotsb+X_{0})\\ \frac{1}{\sqrt{8}}(-X_{15}-\dotsb-X_{12}+X_{11}+\dotsb+X_{8})\\ \frac{1}{2}(-X_{3}-X_{2}+X_{1}+X_{0})\\ \vdots\\ \frac{1}{2}(-X_{15}-X_{14}+X_{13}+X_{12})\\ \frac{1}{\sqrt{2}}(-X_{1}+X_{0})\\ \vdots\\ \frac{1}{\sqrt{2}}(-X_{15}+X_{14})\end{bmatrix}.$ (13)

Note that the wavelet coefficients are localised in time; for example, the last wavelet coefficient involves only the observations $X_{14}$ and $X_{15}$. Since $\mathcal{W}$ is an orthogonal matrix, the original vector $\mathbf{X}$ can easily be reconstructed via the inverse DWT

$\mathbf{X}=\mathcal{W}^{T}\mathbf{W},$ (14)

where $\mathcal{W}^{T}$ denotes the transpose of $\mathcal{W}$. From (14), $\textbf{X}$ and $\textbf{W}$ are representations of the same quantity, and hence it is enough to represent the parameters of the PARMA model in terms of the discrete wavelet transform coefficients.
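The Haar case of (11)-(14) is easy to code explicitly; the following minimal sketch (haar_dwt and haar_idwt are our names) follows the sign convention of (13) and assumes the input length is a power of 2:

```python
import numpy as np

def haar_dwt(x):
    """Orthonormal Haar DWT arranged as in (11): [V_J, W_1, ..., W_J],
    where W_j holds 2^(j-1) coefficients (coarsest detail first)."""
    a, details = np.asarray(x, dtype=float), []
    while len(a) > 1:
        details.append((a[::2] - a[1::2]) / np.sqrt(2))  # e.g. (X_0 - X_1)/sqrt(2)
        a = (a[::2] + a[1::2]) / np.sqrt(2)
    return np.concatenate([a] + details[::-1])

def haar_idwt(w):
    """Inverse DWT (14): reconstruct X from [V_J, W_1, ..., W_J]."""
    w = np.asarray(w, dtype=float)
    a, pos = w[:1], 1
    while pos < len(w):
        d = w[pos:pos + len(a)]
        x = np.empty(2 * len(a))
        x[::2], x[1::2] = (a + d) / np.sqrt(2), (a - d) / np.sqrt(2)
        a, pos = x, pos + len(d)
    return a
```

Since the transform is linear, the orthogonal matrix $\mathcal{W}$ itself can be assembled column by column, e.g. W = np.column_stack([haar_dwt(e) for e in np.eye(N)]).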

Another interesting advantage of the DWT, especially pertaining to our objective of achieving parsimonious PARMA models, is that it redistributes the energy or variability contained in a sequence (see [26]). The essential information, spread throughout the original sequence, is thus concentrated in a few wavelet coefficients (see [19] and [26]). We therefore propose a wavelet-PARMA model that captures the dependency structure adequately by retaining only a few significant wavelet coefficients. We identify the significant wavelet coefficients by developing a hypothesis test, following the parallel approach given in [4] for Fourier coefficients and utilising the asymptotic distributions of the estimators of the PARMA parameters.

5 Wavelet-PARMA model

Suppose a random vector $\hat{\mathbf{X}}$ satisfies

$N^{1/2}(\hat{\mathbf{X}}-\mathbf{X})\Rightarrow\mathcal{N}(0,\Sigma).$ (15)

Using Theorem A12, we have,

$N^{1/2}(\hat{\mathbf{W}}-\mathbf{W})\Rightarrow\mathcal{N}(0,R_{\textbf{W}}),$ (16)
where $R_{\textbf{W}}=\mathcal{W}\Sigma\mathcal{W}^{T}.$ (17)

The main idea is to identify the statistically significant wavelet coefficients of the PARMA estimator vectors using an appropriate test procedure, so that the remaining coefficients can be nullified, thereby obtaining a parsimonious PARMA model. The null hypothesis of the test is $H_{0}:\mathbf{W}=(V_{J},0,\dotsc,0)^{\prime}=(W_{0},0,\dotsc,0)^{\prime}$. Under $H_{0}$, every element of $\mathbf{X}$ equals $k_{2}V_{J}=k_{2}k_{1}\frac{1}{N}\sum_{i=1}^{N}X_{i}=k\frac{1}{N}\sum_{i=1}^{N}X_{i}$, where $k$ depends on the type of wavelet used. Since each parameter vector has all its elements equal under $H_{0}$, the PARMA process reduces to a stationary ARMA process with $\phi_{t}=\phi$ for all $t\in\{0,1,\dotsc,\nu-1\}$, and similarly for the other parameter vectors. Thus, the null hypothesis states that the model is stationary. From (16), we have

$N^{1/2}(\hat{W}_{i}-W_{i})\Rightarrow\mathcal{N}(0,[R_{\textbf{W}}]_{ii}),\quad i=1,2,\dotsc,\nu-1.$ (18)

Here, we use Bonferroni's test procedure (see [12]) with $\alpha=0.05$; that is, we test the null hypothesis $H_{0}:W_{i}=0$ versus $H_{a}:W_{i}\neq 0$ for each $i=1,2,\dotsc,\nu-1$. The test statistic is

$Z_{W_{i}}=\frac{\hat{W}_{i}}{\sqrt{[R_{\textbf{W}}]_{i,i}/N}},\quad i=1,2,\dotsc,\nu-1.$ (19)

Let $\alpha^{*}=\frac{\alpha}{\nu-1}$ and let $Z_{\alpha^{*}/2}$ be the upper $\alpha^{*}/2$ quantile, i.e., $P(Z>Z_{\alpha^{*}/2})=\alpha^{*}/2$, where $Z\sim\mathcal{N}(0,1)$. We reject the null hypothesis when $|Z_{W_{i}}|>Z_{\alpha^{*}/2}$. Whenever the null hypothesis is not rejected, the corresponding element $W_{i}$ is set to zero. Ideally, this results in a sparse $\mathbf{W}$ vector in which most of the coefficients are 0, and hence a parsimonious wavelet-PARMA model.
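Under the stated assumptions, the whole testing step (16)-(19) fits in a few lines; this is a sketch (wavelet_parma_test is our name; W_matrix is the orthogonal DWT matrix, e.g. the Haar matrix assembled in Section 4):

```python
import numpy as np
from scipy.stats import norm

def wavelet_parma_test(x_hat, Sigma_hat, N, W_matrix, alpha=0.05):
    """Bonferroni test on the DWT coefficients of a PARMA estimator vector.
    The scaling coefficient W_0 = V_J is always retained; insignificant
    coefficients are set to zero."""
    nu = len(x_hat)
    w_hat = W_matrix @ x_hat                      # DWT of the estimator, as in (16)
    R = W_matrix @ Sigma_hat @ W_matrix.T         # R_W, equation (17)
    z = w_hat[1:] / np.sqrt(np.diag(R)[1:] / N)   # Z-scores, equation (19)
    crit = norm.ppf(1 - (alpha / (nu - 1)) / 2)   # Z_{alpha*/2}
    w_sparse = np.concatenate(([w_hat[0]],
                               np.where(np.abs(z) > crit, w_hat[1:], 0.0)))
    return w_sparse, W_matrix.T @ w_sparse, z     # sparse W and implied parameters (14)
```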

5.1 PAR(1) under $H_{0}$

For the model considered in (4), under the null hypothesis, $\gamma(h)=\nu^{-1}\sum_{m=0}^{\nu-1}\gamma_{m}(h)$ and $\phi(k)=\nu^{-1}\sum_{m=0}^{\nu-1}\phi_{m}(k)$. In this case, the asymptotic variance-covariance matrix of $\hat{\boldsymbol{\mu}}$, $\Sigma_{\boldsymbol{\mu}}$, becomes (see [4])

$(\Sigma_{\boldsymbol{\mu}})_{ii}=\gamma(0)\left[\frac{1+r}{1-r}\right],\quad 0\leq i\leq\nu-1,$ (20)

where $r=\phi^{\nu}$ and, for $j>i$,

$(\Sigma_{\boldsymbol{\mu}})_{ij}=\gamma(0)\left[\frac{\phi^{j-i}+\phi^{\nu+i-j}}{1-r}\right],\quad 0\leq i,j\leq\nu-1.$ (21)

Since $\Sigma_{\boldsymbol{\mu}}$ is a symmetric matrix, the elements $(\Sigma_{\boldsymbol{\mu}})_{ij}$ for $i>j$ follow immediately. For hypothesis testing, the elements of $\Sigma_{\boldsymbol{\mu}}$ are replaced with their estimates.

Under $H_{0}$, the asymptotic variance-covariance matrix of $\hat{\boldsymbol{\sigma}}^{2}$, $\Sigma_{\boldsymbol{\sigma}^{2}}$, is given by (see [4])

$(\Sigma_{\boldsymbol{\sigma}^{2}})_{ii}=2\gamma(0)^{2}\left[\frac{1+\phi^{2\nu}}{1-r^{2}}\right]-2\phi\times 2\gamma(0)^{2}\left[\frac{\phi+\phi^{2\nu-1}}{1-r^{2}}\right]$ (22)
$\qquad\qquad+\phi^{2}\gamma(0)^{2}\left[\phi^{2}+\frac{1+3r^{2}}{1-r^{2}}\right],\quad 0\leq i\leq\nu-1,$ (23)

and, if $j>i$, we have

$(\Sigma_{\boldsymbol{\sigma}^{2}})_{ij}=2\gamma(0)^{2}\left[\frac{\phi^{2(j-i)}+\phi^{2(\nu+i-j)}}{1-r^{2}}\right]$ (24)
$\qquad-2\gamma(0)^{2}\phi\left[\frac{\phi^{2j-2i-1}+\phi^{2\nu-2j+2i+1}}{1-r^{2}}\right]-2\gamma(0)^{2}\phi\left[\frac{\phi^{2j-2i+1}+\phi^{2\nu-2j+2i-1}}{1-r^{2}}\right]$
$\qquad+2\gamma(0)^{2}\phi^{2}\left[\frac{\phi^{2j-2i}+\phi^{2\nu-2j+2i}}{1-r^{2}}\right],\quad 0\leq i,j\leq\nu-1.$ (25)

Since $\Sigma_{\boldsymbol{\sigma}^{2}}$ is a symmetric matrix, we also have the elements $s_{i,j}$, $i>j$. For hypothesis testing, the elements of $\Sigma_{\boldsymbol{\sigma}^{2}}$ are replaced with their estimates.

Under $H_{0}$, $\sigma_{i}^{2}=\sigma^{2}$ for all $i$, and so from Theorem A5 the asymptotic variance-covariance matrix of $\hat{\boldsymbol{\phi}}$, $\Sigma_{\boldsymbol{\phi}}$, is the identity matrix $\mathcal{I}$ of order $\nu$, i.e., $\Sigma_{\boldsymbol{\phi}}=\mathcal{I}$.

5.2 PARMA(1,1) under $H_{0}$

For the model considered in (5), under the null hypothesis, $\psi(j)=\nu^{-1}\sum_{t=0}^{\nu-1}\psi_{t}(j)$, $j=1,2$, $\phi=\nu^{-1}\sum_{t=0}^{\nu-1}\phi_{t}$ and $\theta=\nu^{-1}\sum_{t=0}^{\nu-1}\theta_{t}$. Under $H_{0}$, the asymptotic variance-covariance matrix of $\hat{\boldsymbol{\phi}}$, $\mathcal{Q}$, given in Theorem A9, reduces to

$\mathcal{Q}=\sum_{\ell,k=1}^{2}\mathcal{H}_{\ell}\mathcal{V}_{\ell k}\mathcal{H}_{k}^{\prime},$ (26)

where

$\mathcal{V}_{11}=\mathcal{I},$ (27)
$\mathcal{V}_{22}=[\psi(1)^{2}+1]\mathcal{I},$ (28)
$\mathcal{V}_{12}=\psi(1)\Pi,$ (29)
$\mathcal{V}_{21}=\psi(1)\Pi^{\prime},$ (30)
$\mathcal{H}_{1}=-\mathcal{F}_{2}\Pi^{-1}\mathcal{F}_{1}^{-2},$ (31)
$\mathcal{H}_{2}=\Pi^{-1}\mathcal{F}_{1}^{-1}\Pi,$ (32)
$\mathcal{F}_{j}=\psi(j)\mathcal{I},$ (33)
$\boldsymbol{\psi}(j)=\{\psi_{0}(j),\psi_{1}(j),\dotsc,\psi_{\nu-1}(j)\}^{\prime},$ (34)
and
$\Pi=\begin{bmatrix}0&1&0&0&\dotsc&0\\ 0&0&1&0&\dotsc&0\\ \vdots&\vdots&\vdots&\vdots&&\vdots\\ 0&0&0&0&\dotsc&1\\ 1&0&0&0&\dotsc&0\end{bmatrix}.$ (35)

Also, the asymptotic variance-covariance matrix of $\hat{\boldsymbol{\theta}}$, $\mathcal{S}$, given in Theorem A10, has the following form under $H_{0}$:

$\mathcal{S}=\sum_{\ell,k=1}^{2}\mathcal{M}_{\ell}\mathcal{V}_{\ell k}\mathcal{M}_{k}^{\prime},$ (36)

where $\mathcal{M}_{1}=-\mathcal{I}-\mathcal{F}_{2}\Pi^{-1}\mathcal{F}_{1}^{-2}$, $\mathcal{M}_{2}=\Pi^{-1}\mathcal{F}_{1}^{-1}\Pi$, $\mathcal{V}_{\ell k}$, $1\leq\ell,k\leq 2$, are given in equations (27) to (30), $\mathcal{F}_{j}$ is given in (33) and $\Pi$ is given in (35).

If $\nu$ is not a power of 2, each of the estimator vectors is extended periodically to the nearest power of 2, say $\nu^{\prime}$. The corresponding asymptotic variance-covariance matrices are also extended periodically, resulting in matrices of order $\nu^{\prime}$.
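A sketch of this periodic extension (the helper name is ours):

```python
import numpy as np

def periodic_extend(x_hat, Sigma_hat):
    """Extend an estimator vector of length nu and its asymptotic covariance
    matrix periodically to the next power of 2 (e.g. nu = 12 -> nu' = 16)."""
    nu = len(x_hat)
    nu2 = 1 << (nu - 1).bit_length()   # smallest power of 2 that is >= nu
    idx = np.arange(nu2) % nu          # periodic index pattern 0,1,...,nu-1,0,1,...
    return x_hat[idx], Sigma_hat[np.ix_(idx, idx)]
```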

6 Simulation of wavelet-PARMA

The observations are simulated from the $\text{PARMA}_{12}(1,1)$ model

$\tilde{Y}_{t}=\phi_{t}\tilde{Y}_{t-1}+\varepsilon_{t}-\theta_{t}\varepsilon_{t-1},$ (37)

where (i) $\phi_{t}=\phi_{t+12k}$, $k\in\mathbb{Z}$, (ii) $\theta_{t}=\theta_{t+12k}$, $k\in\mathbb{Z}$, and $\{\varepsilon_{t}\}$ is a sequence of normal random variables with mean 0 and variance 1. The parameter values and their corresponding estimates are given in Table 1.

Table 1: True values of the PARMA parameters and their estimates.

Season | $\phi$ (True) | $\phi$ (Estimate) | $\theta$ (True) | $\theta$ (Estimate)
0 | 0.67 | 0.59 | 0.20 | 0.06
1 | 0.70 | 0.73 | 0.23 | 0.29
2 | 0.69 | 0.81 | 0.22 | 0.30
3 | 0.68 | 0.77 | 0.21 | 0.24
4 | 0.67 | 0.43 | 1.43 | 1.22
5 | 0.68 | 0.72 | 1.44 | 1.46
6 | 0.69 | 0.62 | 0.46 | 0.40
7 | 0.68 | 1.04 | 0.47 | 0.81
8 | 1.83 | 1.86 | 0.23 | 0.28
9 | 1.84 | 1.83 | 0.24 | 0.25
10 | 0.53 | 0.52 | 0.21 | 0.21
11 | 0.52 | 0.68 | 0.23 | 0.37

Since the period 12 is not a power of 2, each of the vectors has been extended periodically to the nearest power of 2, i.e., 16. We have taken the number of cycles, $N$, to be 500.
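A minimal sketch of how such observations can be generated from (37), using the true values in Table 1 (the burn-in length and the convention that the season is $t \bmod 12$ are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
nu, N, burn = 12, 500, 200
phi = np.array([0.67, 0.70, 0.69, 0.68, 0.67, 0.68,
                0.69, 0.68, 1.83, 1.84, 0.53, 0.52])
theta = np.array([0.20, 0.23, 0.22, 0.21, 1.43, 1.44,
                  0.46, 0.47, 0.23, 0.24, 0.21, 0.23])

T = N * nu + burn
eps = rng.standard_normal(T)           # iid N(0, 1) innovations
Y = np.zeros(T)
for t in range(1, T):
    s = t % nu                         # season of time t
    Y[t] = phi[s] * Y[t - 1] + eps[t] - theta[s] * eps[t - 1]
Y = Y[burn:]                           # drop the burn-in; N*nu observations remain
```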

The plot of the generated observations is given in Figure 1.

Figure 1: Plot of simulated observations.

Seven iterations of the innovations algorithm were carried out to calculate the parameter estimates. The residuals are computed using the expression

$\hat{\varepsilon}_{t}=\tilde{Y}_{t}-\hat{\phi}_{t}\tilde{Y}_{t-1}+\hat{\theta}_{t}\hat{\varepsilon}_{t-1},$ (38)

where the initial value $\hat{\varepsilon}_{0}$ is set to zero. The residual diagnostic plots are given in Figure 2.
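The recursion (38) can be coded directly (a sketch, with the same seasonal indexing as above):

```python
import numpy as np

def parma11_residuals(Y, phi_hat, theta_hat):
    """Residuals via (38); the initial value eps_hat_0 is set to zero."""
    nu = len(phi_hat)
    eps_hat = np.zeros(len(Y))
    for t in range(1, len(Y)):
        s = t % nu
        eps_hat[t] = Y[t] - phi_hat[s] * Y[t - 1] + theta_hat[s] * eps_hat[t - 1]
    return eps_hat
```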

Figure 2: PARMA(1,1) residual diagnostics. (a) ACF plot of PARMA(1,1) residuals; (b) histogram of PARMA(1,1) residuals superimposed with the normal PDF.

The Box-Pierce test results are given in Table 2.

Table 2: Box-Pierce test for PARMA residuals.

Lags | 20 | 30 | 40 | 50 | 80 | 100
p-value | 0.8844 | 0.8232 | 0.5431 | 0.706 | 0.8313 | 0.8898

The normality of the residuals was tested using the Kolmogorov-Smirnov test, and it was found that the residuals are normally distributed (p-value = 0.8586). Thus, the residuals are uncorrelated and normally distributed.

The Fourier-PARMA model obtained using the method outlined in [4] is given by

$\phi_{t}=0.88-0.375\sin^{*}(1)-0.27\cos^{*}(2)+0.30\sin^{*}(2)-0.195\sin^{*}(4),$
$\theta_{t}=0.49-0.37\cos^{*}(1)+0.195\sin^{*}(1)-0.22\sin^{*}(2)-0.25\cos^{*}(4).$

The ACF plot of residuals of the Fourier-PARMA model is given in Figure 3.

Figure 3: ACF plot of residuals of the Fourier-PARMA model.

The results of the Box-Pierce test are given in Table 3.

Table 3: Box-Pierce test for the Fourier-PARMA model.

Lags | 20 | 30 | 40 | 50 | 80 | 100
p-value | $1.753\times 10^{-5}$ | $7\times 10^{-5}$ | $4.34\times 10^{-5}$ | 0.0003 | 0.00299 | 0.0084

Thus, the Fourier-PARMA model is not able to capture the dependencies among the observations adequately, as is evident from the ACF plot in Figure 3 and the Box-Pierce test results in Table 3.

Table 4: DWT analysis of the $\phi$ estimates and $\theta$ estimates of $\text{PARMA}_{12}(1,1)$. Statistically significant DWT coefficients ($|Z|>2.95$) are denoted by *. The coefficients retained in the wavelet-PARMA model are the scaling coefficient (index 0) together with the significant coefficients.

Index | $\phi$ DWT coefficient | Z-score | $\theta$ DWT coefficient | Z-score
0 | 3.37 | - | 1.69 | -
1 | -0.52* | -5.30 | 0.69* | 6.12
2 | 0.03 | 0.21 | -1.06* | -6.60
3 | 0.71* | 5.09 | 0.08 | 0.48
4 | -0.13 | -0.94 | -0.10 | -0.63
5 | -0.26 | -1.85 | 0.73* | 4.80
6 | 1.25* | 8.97 | -0.02 | -0.16
7 | -0.13 | -0.94 | -0.10 | -0.63
8 | -0.10 | -0.72 | -0.16 | -1.21
9 | 0.03 | 0.21 | 0.05 | 0.36
10 | -0.21 | -1.49 | -0.17 | -1.27
11 | -0.30 | -2.16 | -0.29 | -2.15
12 | 0.02 | 0.15 | 0.02 | 0.14
13 | -0.11 | -0.80 | -0.11 | -0.85
14 | -0.10 | -0.72 | -0.16 | -1.21
15 | 0.03 | 0.21 | 0.05 | 0.36

The discrete wavelet transform analysis was done using the Haar wavelet, chosen for the simulation because its structure is explicitly known. The results of the Haar DWT analysis are given in Table 4. It is found that only 4 DWT coefficients of $\boldsymbol{\phi}$ and 4 DWT coefficients of $\boldsymbol{\theta}$ need to be incorporated into the wavelet-PARMA model, so the final model contains only 8 parameters, as opposed to 24 parameters in the original PARMA model. The ACF plot of the residuals of the wavelet-PARMA model is given in Figure 4.

Figure 4: ACF plot of wavelet-PARMA residuals.

The results of the Box-Pierce test are given in Table 5.

Table 5: Box-Pierce test for the wavelet-PARMA model.

Lags | 20 | 30 | 40 | 50 | 80 | 100
p-value | 0.0889 | 0.1274 | 0.0931 | 0.1923 | 0.4974 | 0.6524

The residuals of the wavelet-PARMA model are uncorrelated. Hence, the wavelet-PARMA model with just 8 parameters is able to capture the dependency structure adequately.

7 Data Analysis

The UK Meteorological (MET) Office, established in 1854, is the government body commissioned to analyse the weather in the United Kingdom. It uses its findings for forecasting and for issuing warnings, and offers data from 36 stations in the UK. For our analysis, we consider the total sunshine duration in a month, recorded in hours, at the Ballypatrick weather station for the years 1966-1990. The data are available from the Kaggle site https://www.kaggle.com/datasets/josephw20/uk-met-office-weather-data/code, where the total sunshine duration is given under the variable sun. The data are fitted using the $\text{PAR}_{12}(1)$ model given in equation (4). Here, the period is $\nu=12$ and the number of years (cycles) is $N=25$. The plot of the first 10 years is given in Figure 5. It was found that 2 iterations of the innovations algorithm were enough to capture the model dynamics. The residuals of the PAR model were computed using

$\hat{\delta}_{t}=\frac{\hat{Y}_{t}-\hat{\phi}_{t}\hat{Y}_{t-1}}{\hat{\sigma}_{t}},$ (39)

where $\hat{Y}_{t}=\tilde{Y}_{t}-\hat{\mu}_{t}$. The PAR model diagnostic plots are given in Figure 6. The results of the Box-Pierce test are given in Table 6. The normality of the residuals was tested using the Kolmogorov-Smirnov test, and it was found that the residuals are normally distributed (p-value = 0.319). Thus, the residuals are uncorrelated and normally distributed.

Figure 5: Plot of total sunshine duration in a month recorded by the Ballypatrick station from 1966 to 1990.

Figure 6: PAR(1) residual diagnostics. (a) ACF plot of PAR(1) residuals; (b) histogram of PAR(1) residuals superimposed with the normal PDF.
Table 6: Box-Pierce test for PAR residuals.

Lags | 20 | 30
p-value | 0.0528 | 0.0601

The Fourier-PAR model obtained using the method outlined in [4] is given by

$\mu_{t}=106.45-60.67\cos^{*}(1)+35.29\sin^{*}(1)-12.02\cos^{*}(2)+7.75\cos^{*}(3)+6.88\sin^{*}(4),$
$\phi_{t}=0.09,$
$\sigma^{2}_{t}=713.78-729.27\cos^{*}(1)+395.27\sin^{*}(1).$

The ACF plot of Fourier-PAR residuals is given in Figure 7.

Figure 7: ACF plot of Fourier-PAR residuals.

The results of the Box-Pierce test of the Fourier-PAR residuals are given in Table 7.

Table 7: Box-Pierce test of Fourier-PAR residuals.

Lags | 20 | 30
p-value | $1.3\times 10^{-5}$ | $6.76\times 10^{-8}$

Thus, it is clear that the Fourier-PAR model has not been able to capture the dependency structure adequately.

The discrete wavelet transform analysis was done using the Daubechies least asymmetric wavelet with 7 vanishing moments, LA(7). This wavelet was chosen because the resulting wavelet-PAR model could capture the dependency structure adequately. The LA(7) DWT results are given in Table 8. The ACF plot of wavelet-PAR residuals is given in Figure 8, and the Box-Pierce test of wavelet-PAR residuals is given in Table 9. From these results, it is clear that the resulting wavelet-PAR model is able to capture the dependency structure adequately. It is found that only 7 DWT coefficients of $\boldsymbol{\mu}$, 1 DWT coefficient of $\boldsymbol{\phi}$, and 5 DWT coefficients of $\boldsymbol{\sigma}^{2}$ need to be incorporated into the wavelet-PAR model, so the final model contains only 13 parameters, as opposed to 36 parameters in the original PAR model.

Table 8: DWT analysis of the $\mu$ estimates, $\phi$ estimates and $\sigma^{2}$ estimates of $\text{PAR}_{12}(1)$. Statistically significant DWT coefficients ($|Z|>2.95$) are denoted by *. The coefficients retained in the wavelet-PAR model are the scaling coefficient (index 0) together with the significant coefficients.

Index | $\mu$ DWT coefficient | Z-score | $\phi$ DWT coefficient | Z-score | $\sigma^{2}$ DWT coefficient | Z-score
0 | 408.88 | - | 0.39 | - | 2608.33 | -
1 | 112.69* | 16.91 | -0.19 | -0.95 | 1393.30* | 5.47
2 | 80.35* | 16.96 | -0.02 | -0.12 | 848.10* | 4.60
3 | -56.61* | -10.52 | 0.07 | 0.35 | -603.82 | -2.93
4 | 6.75 | 1.15 | -0.24 | -1.19 | 278.60 | 1.19
5 | -69.44* | -10.87 | -0.37 | -1.87 | -639.87 | -2.49
6 | 80.40* | 14.36 | -0.46 | -2.32 | 898.89* | 3.99
7 | -7.24 | -1.24 | 0.05 | 0.26 | -469.41 | -2.01
8 | -2.07 | -0.38 | -0.10 | -0.48 | 202.58 | 0.85
9 | -0.53 | -0.09 | -0.02 | -0.11 | -69.71 | -0.29
10 | 4.87 | 0.84 | -0.12 | -0.62 | 147.40 | 0.59
11 | -75.58* | -13.48 | 0.17 | 0.87 | -737.15* | -3.17
12 | 4.84 | 0.93 | -0.16 | -0.79 | 143.75 | 0.64
13 | -2.82 | -0.51 | 0.21 | 1.03 | 0.37 | 0.00
14 | 1.22 | 0.22 | -0.15 | -0.74 | 308.65 | 1.30
15 | -10.45 | -1.90 | 0.05 | 0.25 | -681.49 | -2.87

Figure 8: ACF plot of wavelet-PAR residuals.
Table 9: Box-Pierce test for wavelet-PAR residuals.

Lags | 20 | 30
p-value | 0.3159 | 0.4438

8 Conclusion

PARMA models are used extensively in various fields such as economics, climatology, signal processing and hydrology. However, the presence of a substantial number of parameters in a PARMA model reduces its efficiency. Fourier methods already exist in the literature for reducing the number of parameters and have found some success when the parameters vary slowly. However, transient events are quite common in real-life applications, and wavelet techniques stand out as the principal analysis tool in these situations. The efficiency of wavelet methods in reducing the number of parameters of PARMA models has been demonstrated using a simulation study, and their applicability in real life has been illustrated using real data. This study opens new avenues for the reduction of parameters in other classes of models containing a large number of parameters.

Acknowledgement

Rhea Davis acknowledges the financial support of the University Grants Commission (UGC), India under the Savitribai Jyotirao Phule Fellowship for Single Girl Child (SJSGC) scheme.

Disclosure statement

The authors report no conflict of interest.

References

  • [1] P.L. Anderson and M.M. Meerschaert, Modeling river flows with heavy tails, Water Resources Research 34 (1998), pp. 2271–2280.
  • [2] P.L. Anderson and M.M. Meerschaert, Parameter estimation for periodically stationary time series, Journal of Time Series Analysis 26 (2005), pp. 489–518.
  • [3] P.L. Anderson, M.M. Meerschaert, and A.V. Vecchia, Innovations algorithm for periodically stationary time series, Stochastic Processes and their Applications 83 (1999), pp. 149–169.
  • [4] P.L. Anderson, F. Sabzikar, and M.M. Meerschaert, Parsimonious time series modeling for high frequency climate data, Journal of Time Series Analysis 42 (2021), pp. 442–470.
  • [5] P.L. Anderson, Y.G. Tesfaye, and M.M. Meerschaert, Fourier-PARMA models and their application to river flows, Journal of Hydrologic Engineering 12 (2007), pp. 462–472.
  • [6] P. Anderson and A. Vecchia, Asymptotic results for periodic autoregressive moving-average processes, Journal of Time Series Analysis 14 (1993), pp. 1–18.
  • [7] P. Bloomfield, H.L. Hurd, and R.B. Lund, Periodic correlation in stratospheric ozone data, Journal of Time Series Analysis 15 (1994), pp. 127–150.
  • [8] P.J. Brockwell and R.A. Davis, Time Series: Theory and Methods, Springer Science & Business Media, 1991.
  • [9] P.H. Franses and R. Paap, Periodic Time Series Models, OUP Oxford, 2004.
  • [10] W. Gardner and L. Franks, Characterization of cyclostationary random signal processes, IEEE Transactions on Information Theory 21 (1975), pp. 4–14.
  • [11] W. Härdle, G. Kerkyacharian, D. Picard, and A. Tsybakov, Wavelets, Approximation, and Statistical Applications, Vol. 129, Springer Science & Business Media, 2012.
  • [12] D.C. Howell, Statistical Methods for Psychology, Cengage Learning, 2012.
  • [13] R.H. Jones and W.M. Brelsford, Time series with periodic structure, Biometrika 54 (1967), pp. 403–408.
  • [14] R. Lund and I. Basawa, Recursive prediction and likelihood evaluation for periodic ARMA models, Journal of Time Series Analysis 21 (2000), pp. 75–93.
  • [15] S. Mallat, A Wavelet Tour of Signal Processing, Elsevier, 1999.
  • [16] E.T. McGonigle, R. Killick, and M.A. Nunes, Trend locally stationary wavelet processes, Journal of Time Series Analysis 43 (2022), pp. 895–917.
  • [17] H.A. Mohammadi, S. Ghofrani, and A. Nikseresht, Using empirical wavelet transform and high-order fuzzy cognitive maps for time series forecasting, Applied Soft Computing 135 (2023), p. 109990.
  • [18] G.P. Nason, Wavelet Methods in Statistics with R, Springer, 2008.
  • [19] R.T. Ogden, Essential Wavelets for Statistical Applications and Data Analysis, Springer, 1997.
  • [20] M. Pagano, On periodic and multiple autoregressions, The Annals of Statistics 6 (1978), pp. 1310–1317.
  • [21] E. Parzen and M. Pagano, An approach to modeling seasonally stationary time series, Journal of Econometrics 9 (1979), pp. 137–153.
  • [22] D.B. Percival and A.T. Walden, Wavelet Methods for Time Series Analysis, Cambridge University Press, 2000.
  • [23] Y.G. Tesfaye, P.L. Anderson, and M.M. Meerschaert, Asymptotic results for Fourier-PARMA time series, Journal of Time Series Analysis 32 (2011), pp. 157–174.
  • [24] A. Vecchia, Maximum likelihood estimation for periodic autoregressive moving average models, Technometrics 27 (1985), pp. 375–384.
  • [25] A. Vecchia, Periodic autoregressive-moving average (PARMA) modeling with applications to water resources, JAWRA Journal of the American Water Resources Association 21 (1985), pp. 721–730.
  • [26] B. Vidakovic, Statistical Modeling by Wavelets, John Wiley & Sons, 1999.
  • [27] J. Wang, S. Shao, Y. Bai, J. Deng, and Y. Lin, Multiscale wavelet graph autoencoder for multivariate time-series anomaly detection, IEEE Transactions on Instrumentation and Measurement 72 (2022), pp. 1–11.

9 Appendices

In this section, we state the results used to analyse our models, which were proved under the following assumptions:

Let $\{Y_{t}\}$ be a process defined by (1). Then,

B1. Finite fourth moment: $E\varepsilon_{t}^{4}=\eta<\infty$.

B2. The model admits a causal representation
$Y_{t}=\sum_{j=0}^{\infty}\psi_{t}(j)\varepsilon_{t-j},$
where $\psi_{t}(0)=1$ and $\sum_{j=0}^{\infty}|\psi_{t}(j)|<\infty$ for all $t$. Note that $\psi_{t}(j)=\psi_{t+k\nu}(j)$ for all $j$. Also, $Y_{t}=\tilde{Y}_{t}-\mu_{t}$ and $\varepsilon_{t}=\sigma_{t}\delta_{t}$, where $\{\delta_{t}\}$ is i.i.d.

B3. The model satisfies an invertibility condition
$\varepsilon_{t}=\sum_{j=0}^{\infty}\pi_{t}(j)Y_{t-j},$
where $\pi_{t}(0)=1$ and $\sum_{j=0}^{\infty}|\pi_{t}(j)|<\infty$ for all $t$. Again, $\pi_{t}(j)=\pi_{t+k\nu}(j)$ for all $j$.

B4. The spectral density matrix $f(\lambda)$ of the equivalent vector ARMA process given by
$U_{t}=\sum_{j=-\infty}^{\infty}\Psi_{j}Z_{t-j},$ (40)
where $U_{t}=(Y_{t\nu},\dotsc,Y_{(t+1)\nu-1})^{\prime}$, $Z_{t}=(\varepsilon_{t\nu},\dotsc,\varepsilon_{(t+1)\nu-1})^{\prime}$ and $\Psi_{t}$ is the $\nu\times\nu$ matrix with $ij$ entry $\psi_{i}(t\nu+i-j)$, satisfies, for some $0<c\leq C<\infty$,
$cz^{\prime}z\leq z^{\prime}f(\lambda)z\leq Cz^{\prime}z,\quad-\pi\leq\lambda\leq\pi,$
for all $z$ in $\mathbb{R}^{\nu}$.

B5. The number of iterations $n$ of the innovations algorithm satisfies $n\leq N\nu-1$, and $n^{2}/N\rightarrow 0$ as $N\rightarrow\infty$ and $n\rightarrow\infty$.

B6. The number of iterations $n$ of the innovations algorithm satisfies $n\leq N\nu-1$, and $n^{3}/N\rightarrow 0$ as $N\rightarrow\infty$ and $n\rightarrow\infty$.

Theorem A1 (Innovations algorithm for periodically stationary processes, see [3]).

If $\{Y_{t}\}$ has zero mean and $E(Y_{\ell}Y_{m})=\gamma_{\ell}(m-\ell)$, where the matrix $\Gamma_{n,i}=[\gamma_{i+n-1-\ell}(\ell-m)]_{\ell,m=0,\dotsc,n-1}$, $i=0,\dotsc,\nu-1$, is nonsingular for each $n\geq 1$, then the one-step predictors $\hat{Y}_{i+n}$, $n\geq 0$, and their mean-square errors $v_{n,i}$, $n\geq 1$, are given by

$\hat{Y}_{i+n}=\begin{cases}0&\text{if }n=0\\ \sum_{j=1}^{n}\theta_{n,j}^{(i)}(Y_{i+n-j}-\hat{Y}_{i+n-j})&\text{if }n\geq 1\end{cases},$

and, for $k=0,1,\dotsc,n-1$,

$v_{0,i}=\gamma_{i}(0),$

$\theta_{n,n-k}^{(i)}=(v_{k,i})^{-1}\left[\gamma_{i+k}(n-k)-\sum_{j=0}^{k-1}\theta_{k,k-j}^{(i)}\theta_{n,n-j}^{(i)}v_{j,i}\right],$ (41)
$v_{n,i}=\gamma_{i+n}(0)-\sum_{j=0}^{n-1}\left(\theta_{n,n-j}^{(i)}\right)^{2}v_{j,i},$ (42)

where (41) is solved recursively in the order $v_{0,i};\ \theta_{1,1}^{(i)},v_{1,i};\ \theta_{2,2}^{(i)},\theta_{2,1}^{(i)},v_{2,i};\ \theta_{3,3}^{(i)},\theta_{3,2}^{(i)},\theta_{3,1}^{(i)},v_{3,i};\ \dotsc$
The estimates $\hat{\theta}_{n,j}^{(i)}$ and $\hat{v}_{n,i}$ of $\theta_{n,j}^{(i)}$ and $v_{n,i}$, respectively, can be obtained by substituting the estimates of the autocovariance function (3).

Theorem A2 (see [3]).

Let $\tilde{Y}_{t}$ be a process as defined in (1). Under assumptions B1-B5, we have

$\hat{\theta}_{n,j}^{(\langle i-n\rangle)}\xrightarrow{p}\psi_{i}(j),\quad\text{as }n\rightarrow\infty,\ \text{for all }j,\ i=0,1,\dotsc,\nu-1.$ (43)

Here, $\langle b\rangle=b-\nu\lfloor b/\nu\rfloor$ for $b=0,1,\dotsc$ and $\langle b\rangle=\nu+b-\nu\lfloor b/\nu+1\rfloor$ for $b=-1,-2,\dotsc$, where $\lfloor\cdot\rfloor$ is the greatest integer function.

Theorem A3 (see [3]).

Let $\tilde{Y}_{t}$ be a process as defined in (1). Under assumptions B1-B5, we have

$\hat{v}_{n,\langle i-n\rangle}\xrightarrow{p}\sigma_{i}^{2},\quad\text{as }n\rightarrow\infty,\ i=0,1,\dotsc,\nu-1.$ (44)

Here, $\langle b\rangle=b-\nu\lfloor b/\nu\rfloor$ for $b=0,1,\dotsc$ and $\langle b\rangle=\nu+b-\nu\lfloor b/\nu+1\rfloor$ for $b=-1,-2,\dotsc$, where $\lfloor\cdot\rfloor$ is the greatest integer function.

Theorem A4 (see [23]).

Let $\{Y_{t}\}$ be a $\text{PAR}_{\nu}(1)$ process as given in (4) having causal representation B2. Then, $\{\phi_{0},\phi_{1},\dotsc,\phi_{\nu-1}\}^{\prime}=\{\psi_{0}(1),\psi_{1}(1),\dotsc,\psi_{\nu-1}(1)\}^{\prime}$.

Theorem A5 (see [2], [23]).

Let $\{Y_{t}\}$ be a $\text{PAR}_{\nu}(1)$ process as given in (4). Under assumptions B1-B4 and B6, and using Theorem A4, we have

$N^{1/2}(\hat{\phi}_{i}-\phi_{i})\Rightarrow\mathcal{N}(0,\sigma_{i-1}^{-2}\sigma_{i}^{2}),\quad i=0,1,\dotsc,\nu-1.$ (45)
Theorem A6 (see [4]).

Let $\tilde{Y}_{t}$ be a process as defined in (1) having causal representation B2. Then,

$N^{1/2}(\hat{\boldsymbol{\mu}}-\boldsymbol{\mu})\Rightarrow\mathcal{N}(0,\Sigma_{\boldsymbol{\mu}}),$ (46)

where $\Sigma_{\boldsymbol{\mu}}=\sum_{n=-\infty}^{\infty}B_{n}\Pi^{n}$, $B_{n}=\text{diag}(\gamma_{0}(n),\gamma_{1}(n),\dotsc,\gamma_{\nu-1}(n))$, and $\Pi$ is as given in (35).

Theorem A7 (see [4]).

Let $\tilde{Y}_{t}$ be a process as defined in (1). Under assumptions B1-B5, and if

$N^{3/4}\sum_{l>n}|\pi_{i}(l)|\rightarrow 0\ \text{as }N\rightarrow\infty\ \text{and }n\rightarrow\infty,$ (47)

we have

$(\hat{v}_{n,\langle 0-n\rangle},\hat{v}_{n,\langle 1-n\rangle},\dotsc,\hat{v}_{n,\langle\nu-1-n\rangle})^{\prime}\ \text{is AN}\left[(\sigma_{0}^{2},\sigma_{1}^{2},\dotsc,\sigma_{\nu-1}^{2})^{\prime},\ N^{-1}\big(s_{i,j}\big)_{i,j=0,\dotsc,\nu-1}\right],$ (48)

where $s_{i,j}$ is given by

$s_{i,j}=\sum_{m_{1}=0}^{\infty}\sum_{m_{2}=0}^{\infty}\pi_{i}(m_{1})\pi_{j}(m_{2})(W_{m_{1}m_{2}})_{i-m_{1},j-m_{2}},$

and $(W_{m_{1}m_{2}})_{i,j}$ is given by

$(W_{m_{1}m_{2}})_{i,j}=(\eta-3)\sum_{j_{1}=-\infty}^{\infty}\psi_{i}(j_{1})\psi_{i+m_{1}}(j_{1}+m_{1})\sum_{n=-\infty}^{\infty}\psi_{j}(j_{1}+n\nu+j-i)\psi_{j+m_{2}}(j_{1}+n\nu+j-i+m_{2})\,\sigma_{i-j_{1}}^{4}$
$\qquad+\sum_{n=-\infty}^{\infty}\left[\gamma_{i}(n\nu+j-i)\gamma_{i+m_{1}}(n\nu+j-i-m_{1}+m_{2})+\gamma_{i}(n\nu+j-i+m_{2})\gamma_{i+m_{1}}(n\nu+j-i-m_{1})\right],$

for all $0\leq i,j\leq\nu-1$, with $\eta=E(\delta_{t}^{4})$ and $\{\delta_{t}\}=\{\sigma_{t}^{-1}\varepsilon_{t}\}$.

Theorem A8 (see [23]).

Let $\tilde{Y}_{t}$ be the PARMA(1,1) process as defined in (5) having causal representation B2. Then,

$\phi_{t}=\frac{\psi_{t}(2)}{\psi_{t-1}(1)},$ (49)
$\theta_{t}=\phi_{t}-\psi_{t}(1).$ (50)
Theorem A9 (see [23]).

Let $\tilde{Y}_{t}$ be the PARMA(1,1) process as defined in (5). Under assumptions B1-B4 and B6, we have

$N^{1/2}(\hat{\boldsymbol{\phi}}-\boldsymbol{\phi})\Rightarrow\mathcal{N}(0,\mathcal{Q}),$ (51)

where $\hat{\boldsymbol{\phi}}=[\hat{\phi}_{0},\dotsc,\hat{\phi}_{\nu-1}]^{\prime}$, $\boldsymbol{\phi}=[\phi_{0},\dotsc,\phi_{\nu-1}]^{\prime}$ and the $\nu\times\nu$ matrix $\mathcal{Q}$ is defined by

$\mathcal{Q}=\sum_{\ell,k=1}^{2}\mathcal{H}_{\ell}\mathcal{V}_{\ell k}\mathcal{H}_{k}^{\prime},$ (52)

where

$\mathcal{V}_{\ell k}=\sum_{j=1}^{x}(\mathcal{F}_{\ell-j}\Pi^{-(\ell-j)})\mathcal{B}_{j}(\mathcal{F}_{k-j}\Pi^{-(k-j)})^{\prime},\quad x=\min(\ell,k),$ (53)
$\mathcal{F}_{j}=\text{diag}\{\psi_{0}(j),\psi_{1}(j),\dotsc,\psi_{\nu-1}(j)\},$ (54)
$\mathcal{B}_{j}=\text{diag}\{\sigma_{0}^{2}\sigma_{0-j}^{2},\sigma_{1}^{2}\sigma_{1-j}^{2},\dotsc,\sigma_{\nu-1}^{2}\sigma_{\nu-1-j}^{2}\},$ (55)

$\mathcal{H}_{1}=-\mathcal{F}_{2}\Pi^{-1}\mathcal{F}_{1}^{-2}$, $\mathcal{H}_{2}=\Pi^{-1}\mathcal{F}_{1}^{-1}\Pi$ and $\Pi$ is as defined in (35).

Theorem A10 (see [23]).

Let $\tilde{Y}_{t}$ be the PARMA(1,1) process as defined in (5). Under assumptions B1-B4 and B6, we have

$N^{1/2}(\hat{\boldsymbol{\theta}}-\boldsymbol{\theta})\Rightarrow\mathcal{N}(0,\mathcal{S}),$ (56)

where $\hat{\boldsymbol{\theta}}=[\hat{\theta}_{0},\dotsc,\hat{\theta}_{\nu-1}]^{\prime}$, $\boldsymbol{\theta}=[\theta_{0},\dotsc,\theta_{\nu-1}]^{\prime}$ and the $\nu\times\nu$ matrix $\mathcal{S}$ is defined by

$\mathcal{S}=\sum_{\ell,k=1}^{2}\mathcal{M}_{\ell}\mathcal{V}_{\ell k}\mathcal{M}_{k}^{\prime},$ (57)

where $\mathcal{M}_{1}=-\mathcal{I}-\mathcal{F}_{2}\Pi^{-1}\mathcal{F}_{1}^{-2}$, $\mathcal{M}_{2}=\Pi^{-1}\mathcal{F}_{1}^{-1}\Pi$, $\mathcal{V}_{\ell k}$ is given in (53) and $\Pi$ is as given in (35). Here, $\mathcal{I}$ is the $\nu\times\nu$ identity matrix.

Theorem A11 (see [4]).

For a given vector $\mathbf{X}$, the periodic function $\mathbf{X}=[X_{t}:0\leq t\leq\nu-1]$ has the Fourier representation

$X_{t}=c_{0}+\sum_{r=1}^{\ell}\left\{c_{r}\cos\left(\frac{2\pi rt}{\nu}\right)+s_{r}\sin\left(\frac{2\pi rt}{\nu}\right)\right\},$ (58)

where $c_{r}$ and $s_{r}$ are Fourier coefficients and

$\ell=\begin{cases}\nu/2&\text{if }\nu\text{ even}\\(\nu-1)/2&\text{if }\nu\text{ odd}\end{cases}.$

Let $\textbf{f}=\begin{cases}[c_{0},c_{1},s_{1},\dotsc,c_{(\nu-1)/2},s_{(\nu-1)/2}]^{\prime}&(\nu\ \text{odd})\\ [c_{0},c_{1},s_{1},\dotsc,s_{\nu/2-1},c_{\nu/2}]^{\prime}&(\nu\ \text{even})\end{cases}.$

Then,
$\textbf{f}=\mathcal{L}\mathcal{P}\mathcal{U}\mathbf{X},$ (59)

where $\mathcal{L}$ and $\mathcal{U}$ are respectively given by

$\mathcal{L}=\begin{cases}\text{diag}(\nu^{-1/2},\sqrt{2/\nu},\dotsc,\sqrt{2/\nu})&(\nu\ \text{odd})\\ \text{diag}(\nu^{-1/2},\sqrt{2/\nu},\dotsc,\sqrt{2/\nu},\nu^{-1/2})&(\nu\ \text{even})\end{cases},$ (60)

and

$\mathcal{U}=\nu^{-1/2}\left(e^{-i2\pi rt/\nu}\right).$ (61)

The $(\ell,j)$th element of $\mathcal{P}$ is given by

$[\mathcal{P}]_{\ell j}=\begin{cases}1&\text{if }\ell=j=0;\\ 2^{-1/2}&\text{if }\ell=2r-1\text{ and }j=r\text{ for some }1\leq r\leq\lfloor(\nu-1)/2\rfloor;\\ 2^{-1/2}&\text{if }\ell=2r-1\text{ and }j=\nu-r\text{ for some }1\leq r\leq\lfloor(\nu-1)/2\rfloor;\\ i2^{-1/2}&\text{if }\ell=2r\text{ and }j=r\text{ for some }1\leq r\leq\lfloor(\nu-1)/2\rfloor;\\ -i2^{-1/2}&\text{if }\ell=2r\text{ and }j=\nu-r\text{ for some }1\leq r\leq\lfloor(\nu-1)/2\rfloor;\\ 1&\text{if }\ell=\nu-1\text{ and }j=\nu/2\text{ and }\nu\text{ is even; and}\\ 0&\text{otherwise.}\end{cases}$ (62)
Theorem A12 (see [8]).

If $\mathbf{X}_{n}$ is a $k\times 1$ vector satisfying

$\mathbf{X}_{n}\Rightarrow\mathcal{N}(\boldsymbol{\mu}_{n},\Sigma_{n}),$ (63)

and $\mathcal{B}$ is any non-zero $m\times k$ matrix such that the matrices $\mathcal{B}\Sigma_{n}\mathcal{B}^{\prime}$ have no zero diagonal elements, then

$\mathcal{B}\mathbf{X}_{n}\Rightarrow\mathcal{N}(\mathcal{B}\boldsymbol{\mu}_{n},\mathcal{B}\Sigma_{n}\mathcal{B}^{\prime}).$ (64)