
Quantiled conditional variance, skewness, and kurtosis by Cornish-Fisher expansion

Ningning Zhang and Ke Zhu
University of Hong Kong
Abstract

The conditional variance, skewness, and kurtosis play a central role in time series analysis. These three conditional moments (CMs) are often studied via parametric models, but such models suffer from two big issues: the risk of model mis-specification and the instability of model estimation. To avoid these two issues, this paper proposes a novel method to estimate the three CMs by the so-called quantiled CMs (QCMs). The QCM method first adopts the idea of the Cornish-Fisher expansion to construct a linear regression model based on n different estimated conditional quantiles. Next, it computes the QCMs simply and simultaneously via the ordinary least squares estimator of this regression model, without any prior estimation of the conditional mean. Under certain conditions, the QCMs are shown to be consistent with the convergence rate n^{-1/2}. Simulation studies indicate that the QCMs perform well under different scenarios of Cornish-Fisher expansion errors and quantile estimation errors. In the application, the study of QCMs for three exchange rates demonstrates the effectiveness of financial rescue plans during the COVID-19 pandemic outbreak, and suggests that the existing “news impact curve” functions for the conditional skewness and kurtosis may not be suitable.

Keywords: Conditional moments; Cornish-Fisher expansion; News impact curve; Quantile time series estimation; Quantiled conditional moments.

Address correspondence to Ke Zhu: Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong. E-mail: mazhuke@hku.hk

1 Introduction

Learning the conditional variance, skewness, and kurtosis of a univariate time series is the core issue in many financial and economic applications. The classical tools to study the conditional variance are the generalized autoregressive conditional heteroskedasticity (GARCH) model and its variants; see Engle (1982), Bollerslev (1986), Francq and Zakoïan (2019), and references therein. However, except for some theoretical works on parameter estimation in Escanciano (2009) and Francq and Thieu (2019), the GARCH-type models commonly assume independent and identically distributed (i.i.d.) innovations, resulting in constant conditional skewness and kurtosis. As argued by Samuelson (1970) and Rubinstein (1973), higher moments like skewness and kurtosis are nonnegligible, since they are not only exemplary evidence of non-normal returns but also relevant to the investor’s optimal decision. Along this line, a large body of literature has demonstrated the importance of conditional skewness and kurtosis in portfolio selection (Chunhachinda et al., 1997), asset pricing (Harvey and Siddique, 2000), risk management (Bali et al., 2008), return predictability (Jondeau et al., 2019), and many other areas. These empirical successes indicate the necessity of learning the dynamic structures of the conditional skewness and kurtosis simultaneously with the conditional variance.

Although there is a large number of studies on the conditional variance, only a few of them have taken account of the conditional skewness and kurtosis. The pioneering works towards this goal are Hansen (1994) and Harvey and Siddique (1999), followed by Jondeau and Rockinger (2003), Brooks et al. (2005), León et al. (2005), Grigoletto and Lisi (2009), and León and Ñíguez (2020). All of these works assume a particular conditional distribution on the innovations of a GARCH-type model, where the conditional skewness or kurtosis either directly has an analogous GARCH-type dynamic structure rooted in rescaled shocks or indirectly depends on the dynamic structure of distribution parameters. See also Francq and Sucarrat (2022) and Sucarrat and Grønneberg (2022) for a different investigation of conditional skewness and kurtosis via a GARCH-type model with a time-varying probability of zero returns. However, the aforementioned parametric methods have two major shortcomings: first, they inevitably run the risk of using wrongly specified parametric models or innovation distributions; second, they usually produce unstable model estimation results in the presence of dynamic structures of skewness and kurtosis.

This paper proposes a novel method to simultaneously learn the conditional variance, skewness, and kurtosis by the so-called quantiled conditional variance, skewness, and kurtosis, respectively. Our three quantiled conditional moments (QCMs) are formed in the spirit of the Cornish-Fisher expansion (Cornish and Fisher, 1938), which exhibits a fundamental relationship between the conditional quantiles and the CMs. By replacing the unknown conditional quantiles with their estimators at n quantile levels, the QCMs (with respect to variance, skewness, and kurtosis) are simply computed at each fixed timepoint by using the ordinary least squares (OLS) estimator of a linear regression model, which stems naturally from the Cornish-Fisher expansion. Surprisingly, our way to compute the QCMs does not require any estimator of the conditional mean. The precision of the QCMs is controlled by the error of the proposed linear regression model, which comprises two components: first, the expansion error encompasses higher-order conditional moments in the Cornish-Fisher expansion that are not taken as regressors in the linear regression model; second, the approximation error arises from the use of estimated conditional quantiles (ECQs). Under certain conditions on the regression model error, we show that the QCMs are consistent estimators of the corresponding CMs with the convergence rate n^{-1/2}.
Simulation studies reveal that, when considering various scenarios of approximation errors caused by biased ECQs from the use of contaminated or mis-specified conditional quantile models, (i) the quantiled conditional variance and skewness exhibit robust and satisfactory performance, regardless of the non-negligibility of the expansion error; (ii) the quantiled conditional kurtosis has a larger dispersion for heavier-tailed data with non-negligible expansion error, which is unavoidable due to the lower accuracy of the Cornish-Fisher expansion for heavier-tailed distributions (Lee and Lin, 1992).

In the application, we study the QCMs of the return series for three exchange rates. During the COVID-19 pandemic outbreak in March 2020, we find that the values of quantiled conditional variance and kurtosis increased rapidly, and the values of quantiled conditional skewness decreased sharply, before March 19 or 20 in all of the examined exchange rates, shedding light on the worldwide perilous financial crisis at that time. After March 19 or 20, we find that the values of quantiled conditional variance, skewness, and kurtosis exhibited totally opposite trends, demonstrating the effectiveness of financial rescue plans issued by governments. Moreover, since the existing parametric forms of “news impact curve” (NIC) functions for the CMs are chosen in an ad-hoc way (Engle and Ng, 1993; Harvey and Siddique, 1999; León et al., 2005), we give a data-driven method to scrutinize the parametric forms of the NIC functions by using the QCMs. Our findings suggest that the parametric forms of NIC functions for the conditional variance are appropriate, while those for the conditional skewness and kurtosis may be unsuitable.

It is worth noting that our QCM method essentially transforms the problem of CM estimation into that of conditional quantile estimation. This brings us two major advantages over the aforementioned parametric methods, although we need to carry out the conditional quantile estimation n different times to implement the QCM method, and could face the risk of obtaining inaccurate estimates of the conditional kurtosis for very heavy-tailed data.

First, the QCM method can largely reduce the risk of model mis-specification, since the QCMs are computed simultaneously without any prior estimation of the conditional mean, and their consistency holds even when the specifications of conditional quantiles are mis-specified. This advantage is attractive and unexpected, since we usually have to estimate the conditional mean, variance, and skewness (or kurtosis) successively via some correctly specified parametric models. The reason for this advantage is that the QCM method is regression-based. Specifically, the conditional mean formally becomes one part of the intercept parameter, so it has no impact on the QCMs, which are computed only from the OLS estimator of all non-intercept parameters; meanwhile, the impact of biased ECQs from the use of wrongly specified conditional quantile models can be aggregately offset by another part of the intercept parameter, ensuring the consistency of the QCMs to a large extent. In a sense, without specifying any parametric forms of the CMs, this important feature allows us to view the QCMs as the “observed” CMs, and consequently, many intriguing but previously hardly implementable empirical studies could become tractable based on the QCMs (see, e.g., our empirical studies on NICs for the CMs).

Second, the QCM method can numerically deliver more stable estimators of the CMs than the parametric methods. As shown in Jondeau and Rockinger (2003), there exists a moment issue that places a necessary nonlinear constraint on the conditional skewness and kurtosis, leading to a complex restriction on the admissible region of model parameters. This restriction not only raises the computational burden of parameter estimation but also makes the estimation results unstable, so it has rarely been considered in the existing parametric methods. In contrast, the QCM method directly computes the QCMs at each fixed timepoint, and this feature ensures that the nonlinear constraint on the conditional skewness and kurtosis can be simply examined using the computed QCMs at each timepoint. In particular, if this nonlinear constraint is violated at some timepoints, it is straightforward to replace the OLS estimator with a constrained least squares estimator to obtain QCMs that satisfy the constraint automatically.

The remainder of the paper is organized as follows. Section 2 proposes the QCMs based on the linear regression, and discusses the issues of the conditional mean and moment constraints. Section 3 establishes the asymptotics of the QCMs. Section 4 provides practical implementations of the QCMs. Simulation studies are given in Section 5. An application studying the QCMs for three exchange rates and their related NICs is offered in Section 6. Concluding remarks are presented in Section 7. Proofs and some additional simulation results are deferred to the supplementary materials.

2 Quantiled Conditional Moments

2.1 Definition

Let \{y_{1},...,y_{T}\} be a time series of interest with length T, and \mathcal{F}_{t}\equiv\sigma(y_{s};s\leqslant t) be its available information set up to time t. Given \mathcal{F}_{t-1}, the conditional mean, variance, skewness, and kurtosis of y_{t} at timepoint t are defined as \mu_{t}=E(y_{t}|\mathcal{F}_{t-1}) and

\displaystyle h_{t}=E[(y_{t}-\mu_{t})^{2}|\mathcal{F}_{t-1}],\text{ }s_{t}=E\Big{(}\Big{(}\frac{y_{t}-\mu_{t}}{\sqrt{h_{t}}}\Big{)}^{3}|\mathcal{F}_{t-1}\Big{)},\text{ }k_{t}=E\Big{(}\Big{(}\frac{y_{t}-\mu_{t}}{\sqrt{h_{t}}}\Big{)}^{4}|\mathcal{F}_{t-1}\Big{)}, (2.1)

respectively. Below, we show how to estimate these three conditional moments in (2.1) by using the Cornish-Fisher expansion (Cornish and Fisher, 1938) at a fixed timepoint t.

Let Q_{t}(\alpha) be the conditional quantile of y_{t} at the quantile level \alpha\in(0,1). According to the Cornish-Fisher expansion, we have

Q_{t}(\alpha)=\mu_{t}+\sqrt{h_{t}}\Big{[}x+(x^{2}-1)\frac{s_{t}}{6}+(x^{3}-3x)\frac{k_{t}-3}{24}+r_{t}(\alpha)\Big{]}, (2.2)
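As a minimal numerical illustration (not part of the paper), the expansion (2.2) can be evaluated with the remainder r_{t}(\alpha) set to zero. For a Gaussian y_{t}, where s_{t}=0 and k_{t}=3, the truncated expansion should reproduce the exact quantiles Q_{t}(\alpha)=\mu_{t}+\sqrt{h_{t}}\Phi^{-1}(\alpha); the parameter values below are illustrative assumptions.

```python
# Truncated Cornish-Fisher expansion (2.2), assuming r_t(alpha) = 0.
from statistics import NormalDist

def cf_quantile(alpha, mu, h, s, k):
    """Cornish-Fisher approximation of the alpha-quantile of y_t."""
    x = NormalDist().inv_cdf(alpha)  # x = Phi^{-1}(alpha)
    return mu + h ** 0.5 * (x + (x**2 - 1) * s / 6 + (x**3 - 3 * x) * (k - 3) / 24)

# Sanity check: for a N(mu, h) variable (s = 0, k = 3) the truncated
# expansion coincides with the exact normal quantile.
mu, h = 0.1, 4.0
for alpha in (0.05, 0.5, 0.95):
    exact = NormalDist(mu, h ** 0.5).inv_cdf(alpha)
    assert abs(exact - cf_quantile(alpha, mu, h, s=0.0, k=3.0)) < 1e-12
```

For non-Gaussian y_{t}, the dropped remainder r_{t}(\alpha) is generally non-zero, which is exactly the expansion error discussed below.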

where x=\Phi^{-1}(\alpha) with \Phi(\cdot) being the distribution function of N(0,1), and r_{t}(\alpha) contains all remaining terms on the higher-order conditional moments. Taking n quantile levels \alpha_{i}, i=1,...,n, equation (2.2) entails the following regression model with deterministic explanatory variables \bm{X}_{i} but random coefficients \bm{\beta}_{t}:

\displaystyle Y_{t,i}^{\ast}=\mu_{t}+\bm{X}_{i}^{\prime}\bm{\beta}_{t}+\varepsilon_{t,i}^{\ast},\,\,i=1,...,n, (2.3)

where Y_{t,i}^{\ast}=Q_{t}(\alpha_{i}), \bm{X}_{i}=(x_{i},x_{i}^{2}-1,x_{i}^{3}-3x_{i})^{\prime} with x_{i}=\Phi^{-1}(\alpha_{i}),

\displaystyle\bm{\beta}_{t}\equiv(\beta_{1t},\beta_{2t},\beta_{3t})^{\prime}=\Big{(}\sqrt{h_{t}},\frac{\sqrt{h_{t}}s_{t}}{6},\frac{\sqrt{h_{t}}(k_{t}-3)}{24}\Big{)}^{\prime}, (2.4)

and \varepsilon_{t,i}^{\ast}=\sqrt{h_{t}}r_{t}(\alpha_{i}). We call \varepsilon_{t,i}^{\ast} the expansion error, since it comes from the Cornish-Fisher expansion but cannot be adequately explained by \bm{X}_{i}.

Next, we aim to obtain the estimators of h_{t}, s_{t}, and k_{t} through the estimator of \bm{\beta}_{t} in (2.4). To achieve this goal, we replace the unobserved Y_{t,i}^{\ast} with its estimator Y_{t,i}, and then rewrite model (2.3) as follows:

\displaystyle Y_{t,i}=\mu_{t}+\bm{X}_{i}^{\prime}\bm{\beta}_{t}+\varepsilon_{t,i}^{\bullet},\,\,i=1,...,n, (2.5)

where Y_{t,i}=\widehat{Q}_{t}(\alpha_{i}) with \widehat{Q}_{t}(\alpha_{i}) being an estimator of Q_{t}(\alpha_{i}), and \varepsilon_{t,i}^{\bullet}=\varepsilon_{t,i}^{\ast}+\varepsilon_{t,i}^{\circ} with \varepsilon_{t,i}^{\circ}=\widehat{Q}_{t}(\alpha_{i})-Q_{t}(\alpha_{i}). Clearly, \varepsilon_{t,i}^{\circ} quantifies the error caused by using \widehat{Q}_{t}(\alpha_{i}) to approximate Q_{t}(\alpha_{i}), so it can be termed the approximation error. Consequently, \varepsilon_{t,i}^{\bullet}, being the sum of \varepsilon_{t,i}^{\ast} and \varepsilon_{t,i}^{\circ}, can be viewed as the gross error. We should mention that any two quantile levels \alpha_{i} and \alpha_{j} in (2.5) are allowed to be the same, as long as Y_{t,i} and Y_{t,j} are different due to the use of two different conditional quantile estimation methods. In other words, model (2.5) allows us to simply pool different information on conditional quantiles from different estimation methods at any fixed quantile level.

Although \varepsilon_{t,i}^{\bullet} is expected to have values oscillating around zero, it may not always have mean zero. Therefore, for the purpose of identification, we add a deterministic term \gamma_{t} into model (2.5) to form the following regression model:

\displaystyle Y_{t,i}=(\mu_{t}+\gamma_{t})+\bm{X}_{i}^{\prime}\bm{\beta}_{t}+\varepsilon_{t,i}\equiv\bm{Z}_{i}^{\prime}\bm{\theta}_{t}+\varepsilon_{t,i},\,\,i=1,...,n, (2.6)

where \varepsilon_{t,i}=\varepsilon_{t,i}^{\bullet}-\gamma_{t}, \bm{Z}_{i}=(1,\bm{X}_{i}^{\prime})^{\prime}, and \bm{\theta}_{t}=(\beta_{0t},\bm{\beta}_{t}^{\prime})^{\prime} with the intercept parameter \beta_{0t}=\mu_{t}+\gamma_{t}.

Let \bm{Y}_{t} be an n\times 1 vector with entries Y_{t,i}, \bm{Z} be an n\times 4 matrix with rows \bm{Z}_{i}^{\prime}, and \bm{\varepsilon}_{t} be an n\times 1 vector with entries \varepsilon_{t,i}. Then, the ordinary least squares (OLS) estimator of \bm{\theta}_{t} in (2.6) is

\displaystyle\widehat{\bm{\theta}}_{t}\equiv(\widehat{\beta}_{0t},\widehat{\bm{\beta}}_{t}^{\prime})^{\prime}=(\bm{Z}^{\prime}\bm{Z})^{-1}\bm{Z}^{\prime}\bm{Y}_{t}. (2.7)

According to (2.4), we naturally use \widehat{\bm{\beta}}_{t}\equiv(\widehat{\beta}_{1t},\widehat{\beta}_{2t},\widehat{\beta}_{3t})^{\prime} in (2.7) to propose the estimators \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} for h_{t}, s_{t}, and k_{t}, respectively, where

\widehat{h}_{t}=\widehat{\beta}_{1t}^{\,2},\,\,\widehat{s}_{t}=\frac{6\widehat{\beta}_{2t}}{\widehat{\beta}_{1t}},\text{ and }\widehat{k}_{t}=\frac{24\widehat{\beta}_{3t}}{\widehat{\beta}_{1t}}+3. (2.8)

We call \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} the quantiled conditional variance, skewness, and kurtosis of y_{t}, since they are estimators of h_{t}, s_{t}, and k_{t} based on the estimated conditional quantiles (ECQs) of y_{t}. Clearly, provided n different ECQs (that is, n different Y_{t,i} in (2.6)), our three quantiled conditional moments (QCMs) in (2.8) are easy to implement, since their computation only relies on the OLS estimator \widehat{\bm{\theta}}_{t}.
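The pipeline (2.6)-(2.8) amounts to one OLS fit per timepoint. The following is a minimal sketch, assuming the n ECQs are already given; here we feed in exact Gaussian quantiles purely for illustration (so the gross error is zero), whereas in practice the Y_{t,i} would come from estimated conditional quantile models.

```python
# Sketch of the QCM computation: OLS in (2.7), then the mapping (2.8).
import numpy as np
from statistics import NormalDist

def qcm(alphas, ecqs):
    """Return (h_hat, s_hat, k_hat) from n ECQs at quantile levels alphas."""
    x = np.array([NormalDist().inv_cdf(a) for a in alphas])       # x_i = Phi^{-1}(alpha_i)
    Z = np.column_stack([np.ones_like(x), x, x**2 - 1, x**3 - 3 * x])  # rows Z_i'
    theta, *_ = np.linalg.lstsq(Z, np.asarray(ecqs), rcond=None)  # OLS estimator (2.7)
    b1, b2, b3 = theta[1], theta[2], theta[3]
    return b1**2, 6 * b2 / b1, 24 * b3 / b1 + 3                   # QCMs in (2.8)

# Illustration: exact N(mu, h) quantiles should recover h, s = 0, k = 3.
alphas = np.linspace(0.01, 0.99, 99)
mu, h = 0.5, 2.0
ecqs = [NormalDist(mu, h**0.5).inv_cdf(a) for a in alphas]
h_hat, s_hat, k_hat = qcm(alphas, ecqs)
```

Note that the conditional mean mu enters only through the intercept column and therefore never affects the three QCMs, mirroring the discussion in Section 2.2.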

2.2 The conditional mean issue

Using \widehat{\beta}_{0t} in (2.7), we can estimate \beta_{0t} but not \mu_{t}, due to the presence of \gamma_{t}. Hence, we are unable to form a quantiled conditional mean of y_{t} to estimate \mu_{t}. Interestingly, our way to compute \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} does not require an estimator of \mu_{t}. This is unexpected, since normally we have to first estimate (or model) \mu_{t} and then h_{t}, s_{t}, and k_{t}.

Although the estimation of \mu_{t} is not required to compute \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t}, knowing the dynamic structure of \mu_{t} is still important in practice. Note that \mu_{t} is often assumed to be an unknown constant in accordance with the efficient market hypothesis, and this constant assumption can be examined by the consistent spectral tests for the martingale difference hypothesis (MDH) in Escanciano and Velasco (2006). If the constant assumption is rejected by these tests, the dynamic structure of \mu_{t} manifests itself and is usually specified by a linear model (e.g., the autoregressive moving-average model) or a nonlinear model (e.g., the threshold autoregressive model); see Fan and Yao (2003) and Tsay (2005) for surveys. In this case, the model correctly specifies the dynamic structure of \mu_{t} if and only if its model error is an MD sequence, a statement which can be consistently checked by the two spectral tests for the MDH on unobserved model errors in Escanciano (2006). Hence, it is usually tractable for practitioners to come up with a valid parametric model for \mu_{t} in most applications.

2.3 The moment constraints issue

Note that \mu_{t}, h_{t}, s_{t}, and k_{t} can be expressed in terms of the first four non-central moments m_{1t}, m_{2t}, m_{3t}, and m_{4t} of y_{t}, where m_{jt}=E(y_{t}^{j}|\mathcal{F}_{t-1}). Therefore, the existence of \mu_{t}, h_{t}, s_{t}, and k_{t} is equivalent to that of m_{1t}, m_{2t}, m_{3t}, and m_{4t}, and the latter requires the existence of a non-decreasing function F_{t}(\cdot) such that

m_{jt}=\int_{-\infty}^{\infty}x^{j}dF_{t}(x).

To ensure this existence, Theorem 12.a in Widder (1946) indicates that the following condition must hold for m_{1t}, m_{2t}, m_{3t}, and m_{4t}:

\det\begin{pmatrix}m_{0t}&m_{1t}\\ m_{1t}&m_{2t}\end{pmatrix}\geq 0\,\,\,\mbox{ and }\,\,\,\det\begin{pmatrix}m_{0t}&m_{1t}&m_{2t}\\ m_{1t}&m_{2t}&m_{3t}\\ m_{2t}&m_{3t}&m_{4t}\end{pmatrix}\geq 0. (2.9)

By some direct calculations, it is not hard to see that condition (2.9) is equivalent to

h_{t}\geq 0\,\,\,\mbox{ and }\,\,\,k_{t}-s_{t}^{2}-1\geq 0. (2.10)

Condition (2.10) places two necessary moment constraints on h_{t}, s_{t}, and k_{t}. When h_{t}, s_{t}, and k_{t} are specified by some parametric models with unknown parameters, the first moment constraint can usually be easily handled, but the second moment constraint restricts the admissible region of the unknown parameters in a very complex way, so that the model estimation becomes quite inconvenient and unstable. This is the reason why the second moment constraint has rarely been taken into account in the literature, except in Jondeau and Rockinger (2003).

Impressively, the moment constraints issue above is not an obstacle for our QCMs, since the QCMs estimate the CMs directly at each fixed timepoint t. In view of the relationship between the QCMs and \widehat{\bm{\beta}}_{t} in (2.8), we know that the QCMs satisfy the two constraints in (2.10) if and only if \widehat{\beta}_{1t}^{2}\geq 0 and \widehat{\beta}_{1t}^{2}-18\widehat{\beta}_{2t}^{2}+12\widehat{\beta}_{1t}\widehat{\beta}_{3t}\geq 0. Since the constraint \widehat{\beta}_{1t}^{2}\geq 0 holds automatically, we indeed only need to check whether

\widehat{\beta}_{1t}^{2}-18\widehat{\beta}_{2t}^{2}+12\widehat{\beta}_{1t}\widehat{\beta}_{3t}\geq 0. (2.11)

In practice, the constraint in (2.11) can be directly examined after the QCMs are computed. Our applications in Section 6 below show that this constraint holds at all examined timepoints. In other applications, if the constraint in (2.11) does not hold at some timepoints t, we can easily re-estimate \bm{\theta}_{t} in (2.6) by the constrained least squares estimation method with the constraint \beta_{1t}^{2}-18\beta_{2t}^{2}+12\beta_{1t}\beta_{3t}\geq 0, so that the resulting QCMs satisfy the constraint in (2.11) automatically at these timepoints.
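The equivalence between (2.11) and the second constraint in (2.10) follows by substituting (2.8). A short sketch of the check, with hypothetical coefficient values for illustration:

```python
# Check (2.11) on computed OLS coefficients, and verify its equivalence to
# k_t - s_t^2 - 1 >= 0 under the mapping (2.8).  The values of b below are
# hypothetical placeholders, not estimates from real data.
def satisfies_moment_constraint(b1, b2, b3):
    """Check beta_1t^2 - 18*beta_2t^2 + 12*beta_1t*beta_3t >= 0, i.e. (2.11)."""
    return b1**2 - 18 * b2**2 + 12 * b1 * b3 >= 0

def implied_k_minus_s2_minus_1(b1, b2, b3):
    """The quantity k_t - s_t^2 - 1 implied by (2.8)."""
    s = 6 * b2 / b1
    k = 24 * b3 / b1 + 3
    return k - s**2 - 1

b = (1.2, 0.1, 0.05)   # hypothetical (beta_1t, beta_2t, beta_3t)
assert satisfies_moment_constraint(*b) == (implied_k_minus_s2_minus_1(*b) >= 0)
```

Multiplying k_t - s_t^2 - 1 >= 0 through by \beta_{1t}^{2}/2 gives exactly the quadratic inequality in (2.11), which is why the two checks agree.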

3 Asymptotics

This section studies the asymptotics of \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} at a fixed timepoint t. Let \overset{p}{\longrightarrow} denote convergence in probability. To derive the consistency of \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} in (2.8), the following assumptions are needed.

Assumption 3.1.

\bm{M}_{n}\equiv\bm{Z}^{\prime}\bm{Z}/n is uniformly positive definite.

Assumption 3.2.

\bm{Z}^{\prime}\bm{\varepsilon}_{t}/n\overset{p}{\longrightarrow}\boldsymbol{0} as n\to\infty.

We offer some remarks on the aforementioned assumptions. Assumption 3.1 is regular for linear regression models, and it holds as long as \bm{Z} has full rank (i.e., Rank(\bm{Z})=4). Because \varepsilon_{t,i}=\varepsilon_{t,i}^{\ast}+\varepsilon_{t,i}^{\circ}-\gamma_{t}, Assumption 3.2 is equivalent to

\displaystyle C_{t,\ast}+C_{t,\circ}-C_{t,\gamma}\overset{p}{\longrightarrow}\boldsymbol{0}\mbox{ as }n\to\infty, (3.1)

where C_{t,\ast}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\varepsilon_{t,i}^{\ast}, C_{t,\circ}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\varepsilon_{t,i}^{\circ}, and C_{t,\gamma}=\big{(}n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\big{)}\gamma_{t}. By the law of large numbers for dependent and heteroscedastic data sequences (Andrews, 1988), it is reasonable to assert that C_{t,\ast}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}E(\varepsilon_{t,i}^{\ast})+o_{p}(1) and C_{t,\circ}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}E(\varepsilon_{t,i}^{\circ})+o_{p}(1). Then, since \varepsilon_{t,i}^{\bullet}=\varepsilon_{t,i}^{\ast}+\varepsilon_{t,i}^{\circ}, condition (3.1) holds if

\displaystyle\frac{1}{n}\sum_{i=1}^{n}\bm{Z}_{i}\big{[}E(\varepsilon_{t,i}^{\bullet})-\gamma_{t}\big{]}\longrightarrow\boldsymbol{0}\mbox{ as }n\to\infty. (3.2)

Condition (3.2) reveals an important fact: the role of \gamma_{t} is to offset the possible non-identification effect caused by the non-zero mean of \varepsilon_{t,i}^{\bullet}. In other words, to achieve identification, \gamma_{t} should automatically tend to minimize the absolute difference

d_{n,t}\equiv\Big{|}\frac{1}{n}\sum_{i=1}^{n}\bm{Z}_{i}\big{[}E(\varepsilon_{t,i}^{\bullet})\big{]}-\Big{(}\frac{1}{n}\sum_{i=1}^{n}\bm{Z}_{i}\Big{)}\gamma_{t}\Big{|}

for large n. Clearly, if d_{n,t}\approx 0 for large n, condition (3.2) holds automatically, and then Assumption 3.2 most likely holds.

Next, we study the behavior of d_{n,t} in different cases. In the first case that E(\varepsilon_{t,i}^{\bullet})\approx c_{t} for all i, we have d_{n,t}\approx 0 with \gamma_{t}=c_{t}. In the second case that E(\varepsilon_{t,i}^{\bullet})\approx 0 for most i, we also have d_{n,t}\approx 0 with \gamma_{t}=0. In the third case that n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\big{[}E(\varepsilon_{t,i}^{\bullet})\big{]}\approx\boldsymbol{\tau}_{t}\approx\boldsymbol{z}f_{t} and n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\approx\boldsymbol{z} for large n, we again have d_{n,t}\approx 0 with \gamma_{t}=f_{t}. In other cases, we still have the chance to ensure d_{n,t}\approx 0, depending on the behavior of E(\varepsilon_{t,i}^{\bullet}) across i. In summary, the condition that E(\varepsilon_{t,i}^{\bullet})\approx 0 for all i is not necessary for the validity of Assumption 3.2. This implies that the QCMs are able to perform robustly across diverse error scenarios, including situations where E(\varepsilon_{t,i}^{\circ}) has large non-zero absolute values across i (that is, large biases of the ECQs caused by the use of mis-specified conditional quantile models).
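The first case above admits a simple numerical illustration: a constant bias c_{t} added to every ECQ is absorbed by the intercept of (2.6) and leaves the QCMs unchanged. The toy sketch below uses exact Gaussian quantiles with an artificial constant bias; the bias value 0.3 is an illustrative assumption.

```python
# Toy illustration: a constant ECQ bias (first case, E(eps) ≈ c_t) is
# absorbed by the intercept and does not move the QCMs in (2.8).
import numpy as np
from statistics import NormalDist

def qcm(alphas, ecqs):
    """QCMs from ECQs: OLS in (2.7), then the mapping (2.8)."""
    x = np.array([NormalDist().inv_cdf(a) for a in alphas])
    Z = np.column_stack([np.ones_like(x), x, x**2 - 1, x**3 - 3 * x])
    theta, *_ = np.linalg.lstsq(Z, np.asarray(ecqs), rcond=None)
    b1, b2, b3 = theta[1], theta[2], theta[3]
    return b1**2, 6 * b2 / b1, 24 * b3 / b1 + 3

alphas = np.linspace(0.05, 0.95, 50)
clean = np.array([NormalDist(0.0, 1.5).inv_cdf(a) for a in alphas])  # h = 2.25
biased = clean + 0.3                    # constant bias c_t = 0.3 in every ECQ
for h, s, k in (qcm(alphas, clean), qcm(alphas, biased)):
    assert abs(h - 2.25) < 1e-8 and abs(s) < 1e-8 and abs(k - 3) < 1e-8
```

Non-constant bias patterns across i correspond to the other cases discussed above, where the offset by \gamma_{t} is only approximate.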

As expected, the condition that d_{n,t}\approx 0 for large n may not always hold, and therefore, under certain circumstances, Assumption 3.2 may fail. For example, when the tail of y_{t} becomes heavier, the impact of the higher-order conditional moments in the Cornish-Fisher expansion becomes larger. In this case, E(\varepsilon_{t,i}^{\ast}) tends to have more erratic behavior across i, so that it is harder to offset the non-identification effect via \gamma_{t}, with the value of d_{n,t} farther away from zero. Indeed, our simulation studies in Section 5 below indicate that the presence of non-negligible \varepsilon_{t,i}^{\ast} has a larger impact on the consistency of \widehat{k}_{t} than on that of \widehat{h}_{t} and \widehat{s}_{t}, which perform robustly with respect to the heavy-tailedness of y_{t}.

The following theorem establishes the consistency of \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t}.

Theorem 3.1.

Suppose that Assumptions 3.1–3.2 hold. Then, \widehat{\bm{\theta}}_{t}-\bm{\theta}_{t}\overset{p}{\longrightarrow}0 as n\to\infty. Consequently, \widehat{h}_{t}-h_{t}\overset{p}{\longrightarrow}0, \widehat{s}_{t}-s_{t}\overset{p}{\longrightarrow}0, and \widehat{k}_{t}-k_{t}\overset{p}{\longrightarrow}0 as n\to\infty.

Remark 3.1.

If we assume \bm{Z}^{\prime}\bm{\varepsilon}_{t}/n\longrightarrow\boldsymbol{0} almost surely as n\to\infty in Assumption 3.2, all of the convergence results in Theorem 3.1 hold almost surely.

Remark 3.2.

In Theorem 3.1, we require large n but not large T. Certainly, a large T may improve the performance of the ECQs by reducing their biases; however, it is not necessary for the validity of Assumption 3.2 and thus the consistency of the QCMs.

Let \overset{d}{\longrightarrow} denote convergence in distribution. We impose the following stronger assumption to replace Assumption 3.2:

Assumption 3.3.

[\bm{V}_{t,n}]^{-1/2}\big{[}\bm{Z}^{\prime}\bm{\varepsilon}_{t}/\sqrt{n}\big{]}\overset{d}{\longrightarrow}N(0,\mathbf{I}) as n\to\infty, where \mathbf{I} is an identity matrix, and \bm{V}_{t,n}\equiv var(\bm{Z}^{\prime}\bm{\varepsilon}_{t}/\sqrt{n}) is bounded and uniformly positive definite.

Assumption 3.3 is regular for proving the asymptotic normality of the OLS estimator (see White (2001)). The theorem below shows that \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} are \sqrt{n}-consistent but not asymptotically normal.

Theorem 3.2.

Suppose that Assumptions 3.1 and 3.3 hold. Then,

(\bm{M}_{n}^{-1}\bm{V}_{t,n}\bm{M}_{n}^{-1})^{-1/2}\sqrt{n}(\widehat{\bm{\theta}}_{t}-\bm{\theta}_{t})\overset{d}{\longrightarrow}N(0,\mathbf{I})

as n\to\infty. Moreover, \sqrt{n}(\widehat{h}_{t}-h_{t})=O_{p}(1), \sqrt{n}(\widehat{s}_{t}-s_{t})=O_{p}(1), and \sqrt{n}(\widehat{k}_{t}-k_{t})=O_{p}(1), but \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} are not asymptotically normal.

Remark 3.3.

Although \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} are not asymptotically normal, the asymptotic normality of \widehat{\bm{\theta}}_{t} demonstrates that the quantiled volatility (the second entry of \widehat{\bm{\theta}}_{t}), denoted by \widehat{\sigma}_{t}, is asymptotically normal: (\bm{\Gamma}\bm{M}_{n}^{-1}\bm{V}_{t,n}\bm{M}_{n}^{-1}\bm{\Gamma}^{\prime})^{-1/2}\sqrt{n}(\widehat{\sigma}_{t}-\sigma_{t})\overset{d}{\longrightarrow}N(0,1) as n\to\infty, where \bm{\Gamma}=(0,1,0,0).

As shown above, the asymptotics of the QCMs in Theorems 3.1–3.2 hold without requiring a specification of the conditional mean or correctly specified conditional quantile models. This important feature guarantees that the QCM method can largely reduce the risk of model mis-specification. The reason for this feature is that the QCM method is regression-based. Specifically, the conditional mean \mu_{t} is absorbed into the intercept parameter \beta_{0t}, so that it has no impact on the QCMs; meanwhile, the biases of the ECQs from the use of wrongly specified conditional quantile models can be aggregately offset by the term \gamma_{t}, which is also nested in \beta_{0t}.

The aforementioned feature is accompanied by two limitations. The first limitation is that we need to estimate conditional quantiles n different times. Fortunately, this limitation seems mild, since the quantile estimation can commonly be carried out easily by the linear programming method, and the resulting estimation biases can be tolerated by the QCM method to a large extent. The second limitation is that when the data are very heavy-tailed, the expansion error could have a non-negligible impact, causing an identification problem, particularly for the conditional kurtosis. This limitation seems an unavoidable consequence of the Cornish-Fisher expansion, and it cannot be addressed by simply increasing the order of the expansion (Lee and Lin, 1992).

4 Practical Implementations

To compute the three QCMs in (2.8), we only need to input n different ECQs \widehat{Q}_{t}(\alpha_{i}), which can be computed in many different ways; see, for example, McNeil and Frey (2000), Kuester et al. (2006), Xiao and Koenker (2009) and the references therein for earlier works, and Koenker et al. (2017) and Zheng et al. (2018) for more recent ones. Without assuming any parametric specifications of the CMs, Engle and Manganelli (2004) propose a general class of CAViaR models, which can flexibly specify the dynamics of conditional quantiles. Hence, the CAViaR models are appropriate choices for us to compute \widehat{Q}_{t}(\alpha_{i}). Following Engle and Manganelli (2004), we consider the four CAViaR models below:

1. Symmetric Absolute Value (SAV) model: Q_{t}(\alpha)=\psi_{1,0}+\psi_{2,0}Q_{t-1}(\alpha)+\psi_{3,0}|y_{t-1}|;

2. Asymmetric Slope (AS) model: Q_{t}(\alpha)=\psi_{1,0}+\psi_{2,0}Q_{t-1}(\alpha)+\psi_{3,0}(y_{t-1})^{+}+\psi_{4,0}(y_{t-1})^{-}, where (y_{t-1})^{+}=\text{max}(y_{t-1},0) and (y_{t-1})^{-}=\text{min}(y_{t-1},0);

3. Indirect GARCH (IG) model: Q_{t}(\alpha)=(\psi_{1,0}+\psi_{2,0}Q^{2}_{t-1}(\alpha)+\psi_{3,0}y^{2}_{t-1})^{1/2};

4. Adaptive (ADAP) model: Q_{t}(\alpha)=Q_{t-1}(\alpha)+\psi_{1,0}\{[1+\text{exp}(N[y_{t-1}-Q_{t-1}(\alpha)])]^{-1}-\alpha\}, where N is a positive finite number.

Each CAViaR model above can be estimated via the classical quantile regression method (Koenker and Bassett, 1978). For simplicity, we take the SAV model as an illustrative example. Let 𝝍=(ψ1,ψ2,ψ3)\bm{\psi}=(\psi_{1},\psi_{2},\psi_{3})^{\prime} be the unknown parameter of the SAV model, and 𝝍0=(ψ1,0,ψ2,0,ψ3,0)\bm{\psi}_{0}=(\psi_{1,0},\psi_{2,0},\psi_{3,0})^{\prime} be its true value. As in Engle and Manganelli (2004), we estimate 𝝍0\bm{\psi}_{0} by the quantile estimator 𝝍^nargmin𝝍t=1Tρα(ytQt(α,𝝍))\widehat{\bm{\psi}}_{n}\equiv\arg\min_{\bm{\psi}}\sum_{t=1}^{T}\rho_{\alpha}(y_{t}-Q_{t}(\alpha,\bm{\psi})), where ρα(x)=x[αI(x<0)]\rho_{\alpha}(x)=x[\alpha-\mbox{I}(x<0)] is the check function, and Qt(α,𝝍)Q_{t}(\alpha,\bm{\psi}) is defined in the same way as Qt(α)Q_{t}(\alpha) in the SAV model with 𝝍0\bm{\psi}_{0} replaced by 𝝍\bm{\psi}. Once 𝝍^n\widehat{\bm{\psi}}_{n} is obtained, we take Q^t(α)Qt(α,𝝍^n)\widehat{Q}_{t}(\alpha)\equiv Q_{t}(\alpha,\widehat{\bm{\psi}}_{n}) as our ECQ.
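To make this step concrete, the following minimal Python sketch estimates the SAV model by minimizing the check loss. The Nelder-Mead optimizer, the starting values, and the initialization of Q1Q_{1} at the empirical α\alpha-quantile are our own illustrative choices, not prescribed by the text.

```python
import numpy as np
from scipy.optimize import minimize

def sav_quantiles(psi, y, alpha):
    """Filter the SAV recursion Q_t = psi1 + psi2*Q_{t-1} + psi3*|y_{t-1}|."""
    Q = np.empty(len(y))
    Q[0] = np.quantile(y, alpha)          # assumed initialization of Q_1
    for t in range(1, len(y)):
        Q[t] = psi[0] + psi[1] * Q[t - 1] + psi[2] * abs(y[t - 1])
    return Q

def check_loss(psi, y, alpha):
    """Sum of check-function losses rho_alpha(y_t - Q_t(alpha, psi))."""
    u = y - sav_quantiles(psi, y, alpha)
    return np.sum(u * (alpha - (u < 0)))

def fit_sav(y, alpha, psi0=(0.0, 0.8, -0.5)):
    """Minimize the check loss over psi (Nelder-Mead is our own choice)."""
    return minimize(check_loss, np.asarray(psi0), args=(y, alpha),
                    method="Nelder-Mead").x

rng = np.random.default_rng(0)
y = rng.standard_normal(500)
psi_hat = fit_sav(y, alpha=0.05)
Q_hat = sav_quantiles(psi_hat, y, 0.05)   # the ECQ sequence at alpha = 0.05
```

Since the check loss is non-smooth and the recursion makes it non-convex in 𝝍\bm{\psi}, trying several starting values is advisable in practice.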

Using the above CAViaR models, we can obtain different Q^t(α)\widehat{Q}_{t}(\alpha). However, at some quantile levels α\alpha, some of the models may be inadequate to specify the dynamic structure of Qt(α)Q_{t}(\alpha), resulting in invalid Q^t(α)\widehat{Q}_{t}(\alpha). To screen out those invalid Q^t(α)\widehat{Q}_{t}(\alpha) before computing the QCMs, we consider the in-sample dynamic quantile (DQ) test DQIS(α)\mbox{DQ}_{IS}(\alpha) in Section 6 of Engle and Manganelli (2004). The test DQIS(α)\mbox{DQ}_{IS}(\alpha) aims to detect the inadequacy of CAViaR models by examining whether 𝑿¯t(α)Hitt(α)\bar{\bm{X}}_{t}(\alpha)\mbox{Hit}_{t}(\alpha) has mean zero, where Hitt(α)I(yt<Qt(α))α\mbox{Hit}_{t}(\alpha)\equiv\mbox{I}(y_{t}<Q_{t}(\alpha))-\alpha and 𝑿¯t(α)(Hitt1(α),,Hitt4(α))\bar{\bm{X}}_{t}(\alpha)\equiv(\mbox{Hit}_{t-1}(\alpha),...,\mbox{Hit}_{t-4}(\alpha))^{\prime}. The testing idea of DQIS(α)\mbox{DQ}_{IS}(\alpha) relies on the fact that 𝑿¯t(α)Hitt(α)\bar{\bm{X}}_{t}(\alpha)\mbox{Hit}_{t}(\alpha) has mean zero when the CAViaR model specifies the dynamic structure of Qt(α)Q_{t}(\alpha) correctly. Based on the sequence {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} from a given CAViaR model, we can compute DQIS(α)\mbox{DQ}_{IS}(\alpha) and then its p-value P(ξ>DQIS(α))P(\xi>\mbox{DQ}_{IS}(\alpha)), where ξχ42\xi\sim\chi_{4}^{2} (a Chi-squared distribution with 44 degrees of freedom). If the p-value of DQIS(α)\mbox{DQ}_{IS}(\alpha) is less than pp^{*}, the corresponding ECQs {Q^t(α)}\{\widehat{Q}_{t}(\alpha)\} are deemed invalid, so they are excluded from the computation of the QCMs.
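The testing idea can be sketched as a quadratic form in the lagged Hit variables. Note that the exact DQIS(α)\mbox{DQ}_{IS}(\alpha) statistic of Engle and Manganelli (2004) involves a particular regressor matrix and normalization, so the version below is only a simplified illustration under our own assumptions.

```python
import numpy as np
from scipy.stats import chi2, norm

def dq_pvalue(y, Q, alpha, n_lags=4):
    """Simplified in-sample DQ test: project Hit_t(alpha) on its first
    n_lags lags and compare the normalized quadratic form with chi2(n_lags)."""
    hit = (y < Q).astype(float) - alpha               # Hit_t(alpha)
    H = hit[n_lags:]                                  # Hit_t
    X = np.column_stack([hit[n_lags - j:-j] for j in range(1, n_lags + 1)])
    proj = X @ np.linalg.solve(X.T @ X, X.T @ H)      # projection of H on lags
    stat = H @ proj / (alpha * (1.0 - alpha))
    return chi2.sf(stat, df=n_lags)

# under a correctly specified (here: constant, i.i.d.) quantile, the p-value
# should be roughly uniform, so rejections at small levels are rare
rng = np.random.default_rng(1)
y = rng.standard_normal(1000)
p = dq_pvalue(y, np.full(1000, norm.ppf(0.05)), alpha=0.05)
```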

Below, we summarize our aforementioned procedure to compute the QCMs:

Procedure 4.1. (The steps to compute h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t})
  1. 1.

    Obtain {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} at any quantile level α\alpha in 𝜶\bm{\alpha} based on any CAViaR model in 𝓜\mathcal{\bm{M}}, where 𝜶[0.01:0.01:0.99]\bm{\alpha}\equiv[0.01:0.01:0.99] is a sequence of real numbers from 0.010.01 to 0.990.99 incrementing by 0.010.01, and 𝓜{SAV,AS,IG,ADAP}\mathcal{\bm{M}}\equiv\{\mbox{SAV},\,\,\mbox{AS},\,\,\mbox{IG},\,\,\mbox{ADAP}\}.

  2. 2.

    Apply the DQ test DQIS(α)\mbox{DQ}_{IS}(\alpha) to each {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} from Step 1, and discard those {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} with p-values of DQIS(α)\mbox{DQ}_{IS}(\alpha) less than pp^{*}.

  3. 3.

    Group all remaining Q^t(α)\widehat{Q}_{t}(\alpha) to form a set 𝑺t\bm{S}_{t} at each given tt. Then, take the ii-th entry of 𝑺t\bm{S}_{t} to be Yt,iY_{t,i} in (2.6), and use its corresponding quantile level to compute 𝑿i\bm{X}_{i} in (2.6), where i=1,,n0i=1,...,n_{0}, and n0n_{0} is the size of 𝑺t\bm{S}_{t}.

  4. 4.

    Based on {Yt,i,𝑿i}i=1n0\{Y_{t,i},\bm{X}_{i}\}_{i=1}^{n_{0}} from Step 3, compute the OLS estimator 𝜽^t\widehat{\bm{\theta}}_{t} in (2.7) and then the three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} in (2.8).

In Procedure 4.1, the value of n0n_{0} decreases with that of pp^{*}, and it achieves the upper bound n=99×4n=99\times 4 when p=0p^{*}=0 (i.e., no ECQs are discarded). Clearly, the choice of pp^{*} reveals a trade-off between estimation reliability and estimation efficiency in the QCM method, since a large value of pp^{*} enhances the reliability of Q^t(α)\widehat{Q}_{t}(\alpha) but reduces the efficiency of 𝜽^t\widehat{\bm{\theta}}_{t} as the value of n0n_{0} becomes small. So far, how to choose pp^{*} optimally remains unclear. Our additional simulation results in the supplementary materials show that p=0.1p^{*}=0.1 is a good choice to balance the bias and variance of the estimation error of the QCMs. Hence, we recommend taking p=0.1p^{*}=0.1 in practice.
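For illustration, the OLS step (Steps 3–4) can be sketched as follows. We assume the textbook Cornish-Fisher regression form Qt(α)θ1+θ2zα+θ3(zα21)+θ4(zα33zα)Q_{t}(\alpha)\approx\theta_{1}+\theta_{2}z_{\alpha}+\theta_{3}(z_{\alpha}^{2}-1)+\theta_{4}(z_{\alpha}^{3}-3z_{\alpha}) with h=θ22h=\theta_{2}^{2}, s=6θ3/θ2s=6\theta_{3}/\theta_{2}, and k=24θ4/θ2+3k=24\theta_{4}/\theta_{2}+3; the exact regressors in (2.6) and recovery formulas in (2.8) may be parameterized differently.

```python
import numpy as np
from scipy.stats import norm

def qcm_ols(q_hat, alphas):
    """OLS of estimated conditional quantiles on Cornish-Fisher regressors,
    then recovery of the quantiled conditional variance, skewness, kurtosis."""
    z = norm.ppf(alphas)
    X = np.column_stack([np.ones_like(z), z, z**2 - 1.0, z**3 - 3.0 * z])
    t1, t2, t3, t4 = np.linalg.lstsq(X, q_hat, rcond=None)[0]
    return t2**2, 6.0 * t3 / t2, 24.0 * t4 / t2 + 3.0

# sanity check with exact N(0.2, 1.5^2) quantiles: the expansion is exact
# for the normal distribution, so (h, s, k) = (2.25, 0, 3) should be recovered
alphas = np.arange(0.01, 1.0, 0.01)                  # the grid used in Step 1
h, s, k = qcm_ols(0.2 + 1.5 * norm.ppf(alphas), alphas)
```

Note that no estimate of the conditional mean is needed: the intercept absorbs it, consistent with the regression-based feature of the QCM method.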

5 Simulations

This section examines the finite-sample performance of the three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t}. To save space, some additional simulation results on the selection of pp^{*} are reported in the supplementary materials.

5.1 Simulations on the GARCH model

We examine the performance of QCMs when the data are generated by the benchmark GARCH model. Specifically, we generate 100 replications of sample size T=1000T=1000 from the following GARCH model (Bollerslev, 1986)

yt=ηtσt and σt2=ω+αyt12+βσt12,y_{t}=\eta_{t}\sigma_{t}\mbox{ and }\sigma_{t}^{2}=\omega+\alpha y_{t-1}^{2}+\beta\sigma_{t-1}^{2}, (5.1)

where the parameters are chosen as ω=0.1\omega=0.1, α=0.1\alpha=0.1, and β=0.8\beta=0.8, and {ηt}t=1T\{\eta_{t}\}_{t=1}^{T} is an i.i.d. sequence with ηtN(0,1)\eta_{t}\sim N(0,1) and STνtST_{\nu_{t}}. Here, STνST_{\nu} is the standardized tνt_{\nu} distribution with mean zero and variance one, and νt\nu_{t} is generated from the Uniform distribution over the interval [5,20][5,20]. For each replication, we can easily see that when ηtN(0,1)\eta_{t}\sim N(0,1), the true values of CMs and conditional quantiles of yty_{t} in (5.1) are

μt=0,ht=σt2,st=0,kt=3,Qt(α)=σt×quantile of N(0,1) at level α;\mu_{t}=0,h_{t}=\sigma_{t}^{2},s_{t}=0,k_{t}=3,Q_{t}(\alpha)=\sigma_{t}\times\mbox{quantile of }N(0,1)\mbox{ at level }\alpha;

when ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, those of yty_{t} are

μt=0,ht=σt2,st=0,kt=6νt4+3,Qt(α)=σt×quantile of STνt at level α.\mu_{t}=0,h_{t}=\sigma_{t}^{2},s_{t}=0,k_{t}=\frac{6}{\nu_{t}-4}+3,Q_{t}(\alpha)=\sigma_{t}\times\mbox{quantile of }ST_{\nu_{t}}\mbox{ at level }\alpha.

Note that the expansion errors εt,i\varepsilon_{t,i}^{*} from the Cornish-Fisher expansion in regression model (2.6) are negligible when the conditional distribution of yty_{t} is normal in the case of ηtN(0,1)\eta_{t}\sim N(0,1), whereas εt,i\varepsilon_{t,i}^{*} are non-negligible when the conditional distribution of yty_{t} is heavy-tailed in the case of ηtSTνt\eta_{t}\sim ST_{\nu_{t}}; see Lee and Lin (1992).
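For concreteness, one replication of this design with ηtSTνt\eta_{t}\sim ST_{\nu_{t}} can be generated as follows; initializing σ12\sigma_{1}^{2} at the unconditional variance is our own choice.

```python
import numpy as np
from scipy.stats import t as student_t

def simulate_garch_st(T, omega=0.1, a=0.1, b=0.8, seed=0):
    """Simulate y_t = eta_t * sigma_t from (5.1) with standardized t
    innovations whose degrees of freedom nu_t are drawn from U[5, 20]."""
    rng = np.random.default_rng(seed)
    nu = rng.uniform(5.0, 20.0, size=T)
    # standardize: scale a t_nu draw by sqrt((nu - 2) / nu) to get variance 1
    eta = student_t.rvs(nu, random_state=rng) * np.sqrt((nu - 2.0) / nu)
    sigma2, y = np.empty(T), np.empty(T)
    sigma2[0] = omega / (1.0 - a - b)                 # unconditional variance
    y[0] = eta[0] * np.sqrt(sigma2[0])
    for t in range(1, T):
        sigma2[t] = omega + a * y[t - 1] ** 2 + b * sigma2[t - 1]
        y[t] = eta[t] * np.sqrt(sigma2[t])
    return y, sigma2, nu

def true_quantile(sigma2_t, nu_t, a):
    """Q_t(a) = sigma_t * (quantile of ST_nu at level a)."""
    return np.sqrt(sigma2_t) * student_t.ppf(a, nu_t) * np.sqrt((nu_t - 2.0) / nu_t)

y, sigma2, nu = simulate_garch_st(1000)
k_true = 6.0 / (nu - 4.0) + 3.0                       # true conditional kurtosis
```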

Next, we generate the sequence {Q^t(αi)}\{\widehat{Q}_{t}(\alpha_{i})\} at each tt to compute h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} in four different cases:

Case 1 [No Error]:Q^t(αi)=Qt(αi);\displaystyle\mbox{Case 1 [No Error]:}\,\,\widehat{Q}_{t}(\alpha_{i})=Q_{t}(\alpha_{i}); (5.2)
Case 2 [Error I]:Q^t(αi)=Qt(αi)+εt,i with εt,iN(0,σ2(αi));\displaystyle\mbox{Case 2 [Error I]:}\,\,\,\,\,\,\,\widehat{Q}_{t}(\alpha_{i})=Q_{t}(\alpha_{i})+\varepsilon_{t,i}^{\circ}\mbox{ with }\varepsilon_{t,i}^{\circ}\sim N(0,\sigma^{2}(\alpha_{i}));
Case 3 [Error II]:Q^t(αi)=Qt(αi)+εt,i with εt,iN(μ(αi),σ2(αi));\displaystyle\mbox{Case 3 [Error II]:}\,\,\,\,\,\widehat{Q}_{t}(\alpha_{i})=Q_{t}(\alpha_{i})+\varepsilon_{t,i}^{\circ}\mbox{ with }\varepsilon_{t,i}^{\circ}\sim N(\mu(\alpha_{i}),\sigma^{2}(\alpha_{i}));
Case 4 [CAViaR]:Q^t(αi) is the entry of 𝑺t,\displaystyle\mbox{Case 4 [CAViaR]:}\,\,\,\,\widehat{Q}_{t}(\alpha_{i})\mbox{ is the entry of }\bm{S}_{t},

where αi𝜶\alpha_{i}\in\bm{\alpha} in Cases 1–3, 𝜶\bm{\alpha} and 𝑺t\bm{S}_{t} are defined as in Procedure 4.1, Qt(α)Q_{t}(\alpha) is the true conditional quantile of yty_{t} at level α\alpha, σ2(α)=0.5σ2+|α0.5|σ2\sigma^{2}(\alpha)=0.5\sigma^{2}+|\alpha-0.5|\sigma^{2} with σ2=0.2\sigma^{2}=0.2, and μ(α)=exp(200α)I(α<0.5)+exp((2200α))I(α0.5)\mu(\alpha)=\exp(-200\alpha)\mbox{I}(\alpha<0.5)+\exp(-(2-200\alpha))\mbox{I}(\alpha\geq 0.5). Under the setting in Case 1, there are no approximation errors εt,i\varepsilon_{t,i}^{\circ} in regression model (2.6). Under the settings in Cases 2 and 3, the approximation errors εt,i\varepsilon_{t,i}^{\circ} have different variances across αi\alpha_{i}, with zero means (Case 2) or non-zero means (Case 3), mimicking the scenarios that Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) is an unbiased or biased estimator of Qt(αi)Q_{t}(\alpha_{i}), respectively. Under the setting in Case 4, Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) are computed by the CAViaR models, and this case mimics the real application scenario in which the dynamic structures of Qt(αi)Q_{t}(\alpha_{i}) are unknown and modelled by the CAViaR models.

Using the values of {Q^t(αi)}\{\widehat{Q}_{t}(\alpha_{i})\} generated by (5.2), we compute three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} for each replication, and then we measure their precision at each tt by considering

Δh,t=h^tht,Δs,t=s^tst,Δk,t=k^tkt,\displaystyle\Delta_{h,t}=\widehat{h}_{t}-h_{t},\,\,\,\Delta_{s,t}=\widehat{s}_{t}-s_{t},\,\,\,\Delta_{k,t}=\widehat{k}_{t}-k_{t}, (5.3)

where hth_{t}, sts_{t}, and ktk_{t} are the true values of the three CMs of yty_{t}. Based on the results of 100 replications, Figs 1 and 2 exhibit the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} for t=1,,10t=1,...,10 under the four different cases in (5.2), with ηtN(0,1)\eta_{t}\sim N(0,1) and STνtST_{\nu_{t}}, respectively. The corresponding boxplots for t11t\geq 11 are similar and are thus not reported here for better visibility. From these two figures, our findings are as follows:

  1. 1.

    When there are no approximation errors (Case 1), h^t\widehat{h}_{t} and s^t\widehat{s}_{t} are very accurate estimators of hth_{t} and sts_{t}, regardless of the negligibility of the expansion errors. However, k^t\widehat{k}_{t} exhibits a large dispersion when the expansion errors are non-negligible in the case of ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, even though it has a very small dispersion when the expansion errors are negligible in the case of ηtN(0,1)\eta_{t}\sim N(0,1). This indicates that the expansion errors typically have a minor effect on h^t\widehat{h}_{t} and s^t\widehat{s}_{t}, while their impact on k^t\widehat{k}_{t} can be more pronounced.

  2. 2.

    When there are approximation errors with zero means (Case 2) or nonzero means (Case 3), the median lines in the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} are generally close to zero, irrespective of the negligibility of the expansion errors. These results suggest that the three QCMs remain consistent, even when Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) are biased estimators of Qt(αi)Q_{t}(\alpha_{i}) and the expansion errors are present. Compared with the results in the case of ηtN(0,1)\eta_{t}\sim N(0,1), the dispersion of all QCMs becomes larger, as expected, in the case of ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, and this phenomenon is more evident for k^t\widehat{k}_{t}.

  3. 3.

    When Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) are estimated by the CAViaR method (Case 4), the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} show that the three QCMs are consistent across all considered situations. Surprisingly, when ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, the performance of h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} in Case 4 is even better than that in Cases 2 and 3. This is probably because the approximation errors and expansion errors partially cancel out in Case 4, leading to smaller gross errors in regression model (2.6) and hence more accurate QCMs. Therefore, the QCM method can effectively accommodate the co-existence of approximation errors and expansion errors, which is frequently encountered in real data analysis.

Refer to caption
Figure 1: The boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} for t=1,,10t=1,...,10 under the four different cases in (5.2), where the data are generated from the standard GARCH model in (5.1) with ηtN(0,1)\eta_{t}\sim N(0,1). In each boxplot, the lines from top to bottom represent the maximum, third quartile, median, first quartile, and minimum of the data, and the outliers are plotted individually using the ‘o’ marker symbol.
Refer to caption
Figure 2: As for Fig 1, where the data are generated from the standard GARCH model in (5.1) with ηtSTνt\eta_{t}\sim ST_{\nu_{t}}.

5.2 Simulation on the ARMA–MN–GARCH model

Let MN(λ1,λ2,τ1,τ2,σ12,σ22)MN(\lambda_{1},\lambda_{2},\tau_{1},\tau_{2},\sigma_{1}^{2},\sigma_{2}^{2}) denote a mixed normal (MN) distribution, the density of which is a mixture of two normal densities of N(τ1,σ12)N(\tau_{1},\sigma_{1}^{2}) and N(τ2,σ22)N(\tau_{2},\sigma_{2}^{2}) with the weighting probabilities λ1\lambda_{1} and λ2\lambda_{2}, respectively, where λi(0,1)\lambda_{i}\in(0,1), i=1,2i=1,2, and λ1+λ2=1\lambda_{1}+\lambda_{2}=1. To examine the performance of QCMs in the presence of conditional mean specification and time-varying CMs, we generate 100 replications of sample size T=1000T=1000 from the following ARMA–MN–GARCH model (Haas et al., 2004)

yt=a0+a1yt1+ϵt+b1ϵt1,y_{t}=a_{0}+a_{1}y_{t-1}+\epsilon_{t}+b_{1}\epsilon_{t-1}, (5.4)

where ϵtMN(λ1,λ2,τ1,τ2,σ1,t2,σ2,t2)\epsilon_{t}\sim MN(\lambda_{1},\lambda_{2},\tau_{1},\tau_{2},\sigma_{1,t}^{2},\sigma_{2,t}^{2}) with τ2=(λ1/λ2)τ1\tau_{2}=-(\lambda_{1}/\lambda_{2})\tau_{1}, σ1,t2=c10+c11ϵt12+c12σ1,t12\sigma_{1,t}^{2}=c_{10}+c_{11}\epsilon_{t-1}^{2}+c_{12}\sigma_{1,t-1}^{2}, and σ2,t2=c20+c21ϵt12+c22σ2,t12\sigma_{2,t}^{2}=c_{20}+c_{21}\epsilon_{t-1}^{2}+c_{22}\sigma_{2,t-1}^{2}, and the parameters are chosen as λ1=0.2\lambda_{1}=0.2, τ1=0.4\tau_{1}=0.4, a0=0.5a_{0}=0.5, a1=0.4a_{1}=0.4, b1=0.3b_{1}=-0.3, c10=0.1c_{10}=0.1, c20=0.3c_{20}=0.3, c11=0.05c_{11}=0.05, c21=0.1c_{21}=0.1, c12=0.85c_{12}=0.85, and c22=0.8c_{22}=0.8. For each replication, we compute the true values of CMs and conditional quantiles of yty_{t} in model (5.4) as follows:

μt\displaystyle\mu_{t} =a0+a1yt1+b1ϵt1,\displaystyle=a_{0}+a_{1}y_{t-1}+b_{1}\epsilon_{t-1},
ht\displaystyle h_{t} =λ1(τ12+σ1,t2)+λ2(τ22+σ2,t2)(λ1τ1+λ2τ2)2,\displaystyle=\lambda_{1}(\tau_{1}^{2}+\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{2}+\sigma_{2,t}^{2})-(\lambda_{1}\tau_{1}+\lambda_{2}\tau_{2})^{2},
st\displaystyle s_{t} =λ1(τ13+3τ1σ1,t2)+λ2(τ23+3τ2σ2,t2)[λ1(τ12+σ1,t2)+λ2(τ22+σ2,t2)]3/2,\displaystyle=\frac{\lambda_{1}(\tau_{1}^{3}+3\tau_{1}\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{3}+3\tau_{2}\sigma_{2,t}^{2})}{[\lambda_{1}(\tau_{1}^{2}+\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{2}+\sigma_{2,t}^{2})]^{3/2}},
kt\displaystyle k_{t} =λ1(τ14+6τ12σ1,t2+3σ1,t4)+λ2(τ24+6τ22σ2,t2+3σ2,t4)[λ1(τ12+σ1,t2)+λ2(τ22+σ2,t2)]2,\displaystyle=\frac{\lambda_{1}(\tau_{1}^{4}+6\tau_{1}^{2}\sigma_{1,t}^{2}+3\sigma_{1,t}^{4})+\lambda_{2}(\tau_{2}^{4}+6\tau_{2}^{2}\sigma_{2,t}^{2}+3\sigma_{2,t}^{4})}{[\lambda_{1}(\tau_{1}^{2}+\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{2}+\sigma_{2,t}^{2})]^{2}},

and

Qt(α)=μt+Qtϵ(α),Q_{t}(\alpha)=\mu_{t}+Q_{t}^{\epsilon}(\alpha),

where Qtϵ(α)Q_{t}^{\epsilon}(\alpha) satisfies λ1Φ(Qtϵ(α);τ1,σ1,t2)+λ2Φ(Qtϵ(α);τ2,σ2,t2)=α\lambda_{1}\Phi(Q_{t}^{\epsilon}(\alpha);\tau_{1},\sigma_{1,t}^{2})+\lambda_{2}\Phi(Q_{t}^{\epsilon}(\alpha);\tau_{2},\sigma_{2,t}^{2})=\alpha, and Φ(x;τ,σ2)\Phi(x;\tau,\sigma^{2}) represents the normal distribution function with mean τ\tau and variance σ2\sigma^{2}.
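Since Qtϵ(α)Q_{t}^{\epsilon}(\alpha) has no closed form, it can be computed by one-dimensional root finding; the sketch below uses Brent's method with a bracketing interval that is our own assumption.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def mn_quantile(a, lam1, tau1, s1, s2):
    """Solve lam1*Phi(q; tau1, s1^2) + lam2*Phi(q; tau2, s2^2) = a for q,
    with lam2 = 1 - lam1 and tau2 = -(lam1/lam2)*tau1 (zero-mean mixture)."""
    lam2 = 1.0 - lam1
    tau2 = -(lam1 / lam2) * tau1
    f = lambda q: lam1 * norm.cdf(q, tau1, s1) + lam2 * norm.cdf(q, tau2, s2) - a
    return brentq(f, -50.0, 50.0)       # bracket assumed wide enough

q_med = mn_quantile(0.5, lam1=0.2, tau1=0.4, s1=1.0, s2=1.0)
```

Since the mixture CDF is strictly increasing, the root is unique whenever the bracket contains it.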

Based on the results of 100 replications, Fig 3 exhibits the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} for t=1,,10t=1,...,10 under the four different cases in (5.2). From this figure, we reach conclusions similar to those in Fig 1. Hence, although yty_{t} has a heavier tail than normal under model (5.4), the presence of a conditional mean specification does not affect the performance of the QCMs.

Refer to caption
Figure 3: As for Fig 1, where the data are generated from the ARMA-MN-GARCH model in (5.4).

6 A Real Application

In our empirical work, we consider the log-return (in percentage) series of three exchange rates: the AUD to USD (AUD/USD), NZD to USD (NZD/USD), and CAD to USD (CAD/USD). We denote each log-return series by {y1,,yT}\{y_{1},...,y_{T}\}, computed over the period from January 1, 2009 to April 20, 2023. See Table 1 for some basic descriptive statistics of the three log-return series. Below, we compute the three QCMs of each log-return series, and then use these QCMs to study the “news impact curve” (NIC).

Table 1: Descriptive statistics for three return series.
          AUD/USD           NZD/USD           CAD/USD
          Sample size           3730           3730           3730
          Sample mean           -0.0012           0.0015           -0.0027
          Sample variance           0.4374           0.4612           0.2056
          Sample skewness           -0.4160           -0.4485           -0.0533
          Sample kurtosis           7.0748           7.1818           5.3841

6.1 The three QCMs of return series

Following the steps in Procedure 4.1, we compute the three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} of each log-return series, and report their basic descriptive statistics in Table 2, where the constraint (2.11) holds for all of the computed QCMs. From Tables 1 and 2, we find that for each return series, the mean of h^t\widehat{h}_{t} (or s^t\widehat{s}_{t}) is close to the corresponding sample variance (or skewness), whereas the mean of k^t\widehat{k}_{t} is much smaller than the corresponding sample kurtosis. These findings are expected, since extreme returns can affect the sample kurtosis for a prolonged period of time, but their impact on k^t\widehat{k}_{t} decays exponentially over time.

Table 2: Descriptive statistics for the three QCMs of three return series.
         AUD/USD          NZD/USD          CAD/USD
         h^t\widehat{h}_{t}          Mean          0.5400          0.5821          0.2951
         Maximum          3.0661          2.7224          1.2098
         Minimum          0.1380          0.1537          0.0784
         Ljung-Box          0.0000          0.0000          0.0000
         s^t\widehat{s}_{t}          Mean          -0.2051          -0.1274          -0.1149
         Maximum          0.0463          0.1251          0.1230
         Minimum          -0.5176          -0.4867          -0.3951
         Ljung-Box          0.0000          0.0000          0.0000
         k^t\widehat{k}_{t}          Mean          3.7057          3.5129          3.6003
         Maximum          5.0147          4.7127          4.6095
         Minimum          2.8841          2.6247          2.8776
         Ljung-Box          0.0000          0.0000          0.0000
  • \dagger The results are the p-values of the Ljung-Box test (Ljung and Box, 1978).

Next, we check the validity of the QCMs via a method similar to that in Gu et al. (2020). Denote ah=E(eth)a^{h}=E(e_{t}^{h}), as=E(ets)a^{s}=E(e_{t}^{s}), and ak=E(etk)a^{k}=E(e_{t}^{k}), where eth=(ytμt)2hte_{t}^{h}=(y_{t}-\mu_{t})^{2}-h_{t}, ets=[(ytμt)/ht]3ste_{t}^{s}=[(y_{t}-\mu_{t})/\sqrt{h_{t}}]^{3}-s_{t}, and etk=[(ytμt)/ht]4kte_{t}^{k}=[(y_{t}-\mu_{t})/\sqrt{h_{t}}]^{4}-k_{t}. Based on the estimates e^th=(ytμ^t)2h^t\widehat{e}_{t}^{h}=(y_{t}-\widehat{\mu}_{t})^{2}-\widehat{h}_{t}, e^ts=[(ytμ^t)/h^t]3s^t\widehat{e}_{t}^{s}=[(y_{t}-\widehat{\mu}_{t})/\sqrt{\widehat{h}_{t}}]^{3}-\widehat{s}_{t}, and e^tk=[(ytμ^t)/h^t]4k^t\widehat{e}_{t}^{k}=[(y_{t}-\widehat{\mu}_{t})/\sqrt{\widehat{h}_{t}}]^{4}-\widehat{k}_{t}, we utilize Student’s t-tests 𝕋h\mathbb{T}^{h}, 𝕋s\mathbb{T}^{s}, and 𝕋k\mathbb{T}^{k} to test the null hypotheses h\mathbb{H}^{h}: ah=0a^{h}=0, s\mathbb{H}^{s}: as=0a^{s}=0, and k\mathbb{H}^{k}: ak=0a^{k}=0, respectively. Here, μ^t\widehat{\mu}_{t} is the estimate of the conditional mean, computed based on the mean specifications in Section 6.2 below. If h\mathbb{H}^{h} is not rejected by 𝕋h\mathbb{T}^{h} at the significance level α\alpha^{*}, then it is reasonable to conclude that h^t\widehat{h}_{t} is valid. Similarly, the validity of s^t\widehat{s}_{t} and k^t\widehat{k}_{t} can be examined by using 𝕋s\mathbb{T}^{s} and 𝕋k\mathbb{T}^{k}. Table 3 reports the p-values of 𝕋h\mathbb{T}^{h}, 𝕋s\mathbb{T}^{s}, and 𝕋k\mathbb{T}^{k} for all three exchange rates, and the results imply that all QCMs are valid at the significance level 5%5\%.

Table 3: The p-values of 𝕋h\mathbb{T}^{h}, 𝕋s\mathbb{T}^{s}, and 𝕋k\mathbb{T}^{k} for checking the validity of QCMs.
            AUD/USD             NZD/USD             CAD/USD
            𝕋h\mathbb{T}^{h}             0.4543             0.6207             0.1306
            𝕋s\mathbb{T}^{s}             0.3250             0.2634             0.3901
            𝕋k\mathbb{T}^{k}             0.8321             0.3900             0.6425
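A minimal sketch of this validity check is given below; for simplicity, it uses plain one-sample t-tests that ignore serial correlation in the moment residuals, which is a simplification of the procedure above.

```python
import numpy as np
from scipy.stats import ttest_1samp

def validity_pvalues(y, mu, h, s, k):
    """p-values of t-tests for H^h: a^h = 0, H^s: a^s = 0, and H^k: a^k = 0,
    based on the moment residuals e^h, e^s, and e^k."""
    z = (y - mu) / np.sqrt(h)
    e_h = (y - mu) ** 2 - h
    e_s = z**3 - s
    e_k = z**4 - k
    return tuple(ttest_1samp(e, 0.0).pvalue for e in (e_h, e_s, e_k))

# under correctly specified moments, none of the nulls should be rejected often
rng = np.random.default_rng(0)
y_sim = rng.standard_normal(2000)
p_h, p_s, p_k = validity_pvalues(y_sim, mu=0.0, h=1.0, s=0.0, k=3.0)
```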

After checking the validity of the QCMs, we further plot the QCMs of all three return series during a sub-period from January 1, 2020 to May 1, 2020 in Fig 4. This sub-period deserves a detailed study, since it covers the 2020 stock market crash caused by the COVID-19 pandemic. For ease of visualization, the plots of all computed QCMs over the entire examined period are not given here but are available upon request. From Fig 4, we have the following interesting findings:

  1. 1.

    Starting from March 9, there is a rapidly rising trend for both h^t\widehat{h}_{t} and k^t\widehat{k}_{t} and an apparent declining trend for s^t\widehat{s}_{t} in all examined exchange rates. These trends demonstrate that the volatility risk kept rising sharply in currency markets, and simultaneously, the tail risk of extremely negative returns kept increasing substantially, making things even worse. This phenomenon is not surprising, given the global impact of the COVID-19 pandemic outbreak in early 2020, which led to the subsequent depreciation of exchange rates across almost all countries.

  2. 2.

    After March 19 or March 20, all of h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} reversed their trends. This is very likely because the Federal Reserve and many central banks announced their financial rescue plans on March 17. Hence, the trend reversal sheds light on the effectiveness of the issued financial plans in rescuing financial markets.

Overall, we find that the COVID-19 pandemic had a perilous impact on the three examined exchange rates, and the financial rescue plans were effective in reducing the values of conditional variance and kurtosis while increasing the value of conditional skewness.

Refer to caption
Figure 4: The plots of h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} of three return series from January 1, 2020 to May 1, 2020.

6.2 The study on NICs

The NIC initiated by Engle and Ng (1993) aims to study how past shocks (or news) {ϵi}it1\{\epsilon_{i}\}_{i\leq t-1} affect the present conditional variance hth_{t} by assuming

ht=θhht1+gh(ϵt1),h_{t}=\theta_{h}h_{t-1}+g_{h}(\epsilon_{t-1}), (6.1)

where ϵtytμt\epsilon_{t}\equiv y_{t}-\mu_{t} is the collective shock at tt, θh(0,1)\theta_{h}\in(0,1) is an unknown parameter to measure the persistence of hth_{t}, and gh()g_{h}(\cdot) is the NIC function for hth_{t} that has a specific parametric form. For example, researchers commonly assume that

gh(x)\displaystyle g_{h}(x) =ϑh,0+ϑh,1x2,\displaystyle=\vartheta_{h,0}+\vartheta_{h,1}x^{2}, (6.2)
gh(x)\displaystyle g_{h}(x) =ϑh,0+ϑh,1x2+ϑh,2x2I(x<0),\displaystyle=\vartheta_{h,0}+\vartheta_{h,1}x^{2}+\vartheta_{h,2}x^{2}\mbox{I}(x<0), (6.3)

where the specifications of gh()g_{h}(\cdot) in (6.2) and (6.3) lead to the standard GARCH (Bollerslev, 1986) and GJR-GARCH (Glosten et al., 1993) models for hth_{t}, respectively. Similar to the NIC for hth_{t} in (6.1), we can follow the ideas of Harvey and Siddique (1999) and León et al. (2005) to consider the NICs for sts_{t} and ktk_{t}:

st=θsst1+gs(ϱt1),\displaystyle s_{t}=\theta_{s}s_{t-1}+g_{s}(\varrho_{t-1}), (6.4)
kt=θkkt1+gk(ϱt1),\displaystyle k_{t}=\theta_{k}k_{t-1}+g_{k}(\varrho_{t-1}), (6.5)

where ϱtϵt/ht\varrho_{t}\equiv\epsilon_{t}/\sqrt{h_{t}} is the re-scaled collective shock at tt, θs(1,1)\theta_{s}\in(-1,1) and θk(0,1)\theta_{k}\in(0,1) are two unknown parameters to measure the persistence of sts_{t} and ktk_{t}, respectively, and gs()g_{s}(\cdot) and gk()g_{k}(\cdot) are the NIC functions for sts_{t} and ktk_{t}, respectively. As for gh()g_{h}(\cdot), gs()g_{s}(\cdot) and gk()g_{k}(\cdot) are often assumed to have certain parametric forms, such as

gs(x)\displaystyle g_{s}(x) =ϑs,0+ϑs,1x3,\displaystyle=\vartheta_{s,0}+\vartheta_{s,1}x^{3}, (6.6)
gk(x)\displaystyle g_{k}(x) =ϑk,0+ϑk,1x4;\displaystyle=\vartheta_{k,0}+\vartheta_{k,1}x^{4}; (6.7)

see, for example, Harvey and Siddique (1999) and León et al. (2005). Since hth_{t}, sts_{t}, ktk_{t}, ϵt\epsilon_{t}, and ϱt\varrho_{t} are generally unobserved, all of the unknown parameters in (6.1) and (6.4)–(6.5) have to be estimated by specifying some parametric models on yty_{t} that account for the conditional variance, skewness, and kurtosis simultaneously.

However, so far the parametric forms of gh()g_{h}(\cdot), gs()g_{s}(\cdot), and gk()g_{k}(\cdot) are chosen in an ad-hoc rather than a data-driven manner. Intuitively, if gh()g_{h}(\cdot), gs()g_{s}(\cdot), and gk()g_{k}(\cdot) can be estimated non-parametrically, we are able to get some useful information on their parametric forms. Motivated by this idea, we replace hth_{t}, sts_{t}, ktk_{t}, ϵt1\epsilon_{t-1}, and ϱt1\varrho_{t-1} in (6.1) and (6.4)–(6.5) with h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, k^t\widehat{k}_{t}, ϵ^t1\widehat{\epsilon}_{t-1}, and ϱ^t1\widehat{\varrho}_{t-1}, respectively, where ϵ^t=ytμ^t\widehat{\epsilon}_{t}=y_{t}-\widehat{\mu}_{t} and ϱ^t=ϵ^t/h^t\widehat{\varrho}_{t}=\widehat{\epsilon}_{t}/\sqrt{\widehat{h}_{t}} with μ^t\widehat{\mu}_{t} being an estimator of μt\mu_{t}. After this replacement, we can get the following models:

h^t\displaystyle\widehat{h}_{t} =θhh^t1+gh(ϵ^t1)+ςh,t,\displaystyle=\theta_{h}\widehat{h}_{t-1}+g_{h}(\widehat{\epsilon}_{t-1})+\varsigma_{h,t}, (6.8)
s^t\displaystyle\widehat{s}_{t} =θss^t1+gs(ϱ^t1)+ςs,t,\displaystyle=\theta_{s}\widehat{s}_{t-1}+g_{s}(\widehat{\varrho}_{t-1})+\varsigma_{s,t}, (6.9)
k^t\displaystyle\widehat{k}_{t} =θkk^t1+gk(ϱ^t1)+ςk,t,\displaystyle=\theta_{k}\widehat{k}_{t-1}+g_{k}(\widehat{\varrho}_{t-1})+\varsigma_{k,t}, (6.10)

where ςh,t\varsigma_{h,t}, ςs,t\varsigma_{s,t}, and ςk,t\varsigma_{k,t} are model errors caused by the replacement. Since the QCMs are most likely consistent estimators of CMs, all model errors are expected to have desirable properties for valid model estimations when the parametric form of μt\mu_{t} is correctly specified.

To obtain a correct specification of μt\mu_{t}, we apply two Cramér-von Mises tests DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} in Escanciano (2006) to check whether the assumed form of μt\mu_{t} is correctly specified, where the p-values of DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} are computed via the bootstrap method in Escanciano (2006). Since strong autocorrelations are detected in the three return series, we adopt an order pp threshold autoregressive (TAR(pp)) model (Tong, 1978), with the threshold variable set to zero and the delay set to one, to fit these three return series. After removing insignificant parameters, the AUD/USD, NZD/USD, and CAD/USD exchange rates are fitted by the TAR(5), TAR(9), and TAR(6) models, respectively. The p-values of DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} in Table 4 indicate that these TAR models are correctly specified for the three return series at the significance level 5%.

Table 4: The p-values of DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} for checking the conditional mean specification.
            AUD/USD             NZD/USD             CAD/USD
            DT,I2D_{T,I}^{2}             0.7700             0.5400             0.4200
            DT,C2D_{T,C}^{2}             0.7200             0.4500             0.4100

After estimating the chosen specifications of μt\mu_{t} above by the least squares method, we are able to obtain ϵ^t1\widehat{\epsilon}_{t-1} and ϱ^t1\widehat{\varrho}_{t-1}. Define Kb()=K(/b)/bK_{b}(\cdot)=K(\cdot/b)/b, where K()K(\cdot) is the Gaussian kernel function and bb is the bandwidth. Then, based on the sample sequence {(h^t,h^t1,ϵ^t1)}t=2T\{(\widehat{h}_{t},\widehat{h}_{t-1},\widehat{\epsilon}_{t-1})\}_{t=2}^{T}, we use the method in Robinson (1988) to estimate θh\theta_{h} by

θ^h={t=2T[h^t1ϕ1(ϵ^t1)]2}1{t=2T[h^t1ϕ1(ϵ^t1)][h^tϕ2(ϵ^t1)]},\widehat{\theta}_{h}=\Big{\{}\sum_{t=2}^{T}[\widehat{h}_{t-1}-\phi_{1}(\widehat{\epsilon}_{t-1})]^{2}\Big{\}}^{-1}\Big{\{}\sum_{t=2}^{T}[\widehat{h}_{t-1}-\phi_{1}(\widehat{\epsilon}_{t-1})][\widehat{h}_{t}-\phi_{2}(\widehat{\epsilon}_{t-1})]\Big{\}},

where

ϕ1()=s=2TKb1(ϵ^s1)h^s1s=2TKb1(ϵ^s1) and ϕ2()=s=2TKb2(ϵ^s1)h^ss=2TKb2(ϵ^s1),\phi_{1}(\cdot)=\frac{\sum_{s=2}^{T}K_{b_{1}}(\cdot-\widehat{\epsilon}_{s-1})\widehat{h}_{s-1}}{\sum_{s=2}^{T}K_{b_{1}}(\cdot-\widehat{\epsilon}_{s-1})}\mbox{ and }\phi_{2}(\cdot)=\frac{\sum_{s=2}^{T}K_{b_{2}}(\cdot-\widehat{\epsilon}_{s-1})\widehat{h}_{s}}{\sum_{s=2}^{T}K_{b_{2}}(\cdot-\widehat{\epsilon}_{s-1})},

and the values of b1b_{1} and b2b_{2} are chosen by the conventional cross-validation method. Next, we estimate gh()g_{h}(\cdot) non-parametrically by g^h()=s=2TKb3(ϵ^s1)Rh,s/s=2TKb3(ϵ^s1)\widehat{g}_{h}(\cdot)=\sum_{s=2}^{T}K_{b_{3}}(\cdot-\widehat{\epsilon}_{s-1})R_{h,s}/\sum_{s=2}^{T}K_{b_{3}}(\cdot-\widehat{\epsilon}_{s-1}), where Rh,t=h^tθ^hh^t1R_{h,t}=\widehat{h}_{t}-\widehat{\theta}_{h}\widehat{h}_{t-1}, and the value of b3b_{3} is chosen by the cross-validation method. Similarly, based on the sample sequences {(s^t,s^t1,ϱ^t1)}t=2T\{(\widehat{s}_{t},\widehat{s}_{t-1},\widehat{\varrho}_{t-1})\}_{t=2}^{T} and {(k^t,k^t1,ϱ^t1)}t=2T\{(\widehat{k}_{t},\widehat{k}_{t-1},\widehat{\varrho}_{t-1})\}_{t=2}^{T}, we estimate gs()g_{s}(\cdot) and gk()g_{k}(\cdot) non-parametrically by g^s()\widehat{g}_{s}(\cdot) and g^k()\widehat{g}_{k}(\cdot), respectively.
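A minimal sketch of this estimation step is given below, using fixed hand-picked bandwidths in place of cross-validation and a toy data-generating process with θh=0.9\theta_{h}=0.9 for checking; these choices are our own.

```python
import numpy as np

def nw_smooth(x0, x, y, b):
    """Nadaraya-Watson estimate of E[y | x] at the points x0 with a Gaussian
    kernel (the 1/b normalization cancels in the weight ratio)."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / b) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def robinson_theta(h, eps, b1, b2):
    """theta_hat = sum[(h_{t-1}-phi1)(h_t-phi2)] / sum[(h_{t-1}-phi1)^2],
    with phi1, phi2 the kernel regressions defined in the text."""
    h_lead, h_lag, e_lag = h[1:], h[:-1], eps[:-1]
    phi1 = nw_smooth(e_lag, e_lag, h_lag, b1)    # E[h_{t-1} | eps_{t-1}]
    phi2 = nw_smooth(e_lag, e_lag, h_lead, b2)   # E[h_t | eps_{t-1}]
    u = h_lag - phi1
    return np.sum(u * (h_lead - phi2)) / np.sum(u ** 2)

# toy check: h_t = 0.9*h_{t-1} + eps_{t-1}^2 + small noise, so theta_h = 0.9
rng = np.random.default_rng(0)
T = 1500
eps = rng.standard_normal(T)
h = np.empty(T)
h[0] = 10.0                                      # stationary mean E[g]/(1-theta)
for t in range(1, T):
    h[t] = 0.9 * h[t - 1] + eps[t - 1] ** 2 + 0.1 * rng.standard_normal()
theta_hat = robinson_theta(h, eps, b1=0.3, b2=0.3)
```

The same nw_smooth routine also yields g^h()\widehat{g}_{h}(\cdot) by smoothing Rh,t=h^tθ^hh^t1R_{h,t}=\widehat{h}_{t}-\widehat{\theta}_{h}\widehat{h}_{t-1} on ϵ^t1\widehat{\epsilon}_{t-1}.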

Fig 5 plots the non-parametric fitted models g^h()\widehat{g}_{h}(\cdot), g^s()\widehat{g}_{s}(\cdot), and g^k()\widehat{g}_{k}(\cdot) for all three return series. From this figure, we find that the form of gh()g_{h}(\cdot) in (6.2) or (6.3) matches g^h()\widehat{g}_{h}(\cdot) quite well, whereas the forms of gs()g_{s}(\cdot) in (6.6) and gk()g_{k}(\cdot) in (6.7) exhibit a large deviation from g^s()\widehat{g}_{s}(\cdot) and g^k()\widehat{g}_{k}(\cdot), respectively, in all three cases. The same conclusion can be reached in view of the results of adjusted R2 for all fitted models in Table 5.

Refer to caption
Figure 5: The plots of all fitted NICs for hth_{t}, sts_{t}, and ktk_{t}. Left panels: the non-parametric g^h()\widehat{g}_{h}(\cdot) (dashed lines); the parametric gh()g_{h}(\cdot) in (6.2) (dotted lines) and (6.3) (solid lines). Middle panels: the non-parametric g^s()\widehat{g}_{s}(\cdot) (dashed lines); the parametric gs()g_{s}(\cdot) in (6.6) (dotted lines). Right panels: the non-parametric g^k()\widehat{g}_{k}(\cdot) (dashed lines); the parametric gk()g_{k}(\cdot) in (6.7) (dotted lines).
Table 5: The values of adjusted $R^{2}$ for the fitted models (6.8)–(6.10).

                                 AUD/USD    NZD/USD    CAD/USD
  Panel A: Model (6.8)
  $g_{h}(\cdot)\sim$ (6.2)       0.9215     0.9270     0.9551
  $g_{h}(\cdot)\sim$ (6.3)       0.9363     0.9357     0.9628
  Panel B: Model (6.9)
  $g_{s}(\cdot)\sim$ (6.6)       0.0920     0.1316     0.4754
  Panel C: Model (6.10)
  $g_{k}(\cdot)\sim$ (6.7)       0.1416     0.3008     0.2702
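The adjusted $R^{2}$ reported in Table 5 follows the usual definition $\bar{R}^{2}=1-(1-R^{2})(T-1)/(T-p-1)$ for a model with $p$ regressors (plus an intercept) fitted to $T$ observations. A minimal sketch, with illustrative variable names:

```python
import numpy as np

def adjusted_r2(y, yhat, p):
    """Adjusted R-squared for a fit with p regressors plus an intercept."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    T = len(y)
    rss = np.sum((y - yhat) ** 2)          # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
    r2 = 1.0 - rss / tss
    return 1.0 - (1.0 - r2) * (T - 1) / (T - p - 1)
```

Unlike the plain $R^{2}$, this penalizes extra regressors, which keeps the comparison between the one-parameter skewness/kurtosis NICs and the volatility NICs on an even footing.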

7 Concluding Remarks

This paper estimates the three CMs (with respect to variance, skewness, and kurtosis) by the corresponding QCMs, which are easily computed from the OLS estimator of a linear regression model constructed from the ECQs. The QCM method builds on the Cornish-Fisher expansion, which essentially transforms the estimation of CMs into the estimation of conditional quantiles. This transformation brings two attractive advantages over the parametric GARCH-type methods. First, owing to its regression-based nature, the QCM method bypasses estimation of the conditional mean and allows for mis-specified conditional quantile models. Second, the QCM method yields stable estimation results, since it involves no complex nonlinear constraints on the admissible region of parameters in conditional quantile models. These two advantages come with two limitations. The first is that the conditional quantile estimation must be carried out nn different times; however, this poses neither a computational burden nor a theoretical obstacle. The second is that when the data are more heavy-tailed, the CF expansion inevitably becomes less accurate, leading to a larger, non-negligible expansion error. Although this limitation does not affect the consistency of the QCMs in general, it gives the QCMs (especially the quantiled conditional kurtosis) a larger dispersion for more heavy-tailed data.
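As a schematic illustration of the QCM step summarized above, consider the standard fourth-order Cornish-Fisher form, under which the conditional quantile at level $\tau$ is linear in the basis $(1,\,z_{\tau},\,z_{\tau}^{2}-1,\,z_{\tau}^{3}-3z_{\tau})$, with $z_{\tau}$ the standard normal quantile; OLS on the $n$ ECQs then delivers the three QCMs, and the intercept absorbs the conditional mean. The coefficient-to-moment mapping below is an assumption based on this textbook parameterization and may differ in detail from the paper's exact regression design:

```python
import numpy as np
from statistics import NormalDist

def qcm_ols(taus, ecqs):
    """Recover (variance, skewness, kurtosis) by OLS on a Cornish-Fisher
    regression of n estimated conditional quantiles (ECQs)."""
    z = np.array([NormalDist().inv_cdf(t) for t in taus])
    X = np.column_stack([np.ones_like(z), z, z**2 - 1.0, z**3 - 3.0 * z])
    b0, b1, b2, b3 = np.linalg.lstsq(X, np.asarray(ecqs, float), rcond=None)[0]
    # b0 absorbs the conditional mean, so no prior mean estimate is needed
    return b1**2, 6.0 * b2 / b1, 24.0 * b3 / b1 + 3.0
```

For Gaussian ECQs $Q(\tau)=\mu+\sigma z_{\tau}$, the fit is exact and returns variance $\sigma^{2}$, skewness 0, and kurtosis 3, regardless of $\mu$.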

Notably, the QCM method is a supervised learning procedure that assumes no distribution on the returns. This supervised learning feature gives the QCM method a substantial computational advantage over the GARCH-type methods for estimating conditional variance, skewness, and kurtosis when the dimension of returns is large. See Zhu et al. (2023) for an in-depth discussion in this context and an innovative method for big portfolio selection based on the conditional higher moments learned by the QCM method.

Finally, we should mention that the existing parametric methods typically work only for stationary data, and their extension to more complex data environments appears challenging in terms of both methodology and computation. In contrast, the QCM method is applicable in complex data environments as long as the ECQs are suitably provided. For example, the QCMs can adapt to mixed categorical and continuous data or to locally stationary data when the ECQs are computed by the method in Li and Racine (2008) or Zhou and Wu (2009), respectively. In addition, useful information from exogenous variables and from conditional quantiles of other variables can easily be embedded into the QCMs through the ECQs, as done in Härdle et al. (2016) and Tobias and Brunnermeier (2016). Since the QCMs are computed at each fixed timepoint, the QCM method also allows us to focus on the CMs during a specific time period by employing the methods in Cai (2002) and Xu (2013) to compute the ECQs. On the whole, the QCM method exhibits a much wider application scope than the parametric ones, which so far have not offered a clear and feasible way to study the CMs under the above complex data environments.

References

  • Andrews (1988) Andrews, D. W. K. (1988). Laws of large numbers for dependent nonidentically distributed random variables. Econometric Theory 4, 458–467.
  • Bali et al. (2008) Bali, T. G., Mo, H. and Tang, Y. (2008). The role of autoregressive conditional skewness and kurtosis in the estimation of conditional VaR. Journal of Banking & Finance 32, 269–282.
  • Bollerslev (1986) Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.
  • Brooks et al. (2005) Brooks, C., Burke, S. P., Heravi, S. and Persand, G. (2005). Autoregressive conditional kurtosis. Journal of Financial Econometrics 3, 399–421.
  • Cai (2002) Cai, Z. (2002). Regression quantiles for time series. Econometric Theory 18, 169–192.
  • Chunhachinda et al. (1997) Chunhachinda, P., Dandapani, K., Hamid, S. and Prakash, A. J. (1997). Portfolio selection and skewness: Evidence from international stock markets. Journal of Banking & Finance 21, 143–167.
  • Cornish and Fisher (1938) Cornish, E. A. and Fisher, R. A. (1938). Moments and cumulants in the specification of distributions. Revue de l’Institut international de Statistique 5, 307–320.
  • Engle (1982) Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50, 987–1007.
  • Engle and Manganelli (2004) Engle, R. F. and Manganelli, S. (2004). CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business & Economic Statistics 22, 367–381.
  • Engle and Ng (1993) Engle, R. F. and Ng, V. K. (1993). Measuring and testing the impact of news on volatility. Journal of Finance 48, 1749–1778.
  • Escanciano (2006) Escanciano, J. C. (2006). Goodness-of-fit tests for linear and nonlinear time series models. Journal of the American Statistical Association 101, 140–149.
  • Escanciano (2009) Escanciano, J. C. (2009). Quasi-maximum likelihood estimation of semi-strong GARCH models. Econometric Theory 25, 561–570.
  • Escanciano and Velasco (2006) Escanciano, J. C. and Velasco, C. (2006). Generalized spectral tests for the martingale difference hypothesis. Journal of Econometrics 134, 151–185.
  • Fan and Yao (2003) Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York.
  • Francq and Sucarrat (2022) Francq, C. and Sucarrat, G. (2022). Volatility estimation when the zero-process is nonstationary. Journal of Business & Economic Statistics 41, 53–66.
  • Francq and Thieu (2019) Francq, C. and Thieu, L. Q. (2019). QML inference for volatility models with covariates. Econometric Theory 35, 37–72.
  • Francq and Zakoïan (2019) Francq, C. and Zakoïan, J. M. (2019). GARCH Models: Structure, Statistical Inference and Financial Applications. John Wiley & Sons.
  • Glosten et al. (1993) Glosten, L. R., Jagannathan, R. and Runkle, D. E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779–1801.
  • Grigoletto and Lisi (2009) Grigoletto, M. and Lisi, F. (2009). Looking for skewness in financial time series. Econometrics Journal 12, 310–323.
  • Gu et al. (2020) Gu, S., Kelly, B. and Xiu, D. (2020). Empirical asset pricing via machine learning. Review of Financial Studies 33, 2223–2273.
  • Haas et al. (2004) Haas, M., Mittnik, S. and Paolella, M. S. (2004). Mixed normal conditional heteroskedasticity. Journal of Financial Econometrics 2, 211–250.
  • Hansen (1994) Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic Review 35, 705–730.
  • Härdle et al. (2016) Härdle, W. K., Wang, W. and Yu, L. (2016). TENET: Tail-Event driven NETwork risk. Journal of Econometrics 192, 499–513.
  • Harvey and Siddique (1999) Harvey, C. R. and Siddique, A. (1999). Autoregressive conditional skewness. Journal of Financial and Quantitative Analysis 34, 465–487.
  • Harvey and Siddique (2000) Harvey, C. R. and Siddique, A. (2000). Conditional skewness in asset pricing tests. Journal of Finance 55, 1263–1295.
  • Jondeau and Rockinger (2003) Jondeau, E. and Rockinger, M. (2003). Conditional volatility, skewness, and kurtosis: existence, persistence, and comovements. Journal of Economic Dynamics and Control 27, 1699–1737.
  • Jondeau et al. (2019) Jondeau, E., Zhang, Q. and Zhu, X. (2019). Average skewness matters. Journal of Financial Economics 134, 29–47.
  • Koenker and Bassett (1978) Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica 46, 33–50.
  • Koenker et al. (2017) Koenker, R., Chernozhukov, V., He, X. and Peng, L. (2017). Handbook of Quantile Regression. Chapman & Hall/CRC.
  • Kuester et al. (2006) Kuester, K., Mittnik, S. and Paolella, M. S. (2006). Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics 4, 53–89.
  • Lee and Lin (1992) Lee, Y. S. and Lin, T. K. (1992). Algorithm AS 269: High order Cornish-Fisher expansion. Journal of the Royal Statistical Society: Series C 41, 233–240.
  • León and Ñíguez (2020) León, Á. and Ñíguez, T. M. (2020). Modeling asset returns under time-varying semi-nonparametric distributions. Journal of Banking & Finance 118, 105870.
  • León et al. (2005) León, Á., Rubio, G. and Serna, G. (2005). Autoregresive conditional volatility, skewness and kurtosis. Quarterly Review of Economics and Finance 45, 599–618.
  • Li and Racine (2008) Li, Q. and Racine, J. S. (2008). Nonparametric estimation of conditional CDF and quantile functions with mixed categorical and continuous data. Journal of Business & Economic Statistics 26, 423–434.
  • Ljung and Box (1978) Ljung, G. M. and Box, G. E. (1978). On a measure of lack of fit in time series models. Biometrika 65, 297–303.
  • McNeil and Frey (2000) McNeil, A. J. and Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach. Journal of Empirical Finance 7, 271–300.
  • Robinson (1988) Robinson, P. M. (1988). Root-N-consistent semiparametric regression. Econometrica 56, 931–954.
  • Rubinstein (1973) Rubinstein, M. E. (1973). A comparative statics analysis of risk premiums. Journal of Business 46, 605–615.
  • Samuelson (1970) Samuelson, P. A. (1970). The fundamental approximation theorem of portfolio analysis in terms of means, variances and higher moments. Review of Economic Studies 37, 537–542.
  • Sucarrat and Grønneberg (2022) Sucarrat, G. and Grønneberg, S. (2022). Risk estimation with a time-varying probability of zero returns. Journal of Financial Econometrics 20, 278–309.
  • Tobias and Brunnermeier (2016) Tobias, A. and Brunnermeier, M. K. (2016). CoVaR. American Economic Review 106, 1705–1741.
  • Tong (1978) Tong, H. (1978). On a threshold model. In Pattern Recognition and Signal Processing (ed. C. H. Chen). Sijthoff and Noordhoff, Amsterdam.
  • Tsay (2005) Tsay, R. S. (2005). Analysis of Financial Time Series. John Wiley & Sons.
  • White (2001) White, H. (2001). Asymptotic Theory for Econometricians, Revised edition. San Diego: Academic Press.
  • Widder (1946) Widder, D. V. (1946). The Laplace Transform. Princeton University Press, Princeton, NJ.
  • Xiao and Koenker (2009) Xiao, Z. and Koenker, R. (2009). Conditional quantile estimation for generalized autoregressive conditional heteroscedasticity models. Journal of the American Statistical Association 104, 1696–1712.
  • Xu (2013) Xu, K. L. (2013). Nonparametric inference for conditional quantiles of time series. Econometric Theory 29, 673–698.
  • Zheng et al. (2018) Zheng, Y., Zhu, Q., Li, G. and Xiao, Z. (2018). Hybrid quantile regression estimation for time series models with conditional heteroscedasticity. Journal of the Royal Statistical Society: Series B 80, 975–993.
  • Zhou and Wu (2009) Zhou, Z. and Wu, W. B. (2009). Local linear quantile estimation for nonstationary time series. Annals of Statistics 37, 2696–2729.
  • Zhu et al. (2023) Zhu, Z., Zhang, N. and Zhu, K. (2023). Big portfolio selection by graph-based conditional moments method. Working paper. Available at “https://arxiv.org/abs/2301.11697”.