
Quantiled conditional variance, skewness, and kurtosis by Cornish-Fisher expansion

Ningning Zhang and Ke Zhu
University of Hong Kong
Abstract

The conditional variance, skewness, and kurtosis play a central role in time series analysis. These three conditional moments (CMs) are often studied via parametric models, but such models suffer from two big issues: the risk of model mis-specification and the instability of model estimation. To avoid these two issues, this paper proposes a novel method to estimate the three CMs by the so-called quantiled CMs (QCMs). The QCM method first adopts the idea of the Cornish-Fisher expansion to construct a linear regression model based on n different estimated conditional quantiles. Next, it computes the QCMs simply and simultaneously via the ordinary least squares estimator of this regression model, without any prior estimation of the conditional mean. Under certain conditions, the QCMs are shown to be consistent with the convergence rate n^{-1/2}. Simulation studies indicate that the QCMs perform well under different scenarios of Cornish-Fisher expansion errors and quantile estimation errors. In the application, the study of QCMs for three exchange rates demonstrates the effectiveness of financial rescue plans during the COVID-19 pandemic outbreak, and suggests that the existing “news impact curve” functions for the conditional skewness and kurtosis may not be suitable.

Keywords: Conditional moments; Cornish-Fisher expansion; News impact curve; Quantile time series estimation; Quantiled conditional moments.

Address correspondence to Ke Zhu: Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong. E-mail: mazhuke@hku.hk

1 Introduction

Learning the conditional variance, skewness, and kurtosis of a univariate time series is the core issue in many financial and economic applications. The classical tools to study the conditional variance are the generalized autoregressive conditional heteroskedasticity (GARCH) model and its variants; see Engle (1982), Bollerslev (1986), Francq and Zakoïan (2019), and references therein. However, except for some theoretical works on parameter estimation in Escanciano (2009) and Francq and Thieu (2019), the GARCH-type models commonly assume independent and identically distributed (i.i.d.) innovations, resulting in constant conditional skewness and kurtosis. As argued by Samuelson (1970) and Rubinstein (1973), higher moments like skewness and kurtosis are nonnegligible, since they are not only exemplary evidence of non-normal returns but also relevant to the investor’s optimal decision. Along this line, a large body of literature has demonstrated the importance of conditional skewness and kurtosis in portfolio selection (Chunhachinda et al., 1997), asset pricing (Harvey and Siddique, 2000), risk management (Bali et al., 2008), return predictability (Jondeau et al., 2019), and many other areas. These empirical successes indicate the necessity of learning the dynamic structures of the conditional skewness and kurtosis simultaneously with the conditional variance.

Although there is a large number of studies on the conditional variance, only a few of them have taken account of the conditional skewness and kurtosis. The pioneering works towards this goal are Hansen (1994) and Harvey and Siddique (1999), followed by Jondeau and Rockinger (2003), Brooks et al. (2005), León et al. (2005), Grigoletto and Lisi (2009), and León and Ñíguez (2020). All of these works assume a particular conditional distribution on the innovations of a GARCH-type model, where the conditional skewness or kurtosis either directly has an analogous GARCH-type dynamic structure rooted in rescaled shocks or indirectly depends on the dynamic structure of distribution parameters. See also Francq and Sucarrat (2022) and Sucarrat and Grønneberg (2022) for a different investigation of conditional skewness and kurtosis via a GARCH-type model with a time-varying probability of zero returns. However, the aforementioned parametric methods have two major shortcomings: first, they inevitably run the risk of using wrongly specified parametric models or innovation distributions; second, they usually produce unstable model estimation results in the presence of dynamic structures of skewness and kurtosis.

This paper proposes a novel method to simultaneously learn the conditional variance, skewness, and kurtosis by the so-called quantiled conditional variance, skewness, and kurtosis, respectively. Our three quantiled conditional moments (QCMs) are formed in the spirit of the Cornish-Fisher expansion (Cornish and Fisher, 1938), which exhibits a fundamental relationship between the conditional quantiles and the CMs. By replacing the unknown conditional quantiles with their estimators at n quantile levels, the QCMs (with respect to variance, skewness, and kurtosis) are simply computed at each fixed timepoint by using the ordinary least squares (OLS) estimator of a linear regression model, which stems naturally from the Cornish-Fisher expansion. Surprisingly, our way to compute the QCMs does not require any estimator of the conditional mean. The precision of the QCMs is controlled by the error of the proposed linear regression model, which comprises two components: first, the expansion error encompasses higher-order conditional moments in the Cornish-Fisher expansion that are not taken as regressors in the linear regression model; second, the approximation error arises from the use of estimated conditional quantiles (ECQs). Under certain conditions on the regression model error, we show that the QCMs are consistent estimators of the corresponding CMs with the convergence rate n^{-1/2}.
Simulation studies reveal that, when considering various scenarios of approximation errors caused by biased ECQs from the use of contaminated or mis-specified conditional quantile models, (i) the quantiled conditional variance and skewness exhibit robust and satisfactory performance, regardless of the non-negligibility of the expansion error; (ii) the quantiled conditional kurtosis has a larger dispersion for heavier-tailed data with non-negligible expansion error, which is unavoidable due to the lower accuracy of the Cornish-Fisher expansion for heavier-tailed distributions (Lee and Lin, 1992).

In the application, we study the QCMs of the return series for three exchange rates. During the COVID-19 pandemic outbreak in March 2020, we find that the values of quantiled conditional variance and kurtosis increased rapidly, and the values of quantiled conditional skewness decreased sharply, before March 19 or 20 in all of the examined exchange rates, shedding light on the worldwide perilous financial crisis at that time. After March 19 or 20, we find that the values of quantiled conditional variance, skewness, and kurtosis exhibited totally opposite trends, demonstrating the effectiveness of financial rescue plans issued by governments. Moreover, since the existing parametric forms of “news impact curve” (NIC) functions for the CMs are chosen in an ad-hoc way (Engle and Ng, 1993; Harvey and Siddique, 1999; León et al., 2005), we give a data-driven method to scrutinize the parametric forms of the NIC functions by using the QCMs. Our findings suggest that the parametric forms of NIC functions for the conditional variance are appropriate, while those for the conditional skewness and kurtosis may be unsuitable.

It is worth noting that our QCM method essentially transforms the problem of CM estimation into that of conditional quantile estimation. This brings us two major advantages over the aforementioned parametric methods, although we need to carry out the conditional quantile estimation n different times to implement the QCM method, and could face the risk of obtaining inaccurate estimates of the conditional kurtosis for very heavy-tailed data.

First, the QCM method can largely reduce the risk of model mis-specification, since the QCMs are computed simultaneously without any prior estimation of the conditional mean, and their consistency holds even when the specifications of conditional quantiles are mis-specified. This advantage is attractive and unexpected, since we usually have to estimate the conditional mean, variance, and skewness (or kurtosis) successively via some correctly specified parametric models. The reason for this advantage is that the QCM method is regression-based. Specifically, the conditional mean formally becomes one part of the intercept parameter, so it has no impact on the QCMs, which are computed only from the OLS estimator of all non-intercept parameters; meanwhile, the impact of biased ECQs from the use of wrongly specified conditional quantile models can be aggregately offset by another part of the intercept parameter, ensuring the consistency of the QCMs to a large extent. In a sense, without specifying any parametric forms of the CMs, this important feature allows us to view the QCMs as the “observed” CMs, and consequently, many intriguing but previously hardly implementable empirical studies could become tractable based on the QCMs (see, e.g., our empirical studies on NICs for the CMs).

Second, the QCM method can numerically deliver more stable estimators of the CMs than the parametric methods. As shown in Jondeau and Rockinger (2003), there exists a moment issue that places a necessary nonlinear constraint on the conditional skewness and kurtosis, leading to a complex restriction on the admissible region of model parameters. This restriction not only raises the computational burden of parameter estimation but also makes the estimation results unstable, so it has rarely been considered in the existing parametric methods. In contrast, the QCM method directly computes the QCMs at each fixed timepoint, and this feature ensures that the nonlinear constraint on the conditional skewness and kurtosis can be simply examined using the computed QCMs at each timepoint. In particular, if this nonlinear constraint is violated at some timepoints, it is straightforward to replace the OLS estimator with a constrained least squares estimator to obtain QCMs that satisfy the constraint automatically.

The remainder of the paper is organized as follows. Section 2 proposes the QCMs based on the linear regression, and discusses the issues of the conditional mean and moment constraints. Section 3 establishes the asymptotics of the QCMs. Section 4 provides practical implementations of the QCMs. Simulation studies are given in Section 5. An application studying the QCMs for three exchange rates and their related NICs is offered in Section 6. Concluding remarks are presented in Section 7. Proofs and some additional simulation results are deferred to the supplementary materials.

2 Quantiled Conditional Moments

2.1 Definition

Let \{y_{1},...,y_{T}\} be a time series of interest with length T, and \mathcal{F}_{t}\equiv\sigma(y_{s};s\leqslant t) be its available information set up to time t. Given \mathcal{F}_{t-1}, the conditional mean, variance, skewness, and kurtosis of y_{t} at timepoint t are defined as \mu_{t}=E(y_{t}|\mathcal{F}_{t-1}) and

\displaystyle h_{t}=E[(y_{t}-\mu_{t})^{2}|\mathcal{F}_{t-1}],\text{ }s_{t}=E\Big{(}\Big{(}\frac{y_{t}-\mu_{t}}{\sqrt{h_{t}}}\Big{)}^{3}|\mathcal{F}_{t-1}\Big{)},\text{ }k_{t}=E\Big{(}\Big{(}\frac{y_{t}-\mu_{t}}{\sqrt{h_{t}}}\Big{)}^{4}|\mathcal{F}_{t-1}\Big{)}, (2.1)

respectively. Below, we show how to estimate these three conditional moments in (2.1) by using the Cornish-Fisher expansion (Cornish and Fisher, 1938) at a fixed timepoint t.

Let Q_{t}(\alpha) be the conditional quantile of y_{t} at the quantile level \alpha\in(0,1). According to the Cornish-Fisher expansion, we have

Q_{t}(\alpha)=\mu_{t}+\sqrt{h_{t}}\Big{[}x+(x^{2}-1)\frac{s_{t}}{6}+(x^{3}-3x)\frac{k_{t}-3}{24}+r_{t}(\alpha)\Big{]}, (2.2)
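As a minimal numerical illustration (not part of the paper), the expansion (2.2) can be evaluated with the remainder r_{t}(\alpha) set to zero. For a Gaussian y_{t}, where s_{t}=0 and k_{t}=3, the truncated expansion should reproduce the exact quantiles Q_{t}(\alpha)=\mu_{t}+\sqrt{h_{t}}\Phi^{-1}(\alpha); the parameter values below are illustrative assumptions.

```python
# Truncated Cornish-Fisher expansion (2.2), assuming r_t(alpha) = 0.
from statistics import NormalDist

def cf_quantile(alpha, mu, h, s, k):
    """Cornish-Fisher approximation of the alpha-quantile of y_t."""
    x = NormalDist().inv_cdf(alpha)  # x = Phi^{-1}(alpha)
    return mu + h ** 0.5 * (x + (x**2 - 1) * s / 6 + (x**3 - 3 * x) * (k - 3) / 24)

# Sanity check: for a N(mu, h) variable (s = 0, k = 3) the truncated
# expansion coincides with the exact normal quantile.
mu, h = 0.1, 4.0
for alpha in (0.05, 0.5, 0.95):
    exact = NormalDist(mu, h ** 0.5).inv_cdf(alpha)
    assert abs(exact - cf_quantile(alpha, mu, h, s=0.0, k=3.0)) < 1e-12
```

For non-Gaussian y_{t}, the dropped remainder r_{t}(\alpha) is generally non-zero, which is exactly the expansion error discussed below.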

where x=\Phi^{-1}(\alpha) with \Phi(\cdot) being the distribution function of N(0,1), and r_{t}(\alpha) contains all remaining terms on the higher-order conditional moments. Taking n quantile levels \alpha_{i}, i=1,...,n, equation (2.2) entails the following regression model with deterministic explanatory variables \bm{X}_{i} but random coefficients \bm{\beta}_{t}:

\displaystyle Y_{t,i}^{\ast}=\mu_{t}+\bm{X}_{i}^{\prime}\bm{\beta}_{t}+\varepsilon_{t,i}^{\ast},\,\,i=1,...,n, (2.3)

where Y_{t,i}^{\ast}=Q_{t}(\alpha_{i}), \bm{X}_{i}=(x_{i},x_{i}^{2}-1,x_{i}^{3}-3x_{i})^{\prime} with x_{i}=\Phi^{-1}(\alpha_{i}),

\displaystyle\bm{\beta}_{t}\equiv(\beta_{1t},\beta_{2t},\beta_{3t})^{\prime}=\Big{(}\sqrt{h_{t}},\frac{\sqrt{h_{t}}s_{t}}{6},\frac{\sqrt{h_{t}}(k_{t}-3)}{24}\Big{)}^{\prime}, (2.4)

and \varepsilon_{t,i}^{\ast}=\sqrt{h_{t}}r_{t}(\alpha_{i}). We call \varepsilon_{t,i}^{\ast} the expansion error, since it comes from the Cornish-Fisher expansion but cannot be adequately explained by \bm{X}_{i}.

Next, we aim to obtain the estimators of h_{t}, s_{t}, and k_{t} through the estimator of \bm{\beta}_{t} in (2.4). To achieve this goal, we replace the unobserved Y_{t,i}^{\ast} with its estimator Y_{t,i}, and then rewrite model (2.3) as follows:

\displaystyle Y_{t,i}=\mu_{t}+\bm{X}_{i}^{\prime}\bm{\beta}_{t}+\varepsilon_{t,i}^{\bullet},\,\,i=1,...,n, (2.5)

where Y_{t,i}=\widehat{Q}_{t}(\alpha_{i}) with \widehat{Q}_{t}(\alpha_{i}) being an estimator of Q_{t}(\alpha_{i}), and \varepsilon_{t,i}^{\bullet}=\varepsilon_{t,i}^{\ast}+\varepsilon_{t,i}^{\circ} with \varepsilon_{t,i}^{\circ}=\widehat{Q}_{t}(\alpha_{i})-Q_{t}(\alpha_{i}). Clearly, \varepsilon_{t,i}^{\circ} quantifies the error caused by using \widehat{Q}_{t}(\alpha_{i}) to approximate Q_{t}(\alpha_{i}), so it can be termed the approximation error. Consequently, \varepsilon_{t,i}^{\bullet}, being the sum of \varepsilon_{t,i}^{\ast} and \varepsilon_{t,i}^{\circ}, can be viewed as the gross error. We should mention that any two quantile levels \alpha_{i} and \alpha_{j} in (2.5) are allowed to be the same, as long as Y_{t,i} and Y_{t,j} are different due to the use of two different conditional quantile estimation methods. In other words, model (2.5) allows us to simply pool different information on conditional quantiles from different estimation methods at any fixed quantile level.

Although \varepsilon_{t,i}^{\bullet} is expected to have values oscillating around zero, it may not always have mean zero. Therefore, for the purpose of identification, we add a deterministic term \gamma_{t} into model (2.5) to form the following regression model:

\displaystyle Y_{t,i}=(\mu_{t}+\gamma_{t})+\bm{X}_{i}^{\prime}\bm{\beta}_{t}+\varepsilon_{t,i}\equiv\bm{Z}_{i}^{\prime}\bm{\theta}_{t}+\varepsilon_{t,i},\,\,i=1,...,n, (2.6)

where \varepsilon_{t,i}=\varepsilon_{t,i}^{\bullet}-\gamma_{t}, \bm{Z}_{i}=(1,\bm{X}_{i}^{\prime})^{\prime}, and \bm{\theta}_{t}=(\beta_{0t},\bm{\beta}_{t}^{\prime})^{\prime} with the intercept parameter \beta_{0t}=\mu_{t}+\gamma_{t}.

Let \bm{Y}_{t} be an n\times 1 vector with entries Y_{t,i}, \bm{Z} be an n\times 4 matrix with rows \bm{Z}_{i}^{\prime}, and \bm{\varepsilon}_{t} be an n\times 1 vector with entries \varepsilon_{t,i}. Then, the ordinary least squares (OLS) estimator of \bm{\theta}_{t} in (2.6) is

\displaystyle\widehat{\bm{\theta}}_{t}\equiv(\widehat{\beta}_{0t},\widehat{\bm{\beta}}_{t}^{\prime})^{\prime}=(\bm{Z}^{\prime}\bm{Z})^{-1}\bm{Z}^{\prime}\bm{Y}_{t}. (2.7)

According to (2.4), we naturally use \widehat{\bm{\beta}}_{t}\equiv(\widehat{\beta}_{1t},\widehat{\beta}_{2t},\widehat{\beta}_{3t})^{\prime} in (2.7) to propose the estimators \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} for h_{t}, s_{t}, and k_{t}, respectively, where

\widehat{h}_{t}=\widehat{\beta}_{1t}^{\,2},\,\,\widehat{s}_{t}=\frac{6\widehat{\beta}_{2t}}{\widehat{\beta}_{1t}},\text{ and }\widehat{k}_{t}=\frac{24\widehat{\beta}_{3t}}{\widehat{\beta}_{1t}}+3. (2.8)

We call \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} the quantiled conditional variance, skewness, and kurtosis of y_{t}, since they are estimators of h_{t}, s_{t}, and k_{t} based on the estimated conditional quantiles (ECQs) of y_{t}. Clearly, provided n different ECQs (that is, n different Y_{t,i} in (2.6)), our three quantiled conditional moments (QCMs) in (2.8) are easy to implement, since their computation only relies on the OLS estimator \widehat{\bm{\theta}}_{t}.
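The pipeline (2.6)-(2.8) amounts to one OLS fit per timepoint. The following is a minimal sketch, assuming the n ECQs are already given; here we feed in exact Gaussian quantiles purely for illustration (so the gross error is zero), whereas in practice the Y_{t,i} would come from estimated conditional quantile models.

```python
# Sketch of the QCM computation: OLS in (2.7), then the mapping (2.8).
import numpy as np
from statistics import NormalDist

def qcm(alphas, ecqs):
    """Return (h_hat, s_hat, k_hat) from n ECQs at quantile levels alphas."""
    x = np.array([NormalDist().inv_cdf(a) for a in alphas])       # x_i = Phi^{-1}(alpha_i)
    Z = np.column_stack([np.ones_like(x), x, x**2 - 1, x**3 - 3 * x])  # rows Z_i'
    theta, *_ = np.linalg.lstsq(Z, np.asarray(ecqs), rcond=None)  # OLS estimator (2.7)
    b1, b2, b3 = theta[1], theta[2], theta[3]
    return b1**2, 6 * b2 / b1, 24 * b3 / b1 + 3                   # QCMs in (2.8)

# Illustration: exact N(mu, h) quantiles should recover h, s = 0, k = 3.
alphas = np.linspace(0.01, 0.99, 99)
mu, h = 0.5, 2.0
ecqs = [NormalDist(mu, h**0.5).inv_cdf(a) for a in alphas]
h_hat, s_hat, k_hat = qcm(alphas, ecqs)
```

Note that the conditional mean mu enters only through the intercept column and therefore never affects the three QCMs, mirroring the discussion in Section 2.2.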

2.2 The conditional mean issue

Using \widehat{\beta}_{0t} in (2.7), we can estimate \beta_{0t} but not \mu_{t}, due to the presence of \gamma_{t}. Hence, we are unable to form a quantiled conditional mean of y_{t} to estimate \mu_{t}. Interestingly, our way to compute \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} does not require an estimator of \mu_{t}. This is unexpected, since normally we have to first estimate (or model) \mu_{t} and then h_{t}, s_{t}, and k_{t}.

Although the estimation of \mu_{t} is not required to compute \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t}, knowing the dynamic structure of \mu_{t} is still important in practice. Note that \mu_{t} is often assumed to be an unknown constant in accordance with the efficient market hypothesis, and this constant assumption can be examined by the consistent spectral tests for the martingale difference hypothesis (MDH) in Escanciano and Velasco (2006). If the constant assumption is rejected by these tests, the dynamic structure of \mu_{t} manifests itself and is usually specified by a linear model (e.g., the autoregressive moving-average model) or a nonlinear model (e.g., the threshold autoregressive model); see Fan and Yao (2003) and Tsay (2005) for surveys. In this case, the model correctly specifies the dynamic structure of \mu_{t} if and only if its model error is an MD sequence, a statement which can be consistently checked by the two spectral tests for the MDH on unobserved model errors in Escanciano (2006). Hence, it is usually tractable for practitioners to come up with a valid parametric model for \mu_{t} in most applications.

2.3 The moment constraints issue

Note that \mu_{t}, h_{t}, s_{t}, and k_{t} can be expressed in terms of the first four non-central moments m_{1t}, m_{2t}, m_{3t}, and m_{4t} of y_{t}, where m_{jt}=E(y_{t}^{j}|\mathcal{F}_{t-1}). Therefore, the existence of \mu_{t}, h_{t}, s_{t}, and k_{t} is equivalent to that of m_{1t}, m_{2t}, m_{3t}, and m_{4t}, and the latter requires the existence of a non-decreasing function F_{t}(\cdot) such that

m_{jt}=\int_{-\infty}^{\infty}x^{j}dF_{t}(x).

To ensure this existence, Theorem 12.a in Widder (1946) indicates that the following condition must hold for m_{1t}, m_{2t}, m_{3t}, and m_{4t}:

\det\begin{pmatrix}m_{0t}&m_{1t}\\ m_{1t}&m_{2t}\end{pmatrix}\geq 0\,\,\,\mbox{ and }\,\,\,\det\begin{pmatrix}m_{0t}&m_{1t}&m_{2t}\\ m_{1t}&m_{2t}&m_{3t}\\ m_{2t}&m_{3t}&m_{4t}\end{pmatrix}\geq 0. (2.9)

By some direct calculations, it is not hard to see that condition (2.9) is equivalent to

h_{t}\geq 0\,\,\,\mbox{ and }\,\,\,k_{t}-s_{t}^{2}-1\geq 0. (2.10)

Condition (2.10) places two necessary moment constraints on h_{t}, s_{t}, and k_{t}. When h_{t}, s_{t}, and k_{t} are specified by some parametric models with unknown parameters, the first moment constraint can usually be easily handled, but the second moment constraint restricts the admissible region of the unknown parameters in a very complex way, so that the model estimation becomes quite inconvenient and unstable. This is the reason why the second moment constraint has rarely been taken into account in the literature, except in Jondeau and Rockinger (2003).

Impressively, the moment constraints issue above is not an obstacle for our QCMs, since the QCMs estimate the CMs directly at each fixed timepoint t. In view of the relationship between the QCMs and \widehat{\bm{\beta}}_{t} in (2.8), we know that the QCMs satisfy the two constraints in (2.10) if and only if \widehat{\beta}_{1t}^{2}\geq 0 and \widehat{\beta}_{1t}^{2}-18\widehat{\beta}_{2t}^{2}+12\widehat{\beta}_{1t}\widehat{\beta}_{3t}\geq 0. Since the constraint \widehat{\beta}_{1t}^{2}\geq 0 holds automatically, we indeed only need to check whether

\widehat{\beta}_{1t}^{2}-18\widehat{\beta}_{2t}^{2}+12\widehat{\beta}_{1t}\widehat{\beta}_{3t}\geq 0. (2.11)

In practice, the constraint in (2.11) can be directly examined after the QCMs are computed. Our applications in Section 6 below show that this constraint holds at all examined timepoints. In other applications, if the constraint in (2.11) does not hold at some timepoints t, we can easily re-estimate \bm{\theta}_{t} in (2.6) by the constrained least squares estimation method with the constraint \beta_{1t}^{2}-18\beta_{2t}^{2}+12\beta_{1t}\beta_{3t}\geq 0, so that the resulting QCMs satisfy the constraint in (2.11) automatically at these timepoints.
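The equivalence between (2.11) and the second constraint in (2.10) follows by substituting (2.8). A short sketch of the check, with hypothetical coefficient values for illustration:

```python
# Check (2.11) on computed OLS coefficients, and verify its equivalence to
# k_t - s_t^2 - 1 >= 0 under the mapping (2.8).  The values of b below are
# hypothetical placeholders, not estimates from real data.
def satisfies_moment_constraint(b1, b2, b3):
    """Check beta_1t^2 - 18*beta_2t^2 + 12*beta_1t*beta_3t >= 0, i.e. (2.11)."""
    return b1**2 - 18 * b2**2 + 12 * b1 * b3 >= 0

def implied_k_minus_s2_minus_1(b1, b2, b3):
    """The quantity k_t - s_t^2 - 1 implied by (2.8)."""
    s = 6 * b2 / b1
    k = 24 * b3 / b1 + 3
    return k - s**2 - 1

b = (1.2, 0.1, 0.05)   # hypothetical (beta_1t, beta_2t, beta_3t)
assert satisfies_moment_constraint(*b) == (implied_k_minus_s2_minus_1(*b) >= 0)
```

Multiplying k_t - s_t^2 - 1 >= 0 through by \beta_{1t}^{2}/2 gives exactly the quadratic inequality in (2.11), which is why the two checks agree.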

3 Asymptotics

This section studies the asymptotics of \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} at a fixed timepoint t. Let \overset{p}{\longrightarrow} denote convergence in probability. To derive the consistency of \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} in (2.8), the following assumptions are needed.

Assumption 3.1.

\bm{M}_{n}\equiv\bm{Z}^{\prime}\bm{Z}/n is uniformly positive definite.

Assumption 3.2.

\bm{Z}^{\prime}\bm{\varepsilon}_{t}/n\overset{p}{\longrightarrow}\boldsymbol{0} as n\to\infty.

We offer some remarks on the aforementioned assumptions. Assumption 3.1 is regular for linear regression models, and it holds as long as \bm{Z} has full rank (i.e., Rank(\bm{Z})=4). Because \varepsilon_{t,i}=\varepsilon_{t,i}^{\ast}+\varepsilon_{t,i}^{\circ}-\gamma_{t}, Assumption 3.2 is equivalent to

\displaystyle C_{t,\ast}+C_{t,\circ}-C_{t,\gamma}\overset{p}{\longrightarrow}\boldsymbol{0}\mbox{ as }n\to\infty, (3.1)

where C_{t,\ast}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\varepsilon_{t,i}^{\ast}, C_{t,\circ}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\varepsilon_{t,i}^{\circ}, and C_{t,\gamma}=\big{(}n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\big{)}\gamma_{t}. By the law of large numbers for dependent and heteroscedastic data sequences (Andrews, 1988), it is reasonable to assert that C_{t,\ast}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}E(\varepsilon_{t,i}^{\ast})+o_{p}(1) and C_{t,\circ}=n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}E(\varepsilon_{t,i}^{\circ})+o_{p}(1). Then, since \varepsilon_{t,i}^{\bullet}=\varepsilon_{t,i}^{\ast}+\varepsilon_{t,i}^{\circ}, condition (3.1) holds if

\displaystyle\frac{1}{n}\sum_{i=1}^{n}\bm{Z}_{i}\big{[}E(\varepsilon_{t,i}^{\bullet})-\gamma_{t}\big{]}\longrightarrow\boldsymbol{0}\mbox{ as }n\to\infty. (3.2)

Condition (3.2) reveals an important fact: the role of \gamma_{t} is to offset the possible non-identification effect caused by the non-zero mean of \varepsilon_{t,i}^{\bullet}. In other words, to achieve identification, \gamma_{t} should automatically tend to minimize the absolute difference

d_{n,t}\equiv\Big{|}\frac{1}{n}\sum_{i=1}^{n}\bm{Z}_{i}\big{[}E(\varepsilon_{t,i}^{\bullet})\big{]}-\Big{(}\frac{1}{n}\sum_{i=1}^{n}\bm{Z}_{i}\Big{)}\gamma_{t}\Big{|}

for large n. Clearly, if d_{n,t}\approx 0 for large n, condition (3.2) holds automatically, and then Assumption 3.2 most likely holds.

Next, we study the behavior of d_{n,t} in different cases. In the first case that E(\varepsilon_{t,i}^{\bullet})\approx c_{t} for all i, we have d_{n,t}\approx 0 with \gamma_{t}=c_{t}. In the second case that E(\varepsilon_{t,i}^{\bullet})\approx 0 for most i, we also have d_{n,t}\approx 0 with \gamma_{t}=0. In the third case that n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\big{[}E(\varepsilon_{t,i}^{\bullet})\big{]}\approx\boldsymbol{\tau}_{t}\approx\boldsymbol{z}f_{t} and n^{-1}\sum_{i=1}^{n}\bm{Z}_{i}\approx\boldsymbol{z} for large n, we again have d_{n,t}\approx 0 with \gamma_{t}=f_{t}. In other cases, we still have the chance to ensure d_{n,t}\approx 0, depending on the behavior of E(\varepsilon_{t,i}^{\bullet}) across i. In summary, the condition that E(\varepsilon_{t,i}^{\bullet})\approx 0 for all i is not necessary for the validity of Assumption 3.2. This implies that the QCMs are able to perform robustly across diverse error scenarios, including situations where E(\varepsilon_{t,i}^{\circ}) has large non-zero absolute values across i (that is, large biases of the ECQs caused by the use of mis-specified conditional quantile models).
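The first case above admits a simple numerical illustration: a constant bias c_{t} added to every ECQ is absorbed by the intercept of (2.6) and leaves the QCMs unchanged. The toy sketch below uses exact Gaussian quantiles with an artificial constant bias; the bias value 0.3 is an illustrative assumption.

```python
# Toy illustration: a constant ECQ bias (first case, E(eps) ≈ c_t) is
# absorbed by the intercept and does not move the QCMs in (2.8).
import numpy as np
from statistics import NormalDist

def qcm(alphas, ecqs):
    """QCMs from ECQs: OLS in (2.7), then the mapping (2.8)."""
    x = np.array([NormalDist().inv_cdf(a) for a in alphas])
    Z = np.column_stack([np.ones_like(x), x, x**2 - 1, x**3 - 3 * x])
    theta, *_ = np.linalg.lstsq(Z, np.asarray(ecqs), rcond=None)
    b1, b2, b3 = theta[1], theta[2], theta[3]
    return b1**2, 6 * b2 / b1, 24 * b3 / b1 + 3

alphas = np.linspace(0.05, 0.95, 50)
clean = np.array([NormalDist(0.0, 1.5).inv_cdf(a) for a in alphas])  # h = 2.25
biased = clean + 0.3                    # constant bias c_t = 0.3 in every ECQ
for h, s, k in (qcm(alphas, clean), qcm(alphas, biased)):
    assert abs(h - 2.25) < 1e-8 and abs(s) < 1e-8 and abs(k - 3) < 1e-8
```

Non-constant bias patterns across i correspond to the other cases discussed above, where the offset by \gamma_{t} is only approximate.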

As expected, the condition that d_{n,t}\approx 0 for large n may not always hold, and therefore, under certain circumstances, Assumption 3.2 may fail. For example, when the tail of y_{t} becomes heavier, the impact of the higher-order conditional moments in the Cornish-Fisher expansion becomes larger. In this case, E(\varepsilon_{t,i}^{\ast}) tends to have more erratic behavior across i, so that it is harder to offset the non-identification effect via \gamma_{t}, with the value of d_{n,t} farther away from zero. Indeed, our simulation studies in Section 5 below indicate that the presence of non-negligible \varepsilon_{t,i}^{\ast} has a larger impact on the consistency of \widehat{k}_{t} than on that of \widehat{h}_{t} and \widehat{s}_{t}, which perform robustly with respect to the heavy-tailedness of y_{t}.

The following theorem establishes the consistency of \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t}.

Theorem 3.1.

Suppose that Assumptions 3.1–3.2 hold. Then, \widehat{\bm{\theta}}_{t}-\bm{\theta}_{t}\overset{p}{\longrightarrow}0 as n\to\infty. Consequently, \widehat{h}_{t}-h_{t}\overset{p}{\longrightarrow}0, \widehat{s}_{t}-s_{t}\overset{p}{\longrightarrow}0, and \widehat{k}_{t}-k_{t}\overset{p}{\longrightarrow}0 as n\to\infty.

Remark 3.1.

If we assume \bm{Z}^{\prime}\bm{\varepsilon}_{t}/n\longrightarrow\boldsymbol{0} almost surely as n\to\infty in Assumption 3.2, all of the convergence results in Theorem 3.1 hold almost surely.

Remark 3.2.

In Theorem 3.1, we require large n but not large T. Certainly, a large T may improve the performance of the ECQs by reducing their biases; however, it is not necessary for the validity of Assumption 3.2 and thus the consistency of the QCMs.

Let \overset{d}{\longrightarrow} denote convergence in distribution. We impose the following stronger assumption to replace Assumption 3.2:

Assumption 3.3.

[\bm{V}_{t,n}]^{-1/2}\big{[}\bm{Z}^{\prime}\bm{\varepsilon}_{t}/\sqrt{n}\big{]}\overset{d}{\longrightarrow}N(0,\mathbf{I}) as n\to\infty, where \mathbf{I} is an identity matrix, and \bm{V}_{t,n}\equiv var(\bm{Z}^{\prime}\bm{\varepsilon}_{t}/\sqrt{n}) is bounded and uniformly positive definite.

Assumption 3.3 is regular for proving the asymptotic normality of the OLS estimator (see White (2001)). The theorem below shows that \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} are \sqrt{n}-consistent but not asymptotically normal.

Theorem 3.2.

Suppose that Assumptions 3.1 and 3.3 hold. Then,

(\bm{M}_{n}^{-1}\bm{V}_{t,n}\bm{M}_{n}^{-1})^{-1/2}\sqrt{n}(\widehat{\bm{\theta}}_{t}-\bm{\theta}_{t})\overset{d}{\longrightarrow}N(0,\mathbf{I})

as n\to\infty. Moreover, \sqrt{n}(\widehat{h}_{t}-h_{t})=O_{p}(1), \sqrt{n}(\widehat{s}_{t}-s_{t})=O_{p}(1), and \sqrt{n}(\widehat{k}_{t}-k_{t})=O_{p}(1), but \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} are not asymptotically normal.

Remark 3.3.

Although \widehat{h}_{t}, \widehat{s}_{t}, and \widehat{k}_{t} are not asymptotically normal, the asymptotic normality of \widehat{\bm{\theta}}_{t} demonstrates that the quantiled volatility (the second entry of \widehat{\bm{\theta}}_{t}), denoted by \widehat{\sigma}_{t}, is asymptotically normal: (\bm{\Gamma}\bm{M}_{n}^{-1}\bm{V}_{t,n}\bm{M}_{n}^{-1}\bm{\Gamma}^{\prime})^{-1/2}\sqrt{n}(\widehat{\sigma}_{t}-\sigma_{t})\overset{d}{\longrightarrow}N(0,1) as n\to\infty, where \bm{\Gamma}=(0,1,0,0).

As shown above, the asymptotics of the QCMs in Theorems 3.1–3.2 hold without requiring a specification of the conditional mean or correctly specified conditional quantile models. This important feature guarantees that the QCM method can largely reduce the risk of model mis-specification. The reason for this feature is that the QCM method is regression-based. Specifically, the conditional mean \mu_{t} is absorbed into the intercept parameter \beta_{0t}, so that it has no impact on the QCMs; meanwhile, the biases of the ECQs from the use of wrongly specified conditional quantile models can be aggregately offset by the term \gamma_{t}, which is also nested in \beta_{0t}.

The aforementioned feature is accompanied by two limitations. The first limitation is that we need to estimate conditional quantiles n different times. Fortunately, this limitation seems mild, since the quantile estimation can commonly be carried out easily by the linear programming method, and the resulting estimation biases can be tolerated by the QCM method to a large extent. The second limitation is that when the data are very heavy-tailed, the expansion error could have a non-negligible impact, causing an identification problem, particularly for the conditional kurtosis. This limitation seems an unavoidable consequence of the Cornish-Fisher expansion, and it cannot be addressed by simply increasing the order of the expansion (Lee and Lin, 1992).

4 Practical Implementations

To compute the three QCMs in (2.8), we only need to input n different ECQs \widehat{Q}_{t}(\alpha_{i}), which can be computed in many different ways; see, for example, McNeil and Frey (2000), Kuester et al. (2006), Xiao and Koenker (2009) and the references therein for earlier works, and Koenker et al. (2017) and Zheng et al. (2018) for more recent ones. Without assuming any parametric specifications of the CMs, Engle and Manganelli (2004) propose a general class of CAViaR models, which can flexibly specify the dynamics of conditional quantiles. Hence, the CAViaR models are appropriate choices for us to compute \widehat{Q}_{t}(\alpha_{i}). Following Engle and Manganelli (2004), we consider the four CAViaR models below:

1. Symmetric Absolute Value (SAV) model: Q_{t}(\alpha)=\psi_{1,0}+\psi_{2,0}Q_{t-1}(\alpha)+\psi_{3,0}|y_{t-1}|;

2. Asymmetric Slope (AS) model: Q_{t}(\alpha)=\psi_{1,0}+\psi_{2,0}Q_{t-1}(\alpha)+\psi_{3,0}(y_{t-1})^{+}+\psi_{4,0}(y_{t-1})^{-}, where (y_{t-1})^{+}=\text{max}(y_{t-1},0) and (y_{t-1})^{-}=\text{min}(y_{t-1},0);

3. Indirect GARCH (IG) model: Q_{t}(\alpha)=(\psi_{1,0}+\psi_{2,0}Q^{2}_{t-1}(\alpha)+\psi_{3,0}y^{2}_{t-1})^{1/2};

4. Adaptive (ADAP) model: Q_{t}(\alpha)=Q_{t-1}(\alpha)+\psi_{1,0}\{[1+\text{exp}(N[y_{t-1}-Q_{t-1}(\alpha)])]^{-1}-\alpha\}, where N is a positive finite number.

Each CAViaR model above can be estimated via the classical quantile regression method (Koenker and Bassett, 1978). For simplicity, we take the SAV model as an illustrative example. Let 𝝍=(ψ1,ψ2,ψ3)\bm{\psi}=(\psi_{1},\psi_{2},\psi_{3})^{\prime} be the unknown parameter of the SAV model, and 𝝍0=(ψ1,0,ψ2,0,ψ3,0)\bm{\psi}_{0}=(\psi_{1,0},\psi_{2,0},\psi_{3,0})^{\prime} be its true value. As in Engle and Manganelli (2004), we estimate 𝝍0\bm{\psi}_{0} by the quantile estimator 𝝍^nargmin𝝍t=1Tρα(ytQt(α,𝝍))\widehat{\bm{\psi}}_{n}\equiv\arg\min_{\bm{\psi}}\sum_{t=1}^{T}\rho_{\alpha}(y_{t}-Q_{t}(\alpha,\bm{\psi})), where ρα(x)=x[αI(x<0)]\rho_{\alpha}(x)=x[\alpha-\mbox{I}(x<0)] is the check function, and Qt(α,𝝍)Q_{t}(\alpha,\bm{\psi}) is defined in the same way as Qt(α)Q_{t}(\alpha) in the SAV model with 𝝍0\bm{\psi}_{0} replaced by 𝝍\bm{\psi}. Once 𝝍^n\widehat{\bm{\psi}}_{n} is obtained, we take Q^t(α)Qt(α,𝝍^n)\widehat{Q}_{t}(\alpha)\equiv Q_{t}(\alpha,\widehat{\bm{\psi}}_{n}) as our ECQ.
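To make this step concrete, the following minimal Python sketch estimates the SAV model by minimizing the check loss. The Nelder-Mead optimizer, the starting values, and the initialization of Q1Q_{1} at the empirical α\alpha-quantile are our own illustrative choices, not prescribed by the text.

```python
import numpy as np
from scipy.optimize import minimize

def sav_quantiles(psi, y, alpha):
    """Filter the SAV recursion Q_t = psi1 + psi2*Q_{t-1} + psi3*|y_{t-1}|."""
    Q = np.empty(len(y))
    Q[0] = np.quantile(y, alpha)          # assumed initialization of Q_1
    for t in range(1, len(y)):
        Q[t] = psi[0] + psi[1] * Q[t - 1] + psi[2] * abs(y[t - 1])
    return Q

def check_loss(psi, y, alpha):
    """Sum of check-function losses rho_alpha(y_t - Q_t(alpha, psi))."""
    u = y - sav_quantiles(psi, y, alpha)
    return np.sum(u * (alpha - (u < 0)))

def fit_sav(y, alpha, psi0=(0.0, 0.8, -0.5)):
    """Minimize the check loss over psi (Nelder-Mead is our own choice)."""
    return minimize(check_loss, np.asarray(psi0), args=(y, alpha),
                    method="Nelder-Mead").x

rng = np.random.default_rng(0)
y = rng.standard_normal(500)
psi_hat = fit_sav(y, alpha=0.05)
Q_hat = sav_quantiles(psi_hat, y, 0.05)   # the ECQ sequence at alpha = 0.05
```

Since the check loss is non-smooth and the recursion makes it non-convex in 𝝍\bm{\psi}, trying several starting values is advisable in practice.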

Using the above CAViaR models, we can obtain different Q^t(α)\widehat{Q}_{t}(\alpha). However, at some quantile levels α\alpha, some of the models may be inadequate to specify the dynamic structure of Qt(α)Q_{t}(\alpha), resulting in invalid Q^t(α)\widehat{Q}_{t}(\alpha). To screen out those invalid Q^t(α)\widehat{Q}_{t}(\alpha) before computing the QCMs, we consider the in-sample dynamic quantile (DQ) test DQIS(α)\mbox{DQ}_{IS}(\alpha) in Section 6 of Engle and Manganelli (2004). The test DQIS(α)\mbox{DQ}_{IS}(\alpha) aims to detect the inadequacy of CAViaR models by examining whether 𝑿¯t(α)Hitt(α)\bar{\bm{X}}_{t}(\alpha)\mbox{Hit}_{t}(\alpha) has mean zero, where Hitt(α)I(yt<Qt(α))α\mbox{Hit}_{t}(\alpha)\equiv\mbox{I}(y_{t}<Q_{t}(\alpha))-\alpha and 𝑿¯t(α)(Hitt1(α),,Hitt4(α))\bar{\bm{X}}_{t}(\alpha)\equiv(\mbox{Hit}_{t-1}(\alpha),...,\mbox{Hit}_{t-4}(\alpha))^{\prime}. The testing idea of DQIS(α)\mbox{DQ}_{IS}(\alpha) relies on the fact that 𝑿¯t(α)Hitt(α)\bar{\bm{X}}_{t}(\alpha)\mbox{Hit}_{t}(\alpha) has mean zero when the CAViaR model specifies the dynamic structure of Qt(α)Q_{t}(\alpha) correctly. Based on the sequence {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} from a given CAViaR model, we can compute DQIS(α)\mbox{DQ}_{IS}(\alpha) and then its p-value P(ξ>DQIS(α))P(\xi>\mbox{DQ}_{IS}(\alpha)), where ξχ42\xi\sim\chi_{4}^{2} (a Chi-squared distribution with 44 degrees of freedom). If the p-value of DQIS(α)\mbox{DQ}_{IS}(\alpha) is less than pp^{*}, the corresponding ECQs {Q^t(α)}\{\widehat{Q}_{t}(\alpha)\} are deemed invalid, so they are excluded from the computation of the QCMs.
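The testing idea can be sketched as a quadratic form in the lagged Hit variables. Note that the exact DQIS(α)\mbox{DQ}_{IS}(\alpha) statistic of Engle and Manganelli (2004) involves a particular regressor matrix and normalization, so the version below is only a simplified illustration under our own assumptions.

```python
import numpy as np
from scipy.stats import chi2, norm

def dq_pvalue(y, Q, alpha, n_lags=4):
    """Simplified in-sample DQ test: project Hit_t(alpha) on its first
    n_lags lags and compare the normalized quadratic form with chi2(n_lags)."""
    hit = (y < Q).astype(float) - alpha               # Hit_t(alpha)
    H = hit[n_lags:]                                  # Hit_t
    X = np.column_stack([hit[n_lags - j:-j] for j in range(1, n_lags + 1)])
    proj = X @ np.linalg.solve(X.T @ X, X.T @ H)      # projection of H on lags
    stat = H @ proj / (alpha * (1.0 - alpha))
    return chi2.sf(stat, df=n_lags)

# under a correctly specified (here: constant, i.i.d.) quantile, the p-value
# should be roughly uniform, so rejections at small levels are rare
rng = np.random.default_rng(1)
y = rng.standard_normal(1000)
p = dq_pvalue(y, np.full(1000, norm.ppf(0.05)), alpha=0.05)
```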

Below, we summarize our aforementioned procedure to compute the QCMs:

Procedure 4.1. (The steps to compute h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t})
  1. 1.

    Obtain {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} at any quantile level α\alpha in 𝜶\bm{\alpha} based on any CAViaR model in 𝓜\mathcal{\bm{M}}, where 𝜶[0.01:0.01:0.99]\bm{\alpha}\equiv[0.01:0.01:0.99] is a sequence of real numbers from 0.010.01 to 0.990.99 incrementing by 0.010.01, and 𝓜{SAV,AS,IG,ADAP}\mathcal{\bm{M}}\equiv\{\mbox{SAV},\,\,\mbox{AS},\,\,\mbox{IG},\,\,\mbox{ADAP}\}.

  2. 2.

    Apply the DQ test DQIS(α)\mbox{DQ}_{IS}(\alpha) to each {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} from Step 1, and discard those {Q^t(α)}t=1T\{\widehat{Q}_{t}(\alpha)\}_{t=1}^{T} with p-values of DQIS(α)\mbox{DQ}_{IS}(\alpha) less than pp^{*}.

  3. 3.

    Group all remaining Q^t(α)\widehat{Q}_{t}(\alpha) to form a set 𝑺t\bm{S}_{t} at each given tt. Then, take the ii-th entry of 𝑺t\bm{S}_{t} to be Yt,iY_{t,i} in (2.6), and use its corresponding quantile level to compute 𝑿i\bm{X}_{i} in (2.6), where i=1,,n0i=1,...,n_{0}, and n0n_{0} is the size of 𝑺t\bm{S}_{t}.

  4. 4.

    Based on {Yt,i,𝑿i}i=1n0\{Y_{t,i},\bm{X}_{i}\}_{i=1}^{n_{0}} from Step 3, compute the OLS estimator 𝜽^t\widehat{\bm{\theta}}_{t} in (2.7) and then the three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} in (2.8).

In Procedure 4.1, the value of n0n_{0} decreases with that of pp^{*}, and it achieves the upper bound n=99×4n=99\times 4 when p=0p^{*}=0 (i.e., no ECQs are discarded). Clearly, the choice of pp^{*} reveals a trade-off between estimation reliability and estimation efficiency in the QCM method, since a large value of pp^{*} enhances the reliability of Q^t(α)\widehat{Q}_{t}(\alpha) but reduces the efficiency of 𝜽^t\widehat{\bm{\theta}}_{t} as the value of n0n_{0} becomes small. So far, how to choose pp^{*} optimally remains unclear. Our additional simulation results in the supplementary materials show that p=0.1p^{*}=0.1 is a good choice to balance the bias and variance of the estimation error of the QCMs. Hence, we recommend taking p=0.1p^{*}=0.1 in practice.
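For illustration, the OLS step (Steps 3–4) can be sketched as follows. We assume the textbook Cornish-Fisher regression form Qt(α)θ1+θ2zα+θ3(zα21)+θ4(zα33zα)Q_{t}(\alpha)\approx\theta_{1}+\theta_{2}z_{\alpha}+\theta_{3}(z_{\alpha}^{2}-1)+\theta_{4}(z_{\alpha}^{3}-3z_{\alpha}) with h=θ22h=\theta_{2}^{2}, s=6θ3/θ2s=6\theta_{3}/\theta_{2}, and k=24θ4/θ2+3k=24\theta_{4}/\theta_{2}+3; the exact regressors in (2.6) and recovery formulas in (2.8) may be parameterized differently.

```python
import numpy as np
from scipy.stats import norm

def qcm_ols(q_hat, alphas):
    """OLS of estimated conditional quantiles on Cornish-Fisher regressors,
    then recovery of the quantiled conditional variance, skewness, kurtosis."""
    z = norm.ppf(alphas)
    X = np.column_stack([np.ones_like(z), z, z**2 - 1.0, z**3 - 3.0 * z])
    t1, t2, t3, t4 = np.linalg.lstsq(X, q_hat, rcond=None)[0]
    return t2**2, 6.0 * t3 / t2, 24.0 * t4 / t2 + 3.0

# sanity check with exact N(0.2, 1.5^2) quantiles: the expansion is exact
# for the normal distribution, so (h, s, k) = (2.25, 0, 3) should be recovered
alphas = np.arange(0.01, 1.0, 0.01)                  # the grid used in Step 1
h, s, k = qcm_ols(0.2 + 1.5 * norm.ppf(alphas), alphas)
```

Note that no estimate of the conditional mean is needed: the intercept absorbs it, consistent with the regression-based feature of the QCM method.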

5 Simulations

This section examines the finite-sample performance of the three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t}. To save space, some additional simulation results on the selection of pp^{*} are reported in the supplementary materials.

5.1 Simulations on the GARCH model

We examine the performance of QCMs when the data are generated by the benchmark GARCH model. Specifically, we generate 100 replications of sample size T=1000T=1000 from the following GARCH model (Bollerslev, 1986)

yt=ηtσt and σt2=ω+αyt12+βσt12,y_{t}=\eta_{t}\sigma_{t}\mbox{ and }\sigma_{t}^{2}=\omega+\alpha y_{t-1}^{2}+\beta\sigma_{t-1}^{2}, (5.1)

where the parameters are chosen as ω=0.1\omega=0.1, α=0.1\alpha=0.1, and β=0.8\beta=0.8, and {ηt}t=1T\{\eta_{t}\}_{t=1}^{T} is an i.i.d. sequence with ηtN(0,1)\eta_{t}\sim N(0,1) and STνtST_{\nu_{t}}. Here, STνST_{\nu} is the standardized tνt_{\nu} distribution with mean zero and variance one, and νt\nu_{t} is generated from the Uniform distribution over the interval [5,20][5,20]. For each replication, we can easily see that when ηtN(0,1)\eta_{t}\sim N(0,1), the true values of CMs and conditional quantiles of yty_{t} in (5.1) are

μt=0,ht=σt2,st=0,kt=3,Qt(α)=σt×quantile of N(0,1) at level α;\mu_{t}=0,h_{t}=\sigma_{t}^{2},s_{t}=0,k_{t}=3,Q_{t}(\alpha)=\sigma_{t}\times\mbox{quantile of }N(0,1)\mbox{ at level }\alpha;

when ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, those of yty_{t} are

μt=0,ht=σt2,st=0,kt=6νt4+3,Qt(α)=σt×quantile of STνt at level α.\mu_{t}=0,h_{t}=\sigma_{t}^{2},s_{t}=0,k_{t}=\frac{6}{\nu_{t}-4}+3,Q_{t}(\alpha)=\sigma_{t}\times\mbox{quantile of }ST_{\nu_{t}}\mbox{ at level }\alpha.

Note that the expansion errors εt,i\varepsilon_{t,i}^{*} from the Cornish-Fisher expansion in regression model (2.6) are negligible when the conditional distribution of yty_{t} is normal in the case of ηtN(0,1)\eta_{t}\sim N(0,1), whereas εt,i\varepsilon_{t,i}^{*} are non-negligible when the conditional distribution of yty_{t} is heavy-tailed in the case of ηtSTνt\eta_{t}\sim ST_{\nu_{t}}; see Lee and Lin (1992).
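For concreteness, one replication of this design with ηtSTνt\eta_{t}\sim ST_{\nu_{t}} can be generated as follows; initializing σ12\sigma_{1}^{2} at the unconditional variance is our own choice.

```python
import numpy as np
from scipy.stats import t as student_t

def simulate_garch_st(T, omega=0.1, a=0.1, b=0.8, seed=0):
    """Simulate y_t = eta_t * sigma_t from (5.1) with standardized t
    innovations whose degrees of freedom nu_t are drawn from U[5, 20]."""
    rng = np.random.default_rng(seed)
    nu = rng.uniform(5.0, 20.0, size=T)
    # standardize: scale a t_nu draw by sqrt((nu - 2) / nu) to get variance 1
    eta = student_t.rvs(nu, random_state=rng) * np.sqrt((nu - 2.0) / nu)
    sigma2, y = np.empty(T), np.empty(T)
    sigma2[0] = omega / (1.0 - a - b)                 # unconditional variance
    y[0] = eta[0] * np.sqrt(sigma2[0])
    for t in range(1, T):
        sigma2[t] = omega + a * y[t - 1] ** 2 + b * sigma2[t - 1]
        y[t] = eta[t] * np.sqrt(sigma2[t])
    return y, sigma2, nu

def true_quantile(sigma2_t, nu_t, a):
    """Q_t(a) = sigma_t * (quantile of ST_nu at level a)."""
    return np.sqrt(sigma2_t) * student_t.ppf(a, nu_t) * np.sqrt((nu_t - 2.0) / nu_t)

y, sigma2, nu = simulate_garch_st(1000)
k_true = 6.0 / (nu - 4.0) + 3.0                       # true conditional kurtosis
```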

Next, we generate the sequence {Q^t(αi)}\{\widehat{Q}_{t}(\alpha_{i})\} at each tt to compute h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} in four different cases:

Case 1 [No Error]:Q^t(αi)=Qt(αi);\displaystyle\mbox{Case 1 [No Error]:}\,\,\widehat{Q}_{t}(\alpha_{i})=Q_{t}(\alpha_{i}); (5.2)
Case 2 [Error I]:Q^t(αi)=Qt(αi)+εt,i with εt,iN(0,σ2(αi));\displaystyle\mbox{Case 2 [Error I]:}\,\,\,\,\,\,\,\widehat{Q}_{t}(\alpha_{i})=Q_{t}(\alpha_{i})+\varepsilon_{t,i}^{\circ}\mbox{ with }\varepsilon_{t,i}^{\circ}\sim N(0,\sigma^{2}(\alpha_{i}));
Case 3 [Error II]:Q^t(αi)=Qt(αi)+εt,i with εt,iN(μ(αi),σ2(αi));\displaystyle\mbox{Case 3 [Error II]:}\,\,\,\,\,\widehat{Q}_{t}(\alpha_{i})=Q_{t}(\alpha_{i})+\varepsilon_{t,i}^{\circ}\mbox{ with }\varepsilon_{t,i}^{\circ}\sim N(\mu(\alpha_{i}),\sigma^{2}(\alpha_{i}));
Case 4 [CAViaR]:Q^t(αi) is the entry of 𝑺t,\displaystyle\mbox{Case 4 [CAViaR]:}\,\,\,\,\widehat{Q}_{t}(\alpha_{i})\mbox{ is the entry of }\bm{S}_{t},

where αi𝜶\alpha_{i}\in\bm{\alpha} in Cases 1–3, 𝜶\bm{\alpha} and 𝑺t\bm{S}_{t} are defined as in Procedure 4.1, Qt(α)Q_{t}(\alpha) is the true conditional quantile of yty_{t} at level α\alpha, σ2(α)=0.5σ2+|α0.5|σ2\sigma^{2}(\alpha)=0.5\sigma^{2}+|\alpha-0.5|\sigma^{2} with σ2=0.2\sigma^{2}=0.2, and μ(α)=exp(200α)I(α<0.5)+exp((2200α))I(α0.5)\mu(\alpha)=\exp(-200\alpha)\mbox{I}(\alpha<0.5)+\exp(-(2-200\alpha))\mbox{I}(\alpha\geq 0.5). Under the setting in Case 1, there are no approximation errors εt,i\varepsilon_{t,i}^{\circ} in regression model (2.6). Under the settings in Cases 2 and 3, the approximation errors εt,i\varepsilon_{t,i}^{\circ} have different variances across αi\alpha_{i}, with zero means (Case 2) or non-zero means (Case 3), mimicking the scenarios that Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) is an unbiased or biased estimator of Qt(αi)Q_{t}(\alpha_{i}), respectively. Under the setting in Case 4, Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) are computed by the CAViaR models, and this case mimics the real application scenario in which the dynamic structures of Qt(αi)Q_{t}(\alpha_{i}) are unknown and modelled by the CAViaR models.

Using the values of {Q^t(αi)}\{\widehat{Q}_{t}(\alpha_{i})\} generated by (5.2), we compute three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} for each replication, and then we measure their precision at each tt by considering

Δh,t=h^tht,Δs,t=s^tst,Δk,t=k^tkt,\displaystyle\Delta_{h,t}=\widehat{h}_{t}-h_{t},\,\,\,\Delta_{s,t}=\widehat{s}_{t}-s_{t},\,\,\,\Delta_{k,t}=\widehat{k}_{t}-k_{t}, (5.3)

where hth_{t}, sts_{t}, and ktk_{t} are the true values of the three CMs of yty_{t}. Based on the results of 100 replications, Figs 1 and 2 exhibit the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} for t=1,,10t=1,...,10 under the four different cases in (5.2), with ηtN(0,1)\eta_{t}\sim N(0,1) and STνtST_{\nu_{t}}, respectively. The corresponding boxplots for t11t\geq 11 are similar and are thus not reported here for better visibility. From these two figures, our findings are as follows:

  1. 1.

    When there are no approximation errors (Case 1), h^t\widehat{h}_{t} and s^t\widehat{s}_{t} are very accurate estimators of hth_{t} and sts_{t}, regardless of the negligibility of the expansion errors. However, k^t\widehat{k}_{t} exhibits a large dispersion when the expansion errors are non-negligible in the case of ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, even though it has a very small dispersion when the expansion errors are negligible in the case of ηtN(0,1)\eta_{t}\sim N(0,1). This indicates that the expansion errors typically have a minor effect on h^t\widehat{h}_{t} and s^t\widehat{s}_{t}, while their impact on k^t\widehat{k}_{t} can be more pronounced.

  2. 2.

    When there are approximation errors with zero means (Case 2) or nonzero means (Case 3), the median lines in the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} are generally close to zero, irrespective of the negligibility of the expansion errors. These results suggest that the three QCMs remain consistent, even when Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) are biased estimators of Qt(αi)Q_{t}(\alpha_{i}) and the expansion errors are present. Compared with the results in the case of ηtN(0,1)\eta_{t}\sim N(0,1), the dispersion of all QCMs becomes larger, as expected, in the case of ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, and this phenomenon is more evident for k^t\widehat{k}_{t}.

  3. 3.

    When Q^t(αi)\widehat{Q}_{t}(\alpha_{i}) are estimated by the CAViaR method (Case 4), the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} show that the three QCMs are consistent across all considered situations. Surprisingly, when ηtSTνt\eta_{t}\sim ST_{\nu_{t}}, the performance of h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} in Case 4 is even better than that in Cases 2 and 3. This is probably because the approximation errors and expansion errors partially cancel out in Case 4, leading to smaller gross errors in regression model (2.6) and hence more accurate QCMs. Therefore, the QCM method can effectively accommodate the co-existence of approximation errors and expansion errors, which is frequently encountered in real data analysis.

Refer to caption
Figure 1: The boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} for t=1,,10t=1,...,10 under the four different cases in (5.2), where the data are generated from the standard GARCH model in (5.1) with ηtN(0,1)\eta_{t}\sim N(0,1). In each boxplot, the lines from top to bottom represent the maximum, third quartile, median, first quartile, and minimum of the data, and the outliers are plotted individually using the ‘o’ marker symbol.
Refer to caption
Figure 2: As for Fig 1, where the data are generated from the standard GARCH model in (5.1) with ηtSTνt\eta_{t}\sim ST_{\nu_{t}}.

5.2 Simulation on the ARMA–MN–GARCH model

Let MN(λ1,λ2,τ1,τ2,σ12,σ22)MN(\lambda_{1},\lambda_{2},\tau_{1},\tau_{2},\sigma_{1}^{2},\sigma_{2}^{2}) denote a mixed normal (MN) distribution, the density of which is a mixture of two normal densities of N(τ1,σ12)N(\tau_{1},\sigma_{1}^{2}) and N(τ2,σ22)N(\tau_{2},\sigma_{2}^{2}) with the weighting probabilities λ1\lambda_{1} and λ2\lambda_{2}, respectively, where λi(0,1)\lambda_{i}\in(0,1), i=1,2i=1,2, and λ1+λ2=1\lambda_{1}+\lambda_{2}=1. To examine the performance of QCMs in the presence of conditional mean specification and time-varying CMs, we generate 100 replications of sample size T=1000T=1000 from the following ARMA–MN–GARCH model (Haas et al., 2004)

yt=a0+a1yt1+ϵt+b1ϵt1,y_{t}=a_{0}+a_{1}y_{t-1}+\epsilon_{t}+b_{1}\epsilon_{t-1}, (5.4)

where ϵtMN(λ1,λ2,τ1,τ2,σ1,t2,σ2,t2)\epsilon_{t}\sim MN(\lambda_{1},\lambda_{2},\tau_{1},\tau_{2},\sigma_{1,t}^{2},\sigma_{2,t}^{2}) with τ2=(λ1/λ2)τ1\tau_{2}=-(\lambda_{1}/\lambda_{2})\tau_{1}, σ1,t2=c10+c11ϵt12+c12σ1,t12\sigma_{1,t}^{2}=c_{10}+c_{11}\epsilon_{t-1}^{2}+c_{12}\sigma_{1,t-1}^{2}, and σ2,t2=c20+c21ϵt12+c22σ2,t12\sigma_{2,t}^{2}=c_{20}+c_{21}\epsilon_{t-1}^{2}+c_{22}\sigma_{2,t-1}^{2}, and the parameters are chosen as λ1=0.2\lambda_{1}=0.2, τ1=0.4\tau_{1}=0.4, a0=0.5a_{0}=0.5, a1=0.4a_{1}=0.4, b1=0.3b_{1}=-0.3, c10=0.1c_{10}=0.1, c20=0.3c_{20}=0.3, c11=0.05c_{11}=0.05, c21=0.1c_{21}=0.1, c12=0.85c_{12}=0.85, and c22=0.8c_{22}=0.8. For each replication, we compute the true values of CMs and conditional quantiles of yty_{t} in model (5.4) as follows:

μt\displaystyle\mu_{t} =a0+a1yt1+b1ϵt1,\displaystyle=a_{0}+a_{1}y_{t-1}+b_{1}\epsilon_{t-1},
ht\displaystyle h_{t} =λ1(τ12+σ1,t2)+λ2(τ22+σ2,t2)(λ1τ1+λ2τ2)2,\displaystyle=\lambda_{1}(\tau_{1}^{2}+\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{2}+\sigma_{2,t}^{2})-(\lambda_{1}\tau_{1}+\lambda_{2}\tau_{2})^{2},
st\displaystyle s_{t} =λ1(τ13+3τ1σ1,t2)+λ2(τ23+3τ2σ2,t2)[λ1(τ12+σ1,t2)+λ2(τ22+σ2,t2)]3/2,\displaystyle=\frac{\lambda_{1}(\tau_{1}^{3}+3\tau_{1}\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{3}+3\tau_{2}\sigma_{2,t}^{2})}{[\lambda_{1}(\tau_{1}^{2}+\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{2}+\sigma_{2,t}^{2})]^{3/2}},
kt\displaystyle k_{t} =λ1(τ14+6τ12σ1,t2+3σ1,t4)+λ2(τ24+6τ22σ2,t2+3σ2,t4)[λ1(τ12+σ1,t2)+λ2(τ22+σ2,t2)]2,\displaystyle=\frac{\lambda_{1}(\tau_{1}^{4}+6\tau_{1}^{2}\sigma_{1,t}^{2}+3\sigma_{1,t}^{4})+\lambda_{2}(\tau_{2}^{4}+6\tau_{2}^{2}\sigma_{2,t}^{2}+3\sigma_{2,t}^{4})}{[\lambda_{1}(\tau_{1}^{2}+\sigma_{1,t}^{2})+\lambda_{2}(\tau_{2}^{2}+\sigma_{2,t}^{2})]^{2}},

and

Qt(α)=μt+Qtϵ(α),Q_{t}(\alpha)=\mu_{t}+Q_{t}^{\epsilon}(\alpha),

where Qtϵ(α)Q_{t}^{\epsilon}(\alpha) satisfies λ1Φ(Qtϵ(α);τ1,σ1,t2)+λ2Φ(Qtϵ(α);τ2,σ2,t2)=α\lambda_{1}\Phi(Q_{t}^{\epsilon}(\alpha);\tau_{1},\sigma_{1,t}^{2})+\lambda_{2}\Phi(Q_{t}^{\epsilon}(\alpha);\tau_{2},\sigma_{2,t}^{2})=\alpha, and Φ(x;τ,σ2)\Phi(x;\tau,\sigma^{2}) represents the normal distribution function with mean τ\tau and variance σ2\sigma^{2}.
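Since Qtϵ(α)Q_{t}^{\epsilon}(\alpha) has no closed form, it can be computed by one-dimensional root finding; the sketch below uses Brent's method with a bracketing interval that is our own assumption.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def mn_quantile(a, lam1, tau1, s1, s2):
    """Solve lam1*Phi(q; tau1, s1^2) + lam2*Phi(q; tau2, s2^2) = a for q,
    with lam2 = 1 - lam1 and tau2 = -(lam1/lam2)*tau1 (zero-mean mixture)."""
    lam2 = 1.0 - lam1
    tau2 = -(lam1 / lam2) * tau1
    f = lambda q: lam1 * norm.cdf(q, tau1, s1) + lam2 * norm.cdf(q, tau2, s2) - a
    return brentq(f, -50.0, 50.0)       # bracket assumed wide enough

q_med = mn_quantile(0.5, lam1=0.2, tau1=0.4, s1=1.0, s2=1.0)
```

Since the mixture CDF is strictly increasing, the root is unique whenever the bracket contains it.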

Based on the results of 100 replications, Fig 3 exhibits the boxplots of Δh,t\Delta_{h,t}, Δs,t\Delta_{s,t}, and Δk,t\Delta_{k,t} for t=1,,10t=1,...,10 under the four different cases in (5.2). From this figure, we reach conclusions similar to those in Fig 1. Hence, although yty_{t} has a heavier tail than normal under model (5.4), the presence of a conditional mean specification does not affect the performance of the QCMs.

Refer to caption
Figure 3: As for Fig 1, where the data are generated from the ARMA-MN-GARCH model in (5.4).

6 A Real Application

In our empirical work, we consider the log-return (in percentage) series of three exchange rates: the AUD to USD (AUD/USD), NZD to USD (NZD/USD), and CAD to USD (CAD/USD). We denote each log-return series by {y1,,yT}\{y_{1},...,y_{T}\}, computed over the period from January 1, 2009 to April 20, 2023. See Table 1 for some basic descriptive statistics of the three log-return series. Below, we compute the three QCMs of each log-return series, and then use these QCMs to study the “news impact curve” (NIC).

Table 1: Descriptive statistics for three return series.
          AUD/USD           NZD/USD           CAD/USD
          Sample size           3730           3730           3730
          Sample mean           -0.0012           0.0015           -0.0027
          Sample variance           0.4374           0.4612           0.2056
          Sample skewness           -0.4160           -0.4485           -0.0533
          Sample kurtosis           7.0748           7.1818           5.3841

6.1 The three QCMs of return series

Following the steps in Procedure 4.1, we compute the three QCMs h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} of each log-return series, and report their basic descriptive statistics in Table 2, where the constraint (2.11) holds for all of the computed QCMs. From Tables 1 and 2, we find that for each return series, the mean of h^t\widehat{h}_{t} (or s^t\widehat{s}_{t}) is close to the corresponding sample variance (or skewness), whereas the mean of k^t\widehat{k}_{t} is much smaller than the corresponding sample kurtosis. These findings are expected, since extreme returns can affect the sample kurtosis for a prolonged period of time, but their impact on k^t\widehat{k}_{t} decays exponentially over time.

Table 2: Descriptive statistics for the three QCMs of three return series.
         AUD/USD          NZD/USD          CAD/USD
         h^t\widehat{h}_{t}          Mean          0.5400          0.5821          0.2951
         Maximum          3.0661          2.7224          1.2098
         Minimum          0.1380          0.1537          0.0784
         Ljung-Box          0.0000          0.0000          0.0000
         s^t\widehat{s}_{t}          Mean          -0.2051          -0.1274          -0.1149
         Maximum          0.0463          0.1251          0.1230
         Minimum          -0.5176          -0.4867          -0.3951
         Ljung-Box          0.0000          0.0000          0.0000
         k^t\widehat{k}_{t}          Mean          3.7057          3.5129          3.6003
         Maximum          5.0147          4.7127          4.6095
         Minimum          2.8841          2.6247          2.8776
         Ljung-Box          0.0000          0.0000          0.0000
  • \dagger The results are the p-values of the Ljung-Box test (Ljung and Box, 1978).

Next, we check the validity of the QCMs via a method similar to that in Gu et al. (2020). Denote ah=E(eth)a^{h}=E(e_{t}^{h}), as=E(ets)a^{s}=E(e_{t}^{s}), and ak=E(etk)a^{k}=E(e_{t}^{k}), where eth=(ytμt)2hte_{t}^{h}=(y_{t}-\mu_{t})^{2}-h_{t}, ets=[(ytμt)/ht]3ste_{t}^{s}=[(y_{t}-\mu_{t})/\sqrt{h_{t}}]^{3}-s_{t}, and etk=[(ytμt)/ht]4kte_{t}^{k}=[(y_{t}-\mu_{t})/\sqrt{h_{t}}]^{4}-k_{t}. Based on the estimates e^th=(ytμ^t)2h^t\widehat{e}_{t}^{h}=(y_{t}-\widehat{\mu}_{t})^{2}-\widehat{h}_{t}, e^ts=[(ytμ^t)/h^t]3s^t\widehat{e}_{t}^{s}=[(y_{t}-\widehat{\mu}_{t})/\sqrt{\widehat{h}_{t}}]^{3}-\widehat{s}_{t}, and e^tk=[(ytμ^t)/h^t]4k^t\widehat{e}_{t}^{k}=[(y_{t}-\widehat{\mu}_{t})/\sqrt{\widehat{h}_{t}}]^{4}-\widehat{k}_{t}, we utilize Student’s t-tests 𝕋h\mathbb{T}^{h}, 𝕋s\mathbb{T}^{s}, and 𝕋k\mathbb{T}^{k} to test the null hypotheses h\mathbb{H}^{h}: ah=0a^{h}=0, s\mathbb{H}^{s}: as=0a^{s}=0, and k\mathbb{H}^{k}: ak=0a^{k}=0, respectively. Here, μ^t\widehat{\mu}_{t} is the estimate of the conditional mean, computed based on the mean specifications in Section 6.2 below. If h\mathbb{H}^{h} is not rejected by 𝕋h\mathbb{T}^{h} at the significance level α\alpha^{*}, then it is reasonable to conclude that h^t\widehat{h}_{t} is valid. Similarly, the validity of s^t\widehat{s}_{t} and k^t\widehat{k}_{t} can be examined by using 𝕋s\mathbb{T}^{s} and 𝕋k\mathbb{T}^{k}. Table 3 reports the p-values of 𝕋h\mathbb{T}^{h}, 𝕋s\mathbb{T}^{s}, and 𝕋k\mathbb{T}^{k} for all three exchange rates, and the results imply that all QCMs are valid at the significance level 5%5\%.

Table 3: The p-values of 𝕋h\mathbb{T}^{h}, 𝕋s\mathbb{T}^{s}, and 𝕋k\mathbb{T}^{k} for checking the validity of QCMs.
            AUD/USD             NZD/USD             CAD/USD
            𝕋h\mathbb{T}^{h}             0.4543             0.6207             0.1306
            𝕋s\mathbb{T}^{s}             0.3250             0.2634             0.3901
            𝕋k\mathbb{T}^{k}             0.8321             0.3900             0.6425
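A minimal sketch of this validity check is given below; for simplicity, it uses plain one-sample t-tests that ignore serial correlation in the moment residuals, which is a simplification of the procedure above.

```python
import numpy as np
from scipy.stats import ttest_1samp

def validity_pvalues(y, mu, h, s, k):
    """p-values of t-tests for H^h: a^h = 0, H^s: a^s = 0, and H^k: a^k = 0,
    based on the moment residuals e^h, e^s, and e^k."""
    z = (y - mu) / np.sqrt(h)
    e_h = (y - mu) ** 2 - h
    e_s = z**3 - s
    e_k = z**4 - k
    return tuple(ttest_1samp(e, 0.0).pvalue for e in (e_h, e_s, e_k))

# under correctly specified moments, none of the nulls should be rejected often
rng = np.random.default_rng(0)
y_sim = rng.standard_normal(2000)
p_h, p_s, p_k = validity_pvalues(y_sim, mu=0.0, h=1.0, s=0.0, k=3.0)
```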

After checking the validity of the QCMs, we further plot the QCMs of all three return series during a sub-period from January 1, 2020 to May 1, 2020 in Fig 4. This sub-period deserves a detailed study, since it covers the 2020 stock market crash caused by the COVID-19 pandemic. For ease of visualization, the plots of all computed QCMs over the entire examined period are not given here but are available upon request. From Fig 4, we have the following interesting findings:

  1. 1.

    Starting from March 9, there is a rapidly rising trend for both h^t\widehat{h}_{t} and k^t\widehat{k}_{t} and an apparent declining trend for s^t\widehat{s}_{t} in all examined exchange rates. These trends demonstrate that the volatility risk kept rising sharply in currency markets, and simultaneously, the tail risk of extremely negative returns kept increasing substantially, making things even worse. This phenomenon is not surprising, given the global impact of the COVID-19 pandemic outbreak in early 2020, which led to the subsequent depreciation of exchange rates across almost all countries.

  2. 2.

    After March 19 or March 20, all of h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} reversed their trends. This is very likely because the Federal Reserve and many central banks announced their financial rescue plans on March 17. Hence, the trend reversal sheds light on the effectiveness of the issued financial plans in rescuing financial markets.

Overall, we find that the COVID-19 pandemic had a perilous impact on the three examined exchange rates, and the financial rescue plans were effective in reducing the values of conditional variance and kurtosis while increasing the value of conditional skewness.

Refer to caption
Figure 4: The plots of h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, and k^t\widehat{k}_{t} of three return series from January 1, 2020 to May 1, 2020.

6.2 The study on NICs

The NIC initiated by Engle and Ng (1993) aims to study how past shocks (or news) {ϵi}it1\{\epsilon_{i}\}_{i\leq t-1} affect the present conditional variance hth_{t} by assuming

ht=θhht1+gh(ϵt1),h_{t}=\theta_{h}h_{t-1}+g_{h}(\epsilon_{t-1}), (6.1)

where ϵtytμt\epsilon_{t}\equiv y_{t}-\mu_{t} is the collective shock at tt, θh(0,1)\theta_{h}\in(0,1) is an unknown parameter to measure the persistence of hth_{t}, and gh()g_{h}(\cdot) is the NIC function for hth_{t} that has a specific parametric form. For example, researchers commonly assume that

gh(x)\displaystyle g_{h}(x) =ϑh,0+ϑh,1x2,\displaystyle=\vartheta_{h,0}+\vartheta_{h,1}x^{2}, (6.2)
gh(x)\displaystyle g_{h}(x) =ϑh,0+ϑh,1x2+ϑh,2x2I(x<0),\displaystyle=\vartheta_{h,0}+\vartheta_{h,1}x^{2}+\vartheta_{h,2}x^{2}\mbox{I}(x<0), (6.3)

where the specifications of gh()g_{h}(\cdot) in (6.2) and (6.3) lead to the standard GARCH (Bollerslev, 1986) and GJR-GARCH (Glosten et al., 1993) models for hth_{t}, respectively. Similar to the NIC for hth_{t} in (6.1), we can follow the ideas of Harvey and Siddique (1999) and León et al. (2005) to consider the NICs for sts_{t} and ktk_{t}:

st=θsst1+gs(ϱt1),\displaystyle s_{t}=\theta_{s}s_{t-1}+g_{s}(\varrho_{t-1}), (6.4)
kt=θkkt1+gk(ϱt1),\displaystyle k_{t}=\theta_{k}k_{t-1}+g_{k}(\varrho_{t-1}), (6.5)

where ϱtϵt/ht\varrho_{t}\equiv\epsilon_{t}/\sqrt{h_{t}} is the re-scaled collective shock at tt, θs(1,1)\theta_{s}\in(-1,1) and θk(0,1)\theta_{k}\in(0,1) are two unknown parameters to measure the persistence of sts_{t} and ktk_{t}, respectively, and gs()g_{s}(\cdot) and gk()g_{k}(\cdot) are the NIC functions for sts_{t} and ktk_{t}, respectively. As for gh()g_{h}(\cdot), gs()g_{s}(\cdot) and gk()g_{k}(\cdot) are often assumed to have certain parametric forms, such as

gs(x)\displaystyle g_{s}(x) =ϑs,0+ϑs,1x3,\displaystyle=\vartheta_{s,0}+\vartheta_{s,1}x^{3}, (6.6)
gk(x)\displaystyle g_{k}(x) =ϑk,0+ϑk,1x4;\displaystyle=\vartheta_{k,0}+\vartheta_{k,1}x^{4}; (6.7)

see, for example, Harvey and Siddique (1999) and León et al. (2005). Since hth_{t}, sts_{t}, ktk_{t}, ϵt\epsilon_{t}, and ϱt\varrho_{t} are generally unobserved, all of the unknown parameters in (6.1) and (6.4)–(6.5) have to be estimated by specifying some parametric models on yty_{t} that account for the conditional variance, skewness, and kurtosis simultaneously.

However, so far the parametric forms of gh()g_{h}(\cdot), gs()g_{s}(\cdot), and gk()g_{k}(\cdot) are chosen in an ad-hoc rather than a data-driven manner. Intuitively, if gh()g_{h}(\cdot), gs()g_{s}(\cdot), and gk()g_{k}(\cdot) can be estimated non-parametrically, we are able to get some useful information on their parametric forms. Motivated by this idea, we replace hth_{t}, sts_{t}, ktk_{t}, ϵt1\epsilon_{t-1}, and ϱt1\varrho_{t-1} in (6.1) and (6.4)–(6.5) with h^t\widehat{h}_{t}, s^t\widehat{s}_{t}, k^t\widehat{k}_{t}, ϵ^t1\widehat{\epsilon}_{t-1}, and ϱ^t1\widehat{\varrho}_{t-1}, respectively, where ϵ^t=ytμ^t\widehat{\epsilon}_{t}=y_{t}-\widehat{\mu}_{t} and ϱ^t=ϵ^t/h^t\widehat{\varrho}_{t}=\widehat{\epsilon}_{t}/\sqrt{\widehat{h}_{t}} with μ^t\widehat{\mu}_{t} being an estimator of μt\mu_{t}. After this replacement, we can get the following models:

h^t\displaystyle\widehat{h}_{t} =θhh^t1+gh(ϵ^t1)+ςh,t,\displaystyle=\theta_{h}\widehat{h}_{t-1}+g_{h}(\widehat{\epsilon}_{t-1})+\varsigma_{h,t}, (6.8)
s^t\displaystyle\widehat{s}_{t} =θss^t1+gs(ϱ^t1)+ςs,t,\displaystyle=\theta_{s}\widehat{s}_{t-1}+g_{s}(\widehat{\varrho}_{t-1})+\varsigma_{s,t}, (6.9)
k^t\displaystyle\widehat{k}_{t} =θkk^t1+gk(ϱ^t1)+ςk,t,\displaystyle=\theta_{k}\widehat{k}_{t-1}+g_{k}(\widehat{\varrho}_{t-1})+\varsigma_{k,t}, (6.10)

where ςh,t\varsigma_{h,t}, ςs,t\varsigma_{s,t}, and ςk,t\varsigma_{k,t} are model errors caused by the replacement. Since the QCMs are most likely consistent estimators of CMs, all model errors are expected to have desirable properties for valid model estimations when the parametric form of μt\mu_{t} is correctly specified.

To obtain a correct specification of μt\mu_{t}, we apply two Cramér-von Mises tests DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} in Escanciano (2006) to check whether the assumed form of μt\mu_{t} is correctly specified, where the p-values of DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} are computed via the bootstrap method in Escanciano (2006). Since strong autocorrelations are detected in the three return series, we adopt an order pp threshold autoregressive (TAR(pp)) model (Tong, 1978), with the threshold variable set to zero and the delay set to one, to fit these three return series. After removing insignificant parameters, the AUD/USD, NZD/USD, and CAD/USD exchange rates are fitted by the TAR(5), TAR(9), and TAR(6) models, respectively. The p-values of DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} in Table 4 indicate that these TAR models are correctly specified for the three return series at the significance level 5%.

Table 4: The p-values of DT,I2D_{T,I}^{2} and DT,C2D_{T,C}^{2} for checking the conditional mean specification.
            AUD/USD             NZD/USD             CAD/USD
            DT,I2D_{T,I}^{2}             0.7700             0.5400             0.4200
            DT,C2D_{T,C}^{2}             0.7200             0.4500             0.4100

After estimating the chosen specifications of μt\mu_{t} above by the least squares method, we are able to obtain ϵ^t1\widehat{\epsilon}_{t-1} and ϱ^t1\widehat{\varrho}_{t-1}. Define Kb()=K(/b)/bK_{b}(\cdot)=K(\cdot/b)/b, where K()K(\cdot) is the Gaussian kernel function and bb is the bandwidth. Then, based on the sample sequence {(h^t,h^t1,ϵ^t1)}t=2T\{(\widehat{h}_{t},\widehat{h}_{t-1},\widehat{\epsilon}_{t-1})\}_{t=2}^{T}, we use the method in Robinson (1988) to estimate θh\theta_{h} by

θ^h={t=2T[h^t1ϕ1(ϵ^t1)]2}1{t=2T[h^t1ϕ1(ϵ^t1)][h^tϕ2(ϵ^t1)]},\widehat{\theta}_{h}=\Big{\{}\sum_{t=2}^{T}[\widehat{h}_{t-1}-\phi_{1}(\widehat{\epsilon}_{t-1})]^{2}\Big{\}}^{-1}\Big{\{}\sum_{t=2}^{T}[\widehat{h}_{t-1}-\phi_{1}(\widehat{\epsilon}_{t-1})][\widehat{h}_{t}-\phi_{2}(\widehat{\epsilon}_{t-1})]\Big{\}},

where

ϕ1()=s=2TKb1(ϵ^s1)h^s1s=2TKb1(ϵ^s1) and ϕ2()=s=2TKb2(ϵ^s1)h^ss=2TKb2(ϵ^s1),\phi_{1}(\cdot)=\frac{\sum_{s=2}^{T}K_{b_{1}}(\cdot-\widehat{\epsilon}_{s-1})\widehat{h}_{s-1}}{\sum_{s=2}^{T}K_{b_{1}}(\cdot-\widehat{\epsilon}_{s-1})}\mbox{ and }\phi_{2}(\cdot)=\frac{\sum_{s=2}^{T}K_{b_{2}}(\cdot-\widehat{\epsilon}_{s-1})\widehat{h}_{s}}{\sum_{s=2}^{T}K_{b_{2}}(\cdot-\widehat{\epsilon}_{s-1})},

and the values of b1b_{1} and b2b_{2} are chosen by the conventional cross-validation method. Next, we estimate gh()g_{h}(\cdot) non-parametrically by g^h()=s=2TKb3(ϵ^s1)Rh,s/s=2TKb3(ϵ^s1)\widehat{g}_{h}(\cdot)=\sum_{s=2}^{T}K_{b_{3}}(\cdot-\widehat{\epsilon}_{s-1})R_{h,s}/\sum_{s=2}^{T}K_{b_{3}}(\cdot-\widehat{\epsilon}_{s-1}), where Rh,t=h^tθ^hh^t1R_{h,t}=\widehat{h}_{t}-\widehat{\theta}_{h}\widehat{h}_{t-1}, and the value of b3b_{3} is chosen by the cross-validation method. Similarly, based on the sample sequences {(s^t,s^t1,ϱ^t1)}t=2T\{(\widehat{s}_{t},\widehat{s}_{t-1},\widehat{\varrho}_{t-1})\}_{t=2}^{T} and {(k^t,k^t1,ϱ^t1)}t=2T\{(\widehat{k}_{t},\widehat{k}_{t-1},\widehat{\varrho}_{t-1})\}_{t=2}^{T}, we estimate gs()g_{s}(\cdot) and gk()g_{k}(\cdot) non-parametrically by g^s()\widehat{g}_{s}(\cdot) and g^k()\widehat{g}_{k}(\cdot), respectively.
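A minimal sketch of this estimation step is given below, using fixed hand-picked bandwidths in place of cross-validation and a toy data-generating process with θh=0.9\theta_{h}=0.9 for checking; these choices are our own.

```python
import numpy as np

def nw_smooth(x0, x, y, b):
    """Nadaraya-Watson estimate of E[y | x] at the points x0 with a Gaussian
    kernel (the 1/b normalization cancels in the weight ratio)."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / b) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def robinson_theta(h, eps, b1, b2):
    """theta_hat = sum[(h_{t-1}-phi1)(h_t-phi2)] / sum[(h_{t-1}-phi1)^2],
    with phi1, phi2 the kernel regressions defined in the text."""
    h_lead, h_lag, e_lag = h[1:], h[:-1], eps[:-1]
    phi1 = nw_smooth(e_lag, e_lag, h_lag, b1)    # E[h_{t-1} | eps_{t-1}]
    phi2 = nw_smooth(e_lag, e_lag, h_lead, b2)   # E[h_t | eps_{t-1}]
    u = h_lag - phi1
    return np.sum(u * (h_lead - phi2)) / np.sum(u ** 2)

# toy check: h_t = 0.9*h_{t-1} + eps_{t-1}^2 + small noise, so theta_h = 0.9
rng = np.random.default_rng(0)
T = 1500
eps = rng.standard_normal(T)
h = np.empty(T)
h[0] = 10.0                                      # stationary mean E[g]/(1-theta)
for t in range(1, T):
    h[t] = 0.9 * h[t - 1] + eps[t - 1] ** 2 + 0.1 * rng.standard_normal()
theta_hat = robinson_theta(h, eps, b1=0.3, b2=0.3)
```

The same nw_smooth routine also yields g^h()\widehat{g}_{h}(\cdot) by smoothing Rh,t=h^tθ^hh^t1R_{h,t}=\widehat{h}_{t}-\widehat{\theta}_{h}\widehat{h}_{t-1} on ϵ^t1\widehat{\epsilon}_{t-1}.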

Fig 5 plots the non-parametric fitted models g^h()\widehat{g}_{h}(\cdot), g^s()\widehat{g}_{s}(\cdot), and g^k()\widehat{g}_{k}(\cdot) for all three return series. From this figure, we find that the form of gh()g_{h}(\cdot) in (6.2) or (6.3) matches g^h()\widehat{g}_{h}(\cdot) quite well, whereas the forms of gs()g_{s}(\cdot) in (6.6) and gk()g_{k}(\cdot) in (6.7) exhibit a large deviation from g^s()\widehat{g}_{s}(\cdot) and g^k()\widehat{g}_{k}(\cdot), respectively, in all three cases. The same conclusion can be reached in view of the results of adjusted R2 for all fitted models in Table 5.

Refer to caption
Figure 5: The plots of all fitted NICs for hth_{t}, sts_{t}, and ktk_{t}. Left panels: the non-parametric g^h()\widehat{g}_{h}(\cdot) (dashed lines); the parametric gh()g_{h}(\cdot) in (6.2) (dotted lines) and (6.3) (solid lines). Middle panels: the non-parametric g^s()\widehat{g}_{s}(\cdot) (dashed lines); the parametric gs()g_{s}(\cdot) in (6.6) (dotted lines). Right panels: the non-parametric g^k()\widehat{g}_{k}(\cdot) (dashed lines); the parametric gk()g_{k}(\cdot) in (6.7) (dotted lines).
Table 5: The values of adjusted $R^{2}$ for the fitted models (6.8)–(6.10).

                                 AUD/USD    NZD/USD    CAD/USD
  Panel A: Model (6.8)
  $g_{h}(\cdot)\sim$ (6.2)       0.9215     0.9270     0.9551
  $g_{h}(\cdot)\sim$ (6.3)       0.9363     0.9357     0.9628
  Panel B: Model (6.9)
  $g_{s}(\cdot)\sim$ (6.6)       0.0920     0.1316     0.4754
  Panel C: Model (6.10)
  $g_{k}(\cdot)\sim$ (6.7)       0.1416     0.3008     0.2702
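The adjusted $R^{2}$ reported in Table 5 follows the usual definition $\bar{R}^{2}=1-(1-R^{2})(T-1)/(T-p-1)$ for a model with $p$ regressors (plus an intercept) fitted to $T$ observations. A minimal sketch, with illustrative variable names:

```python
import numpy as np

def adjusted_r2(y, yhat, p):
    """Adjusted R-squared for a fit with p regressors plus an intercept."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    T = len(y)
    rss = np.sum((y - yhat) ** 2)          # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
    r2 = 1.0 - rss / tss
    return 1.0 - (1.0 - r2) * (T - 1) / (T - p - 1)
```

Unlike the plain $R^{2}$, this penalizes extra regressors, which keeps the comparison between the one-parameter skewness/kurtosis NICs and the volatility NICs on an even footing.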

7 Concluding Remarks

This paper estimates the three CMs (with respect to variance, skewness, and kurtosis) by the corresponding QCMs, which are easily computed from the OLS estimator of a linear regression model constructed from the ECQs. The QCM method builds on the Cornish-Fisher expansion, which essentially transforms the estimation of CMs into the estimation of conditional quantiles. This transformation brings two attractive advantages over the parametric GARCH-type methods. First, owing to its regression-based nature, the QCM method bypasses estimation of the conditional mean and allows for mis-specified conditional quantile models. Second, the QCM method yields stable estimation results, since it involves no complex nonlinear constraints on the admissible region of parameters in conditional quantile models. These two advantages come with two limitations. The first is that the conditional quantile estimation must be carried out nn different times; however, this poses neither a computational burden nor a theoretical obstacle. The second is that when the data are more heavy-tailed, the CF expansion inevitably becomes less accurate, leading to a larger, non-negligible expansion error. Although this limitation does not affect the consistency of the QCMs in general, it gives the QCMs (especially the quantiled conditional kurtosis) a larger dispersion for more heavy-tailed data.
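As a schematic illustration of the QCM step summarized above, consider the standard fourth-order Cornish-Fisher form, under which the conditional quantile at level $\tau$ is linear in the basis $(1,\,z_{\tau},\,z_{\tau}^{2}-1,\,z_{\tau}^{3}-3z_{\tau})$, with $z_{\tau}$ the standard normal quantile; OLS on the $n$ ECQs then delivers the three QCMs, and the intercept absorbs the conditional mean. The coefficient-to-moment mapping below is an assumption based on this textbook parameterization and may differ in detail from the paper's exact regression design:

```python
import numpy as np
from statistics import NormalDist

def qcm_ols(taus, ecqs):
    """Recover (variance, skewness, kurtosis) by OLS on a Cornish-Fisher
    regression of n estimated conditional quantiles (ECQs)."""
    z = np.array([NormalDist().inv_cdf(t) for t in taus])
    X = np.column_stack([np.ones_like(z), z, z**2 - 1.0, z**3 - 3.0 * z])
    b0, b1, b2, b3 = np.linalg.lstsq(X, np.asarray(ecqs, float), rcond=None)[0]
    # b0 absorbs the conditional mean, so no prior mean estimate is needed
    return b1**2, 6.0 * b2 / b1, 24.0 * b3 / b1 + 3.0
```

For Gaussian ECQs $Q(\tau)=\mu+\sigma z_{\tau}$, the fit is exact and returns variance $\sigma^{2}$, skewness 0, and kurtosis 3, regardless of $\mu$.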

Notably, the QCM method is a supervised learning procedure that assumes no distribution on the returns. This supervised learning feature gives the QCM method a substantial computational advantage over the GARCH-type methods for estimating conditional variance, skewness, and kurtosis when the dimension of returns is large. See Zhu et al. (2023) for an in-depth discussion in this context and an innovative method for big portfolio selection based on the conditional higher moments learned by the QCM method.

Finally, we should mention that the existing parametric methods typically work only for stationary data, and their extension to more complex data environments appears challenging in terms of both methodology and computation. In contrast, the QCM method is applicable in complex data environments as long as the ECQs are suitably provided. For example, the QCMs can adapt to mixed categorical and continuous data or to locally stationary data when the ECQs are computed by the method in Li and Racine (2008) or Zhou and Wu (2009), respectively. In addition, useful information from exogenous variables and from conditional quantiles of other variables can easily be embedded into the QCMs through the ECQs, as done in Härdle et al. (2016) and Tobias and Brunnermeier (2016). Since the QCMs are computed at each fixed timepoint, the QCM method also allows us to focus on the CMs during a specific time period by employing the methods in Cai (2002) and Xu (2013) to compute the ECQs. On the whole, the QCM method exhibits a much wider application scope than the parametric ones, which so far have not offered a clear and feasible way to study the CMs under the above complex data environments.

References

  • Andrews (1988) Andrews, D. W. K. (1988). Laws of large numbers for dependent nonidentically distributed random variables. Econometric Theory 4, 458–467.
  • Bali et al. (2008) Bali, T. G., Mo, H. and Tang, Y. (2008). The role of autoregressive conditional skewness and kurtosis in the estimation of conditional VaR. Journal of Banking & Finance 32, 269–282.
  • Bollerslev (1986) Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.
  • Brooks et al. (2005) Brooks, C., Burke, S. P., Heravi, S. and Persand, G. (2005). Autoregressive conditional kurtosis. Journal of Financial Econometrics 3, 399–421.
  • Cai (2002) Cai, Z. (2002). Regression quantiles for time series. Econometric Theory 18, 169–192.
  • Chunhachinda et al. (1997) Chunhachinda, P., Dandapani, K., Hamid, S. and Prakash, A. J. (1997). Portfolio selection and skewness: Evidence from international stock markets. Journal of Banking & Finance 21, 143–167.
  • Cornish and Fisher (1938) Cornish, E. A. and Fisher, R. A. (1938). Moments and cumulants in the specification of distributions. Revue de l’Institut international de Statistique 5, 307–320.
  • Engle (1982) Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50, 987–1007.
  • Engle and Manganelli (2004) Engle, R. F. and Manganelli, S. (2004). CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business & Economic Statistics 22, 367–381.
  • Engle and Ng (1993) Engle, R. F. and Ng, V. K. (1993). Measuring and testing the impact of news on volatility. Journal of Finance 48, 1749–1778.
  • Escanciano (2006) Escanciano, J. C. (2006). Goodness-of-fit tests for linear and nonlinear time series models. Journal of the American Statistical Association 101, 140–149.
  • Escanciano (2009) Escanciano, J. C. (2009). Quasi-maximum likelihood estimation of semi-strong GARCH models. Econometric Theory 25, 561–570.
  • Escanciano and Velasco (2006) Escanciano, J. C. and Velasco, C. (2006). Generalized spectral tests for the martingale difference hypothesis. Journal of Econometrics 134, 151–185.
  • Fan and Yao (2003) Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York.
  • Francq and Sucarrat (2022) Francq, C. and Sucarrat, G. (2022). Volatility estimation when the zero-process is nonstationary. Journal of Business & Economic Statistics 41, 53–66.
  • Francq and Thieu (2019) Francq, C. and Thieu, L. Q. (2019). QML inference for volatility models with covariates. Econometric Theory 35, 37–72.
  • Francq and Zakoïan (2019) Francq, C. and Zakoïan, J. M. (2019). GARCH Models: Structure, Statistical Inference and Financial Applications. John Wiley & Sons.
  • Glosten et al. (1993) Glosten, L. R., Jagannathan, R. and Runkle, D. E. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779–1801.
  • Grigoletto and Lisi (2009) Grigoletto, M. and Lisi, F. (2009). Looking for skewness in financial time series. Econometrics Journal 12, 310–323.
  • Gu et al. (2020) Gu, S., Kelly, B. and Xiu, D. (2020). Empirical asset pricing via machine learning. Review of Financial Studies 33, 2223–2273.
  • Haas et al. (2004) Haas, M., Mittnik, S. and Paolella, M. S. (2004). Mixed normal conditional heteroskedasticity. Journal of Financial Econometrics 2, 211–250.
  • Hansen (1994) Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic Review 35, 705–730.
  • Härdle et al. (2016) Härdle, W. K., Wang, W. and Yu, L. (2016). TENET: Tail-Event driven NETwork risk. Journal of Econometrics 192, 499–513.
  • Harvey and Siddique (1999) Harvey, C. R. and Siddique, A. (1999). Autoregressive conditional skewness. Journal of Financial and Quantitative Analysis 34, 465–487.
  • Harvey and Siddique (2000) Harvey, C. R. and Siddique, A. (2000). Conditional skewness in asset pricing tests. Journal of Finance 55, 1263–1295.
  • Jondeau and Rockinger (2003) Jondeau, E. and Rockinger, M. (2003). Conditional volatility, skewness, and kurtosis: existence, persistence, and comovements. Journal of Economic Dynamics and Control 27, 1699–1737.
  • Jondeau et al. (2019) Jondeau, E., Zhang, Q. and Zhu, X. (2019). Average skewness matters. Journal of Financial Economics 134, 29–47.
  • Koenker and Bassett (1978) Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica 46, 33–50.
  • Koenker et al. (2017) Koenker, R., Chernozhukov, V., He, X. and Peng, L. (2017). Handbook of Quantile Regression. Chapman & Hall/CRC.
  • Kuester et al. (2006) Kuester, K., Mittnik, S. and Paolella, M. S. (2006). Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics 4, 53–89.
  • Lee and Lin (1992) Lee, Y. S. and Lin, T. K. (1992). Algorithm AS 269: High order Cornish-Fisher expansion. Journal of the Royal Statistical Society: Series C 41, 233–240.
  • León and Ñíguez (2020) León, Á. and Ñíguez, T. M. (2020). Modeling asset returns under time-varying semi-nonparametric distributions. Journal of Banking & Finance 118, 105870.
  • León et al. (2005) León, Á., Rubio, G. and Serna, G. (2005). Autoregresive conditional volatility, skewness and kurtosis. Quarterly Review of Economics and Finance 45, 599–618.
  • Li and Racine (2008) Li, Q. and Racine, J. S. (2008). Nonparametric estimation of conditional CDF and quantile functions with mixed categorical and continuous data. Journal of Business & Economic Statistics 26, 423–434.
  • Ljung and Box (1978) Ljung, G. M. and Box, G. E. (1978). On a measure of lack of fit in time series models. Biometrika 65, 297–303.
  • McNeil and Frey (2000) McNeil, A. J. and Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach. Journal of Empirical Finance 7, 271–300.
  • Robinson (1988) Robinson, P. M. (1988). Root-N-consistent semiparametric regression. Econometrica 56, 931–954.
  • Rubinstein (1973) Rubinstein, M. E. (1973). A comparative statics analysis of risk premiums. Journal of Business 46, 605–615.
  • Samuelson (1970) Samuelson, P. A. (1970). The fundamental approximation theorem of portfolio analysis in terms of means, variances and higher moments. Review of Economic Studies 37, 537–542.
  • Sucarrat and Grønneberg (2022) Sucarrat, G. and Grønneberg, S. (2022). Risk estimation with a time-varying probability of zero returns. Journal of Financial Econometrics 20, 278–309.
  • Tobias and Brunnermeier (2016) Tobias, A. and Brunnermeier, M. K. (2016). CoVaR. American Economic Review 106, 1705–1741.
  • Tong (1978) Tong, H. (1978). On a threshold model. In Pattern Recognition and Signal Processing (ed. C. H. Chen). Sijthoff and Noordhoff, Amsterdam.
  • Tsay (2005) Tsay, R. S. (2005). Analysis of Financial Time Series. John Wiley & Sons.
  • White (2001) White, H. (2001). Asymptotic Theory for Econometricians, Revised edition. San Diego: Academic Press.
  • Widder (1946) Widder, D. V. (1946). The Laplace Transform. Princeton University Press, Princeton, NJ.
  • Xiao and Koenker (2009) Xiao, Z. and Koenker, R. (2009). Conditional quantile estimation for generalized autoregressive conditional heteroscedasticity models. Journal of the American Statistical Association 104, 1696–1712.
  • Xu (2013) Xu, K. L. (2013). Nonparametric inference for conditional quantiles of time series. Econometric Theory 29, 673–698.
  • Zheng et al. (2018) Zheng, Y., Zhu, Q., Li, G. and Xiao, Z. (2018). Hybrid quantile regression estimation for time series models with conditional heteroscedasticity. Journal of the Royal Statistical Society: Series B 80, 975–993.
  • Zhou and Wu (2009) Zhou, Z. and Wu, W. B. (2009). Local linear quantile estimation for nonstationary time series. Annals of Statistics 37, 2696–2729.
  • Zhu et al. (2023) Zhu, Z., Zhang, N. and Zhu, K. (2023). Big portfolio selection by graph-based conditional moments method. Working paper. Available at “https://arxiv.org/abs/2301.11697”.