Bias correction and uniform inference
for the quantile density function
Abstract
For the kernel estimator of the quantile density function (the derivative of the quantile function), I show how to perform boundary bias correction, establish the rate of strong uniform consistency of the bias-corrected estimator, and construct confidence bands that are asymptotically exact uniformly over the entire domain [0, 1]. The proposed procedures rely on the pivotality of the studentized bias-corrected estimator and on known anti-concentration properties of the Gaussian approximation to its supremum.
1 Introduction
The derivative of the quantile function, the quantile density (QD), has long been recognized as an important object in statistical inference. This function is sometimes also called the sparsity function (Tukey, 1965). In particular, it arises as a factor in the asymptotically linear expansion for the quantile function (Bahadur, 1966; Kiefer, 1967), and hence may be used for asymptotically valid inference on quantiles (Csörgő and Révész, 1981a, b; Koenker, 2005).
Given its importance, several estimators of the QD have been proposed in the literature. The most widely used estimator is the kernel quantile density (KQD), originally developed by Siddiqui (1960) and Bloch and Gastwirth (1968) for the case of rectangular kernel, and generalized to arbitrary kernels by Falk (1986), Welsh (1988), Csörgő et al. (1991), and Jones (1992). This estimator is simply a smoothed derivative of the empirical quantile function, where smoothing is performed via convolution with a kernel function.
As in the classical case of kernel density estimation, the KQD suffers from bias close to the boundary points of its domain [0, 1], rendering the estimator inconsistent. To the best of my knowledge, no bias correction procedures have been developed for the QD.
In this paper, I show how to correct for the boundary bias, recovering strong uniform consistency for the resulting bias-corrected KQD (BC-KQD) estimator. The bias correction is computationally cheap and is based on the fact that the bias of the KQD is approximately equal to the integral of the localized kernel function, a quantity that depends only on the chosen kernel and bandwidth. I also develop an algorithm for constructing uniform confidence bands for the QD on its entire domain [0, 1]. This procedure relies on the fact that the studentized BC-KQD admits a pivotal influence function, which makes it possible to calculate the critical values by simulating either the known influence function or the studentized BC-KQD under an alternative (pseudo) distribution of the data.
The rest of the paper is organized as follows. Section 2 outlines the framework and defines the KQD estimator. Section 3 introduces the BC-KQD estimator and establishes its Bahadur-Kiefer expansion. Section 4 develops the uniform confidence bands based on the BC-KQD. Section 5 illustrates the performance of the confidence bands in a set of Monte Carlo simulations. Section 6 concludes. Proofs of theoretical results are given in the Appendix.
2 Setup and kernel quantile density estimator
The data consist of n independent and identically distributed draws X_1, \dots, X_n from a distribution on the real line whose cumulative distribution function (CDF) F satisfies the following assumption.
Assumption 1 (Data generating process).
The distribution has compact support and admits a density f that is continuously differentiable and bounded away from zero and infinity on its support.
Assumption 1 implies that the quantile density
q(u) := Q'(u) = \frac{1}{f(Q(u))}, \qquad u \in [0, 1],   (1)
where Q := F^{-1} is the quantile function, is continuously differentiable and bounded away from zero and infinity on [0, 1].
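For concreteness, consider two familiar examples. For the Uniform(0, 1) distribution, Q(u) = u and q(u) = 1 for all u. For the Exponential(\lambda) distribution, Q(u) = -\log(1 - u)/\lambda and
q(u) = \frac{1}{\lambda (1 - u)},
which diverges as u \to 1; this illustrates why Assumption 1 requires compact support and a density bounded away from zero, since otherwise q need not be bounded on [0, 1].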
Let X_{(1)} \le \dots \le X_{(n)} be the order statistics of the sample X_1, \dots, X_n, and let \hat{Q}_n denote the empirical quantile function,
\hat{Q}_n(u) := X_{(\lceil nu \rceil)}, \qquad u \in (0, 1].   (2)
The KQD estimator is defined as
\hat{q}_n(u) := \int_0^1 K_h(u - v) \, d\hat{Q}_n(v), \qquad u \in [0, 1],   (3)
where K is a kernel function, K_h(\cdot) := K(\cdot/h)/h, and h = h_n is the bandwidth (see, e.g., Csörgő et al., 1991). We impose the following assumptions on the kernel and the bandwidth.
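To fix ideas, the following minimal numerical sketch implements one common form of (3), namely a kernel-weighted sum of spacings of the order statistics. The truncated-normal kernel and the spacing-based formula are illustrative choices consistent with the description of the KQD as a smoothed derivative of the empirical quantile function, not necessarily the exact implementation used in the paper.

```python
import numpy as np
from scipy.stats import norm

_C = norm.cdf(1.0) - norm.cdf(-1.0)  # normalizing constant of the truncated-normal kernel

def kernel(v):
    """Standard normal density truncated to [-1, 1]; one kernel satisfying Assumption 2."""
    return np.where(np.abs(v) <= 1.0, norm.pdf(v) / _C, 0.0)

def kqd(u, x, h):
    """Kernel quantile density estimate at points u in [0, 1].

    Uses the representation of (3) as a kernel-weighted sum of spacings:
    sum_i K_h(u - i/n) * (X_(i+1) - X_(i)), with K_h(t) = K(t/h)/h.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    spacings = np.diff(x)                      # X_(i+1) - X_(i), i = 1, ..., n-1
    jumps = np.arange(1, n) / n                # jump points i/n of the empirical quantile function
    u = np.atleast_1d(np.asarray(u, dtype=float))
    weights = kernel((u[:, None] - jumps[None, :]) / h) / h
    return weights @ spacings

# Example: for Uniform(0, 1) data the true quantile density is identically 1.
rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
print(kqd([0.05, 0.5, 0.95], x, h=0.1))  # close to 1 at u = 0.5; visibly biased downward at 0.05 and 0.95
```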
Assumption 2 (Kernel function).
The kernel K is a nonnegative function of bounded variation that is supported on [-1, 1], symmetric around zero, and satisfies
\int_{-1}^{1} K(v) \, dv = 1.   (4)
Assumption 3 (Bandwidth, estimation).
The bandwidth h = h_n is such that h_n \to 0 and
1. h_n does not vanish too quickly (a lower bound on the bandwidth rate);
2. h_n vanishes quickly enough (an undersmoothing upper bound on the bandwidth rate).
Assumption 4 (Bandwidth, inference).
The bandwidth h = h_n is such that h_n \to 0 and
1. h_n does not vanish too quickly (a lower bound on the bandwidth rate);
2. h_n vanishes quickly enough (an undersmoothing upper bound on the bandwidth rate).
Assumption 2 is standard; boundedness of the total variation of K ensures that the class
\mathcal{K} := \{ K((u - \cdot)/h) : u \in [0, 1], \, h > 0 \}   (5)
is a bounded VC class of measurable functions; see, e.g., Nolan and Pollard (1987).
Assumptions 3 and 4 are essentially the same, up to the log terms in the bandwidth rates, with Assumption 3 being slightly weaker. Assumption 4.1 states that the bandwidth is large enough to guarantee that the smoothed remainder of the classical Bahadur-Kiefer expansion vanishes asymptotically; see the proof of Corollary 1 below. Assumption 4.2 imposes an undersmoothing bandwidth rate, which ensures that the smoothing bias disappears fast enough for the confidence bands to be valid; see the proof of Theorem 2 below.
3 Bias correction and Bahadur-Kiefer expansion
In this section, I introduce the bias-corrected estimator and develop its asymptotically linear expansion with an explicit a.s. uniform rate of the remainder (the Bahadur-Kiefer expansion).
To see the need for bias correction, note that, for u close to the boundary of [0, 1], the kernel weights K_h(u - i/n), i = 1, \dots, n, do not approximately sum to one, rendering the KQD inconsistent. Therefore, dividing the KQD by the sum of the kernel weights (or the corresponding integral of the kernel function) may eliminate the boundary bias. To this end, define
\kappa_h(u) := \int_0^1 K_h(u - v) \, dv, \qquad u \in [0, 1].   (6)
For computational purposes, note that \kappa_h is symmetric around 1/2 (i.e. \kappa_h(u) = \kappa_h(1 - u) for all u \in [0, 1]), and \kappa_h(u) = 1 for u \in [h, 1 - h]. The bias-corrected KQD (BC-KQD) is then defined as
\tilde{q}_n(u) := \hat{q}_n(u) / \kappa_h(u), \qquad u \in [0, 1].   (7)
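A matching sketch of the correction in (6)-(7): for the truncated-normal kernel of the previous sketch, \kappa_h can be computed in closed form from the kernel CDF, and the BC-KQD is obtained by dividing the raw KQD by it. As before, the kernel and the spacing-based KQD formula are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

_C = norm.cdf(1.0) - norm.cdf(-1.0)

def kernel(v):
    """Standard normal density truncated to [-1, 1]."""
    return np.where(np.abs(v) <= 1.0, norm.pdf(v) / _C, 0.0)

def kernel_cdf(t):
    """CDF of the truncated-normal kernel (0 below -1, 1 above 1)."""
    return (norm.cdf(np.clip(t, -1.0, 1.0)) - norm.cdf(-1.0)) / _C

def bias_factor(u, h):
    """kappa_h(u) = integral over [0, 1] of K_h(u - v) dv, via a change of variables.

    Equals 1 for u in [h, 1 - h]; near the boundary it drops below 1 (about 1/2 at u = 0 or 1),
    which is the multiplicative boundary bias removed from the raw KQD.
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return kernel_cdf(u / h) - kernel_cdf((u - 1.0) / h)

def bc_kqd(u, x, h):
    """Bias-corrected KQD (7): raw KQD divided by kappa_h(u)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    u = np.atleast_1d(np.asarray(u, dtype=float))
    raw = (kernel((u[:, None] - (np.arange(1, n) / n)[None, :]) / h) / h) @ np.diff(x)
    return raw / bias_factor(u, h)

# Example: the correction matters only within h of the endpoints.
rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
print(bias_factor([0.0, 0.05, 0.5], h=0.1))   # roughly [0.5, 0.78, 1.0]
print(bc_kqd([0.05, 0.5, 0.95], x, h=0.1))    # all close to 1 after correction
```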
The following theorem establishes that the studentized BC-KQD is approximately equal to the centered kernel density estimator, with an approximation error that converges to zero a.s. at an explicit uniform rate. Since this result resembles (and relies on) the classical asymptotically linear expansion for the quantile function (Bahadur, 1966; Kiefer, 1967), we call it the Bahadur-Kiefer expansion for the BC-KQD.
Theorem 1 (Bahadur-Kiefer expansion for the BC-KQD).
This representation allows us to establish the exact rate of strong uniform consistency of the BC-KQD under a bandwidth that achieves undersmoothing (Assumption 3.2).
One of the convenient features of the KQD (and BC-KQD) estimator is that its bandwidth has a natural scale that is independent of the data generating process. Hence, I put aside the choice of the constant in the bandwidth and suggest setting it to one.
Regarding the choice of the rate of the bandwidth, and ignoring the log terms, it is easy to establish the rate-optimal bandwidth, which is achieved whenever the rate of the smoothing bias matches that of the remainder in the original Bahadur-Kiefer expansion. It follows that the nearly-optimal bandwidth is
(13) |
Under this bandwidth, the exact rate of strong uniform convergence is
(14) |
which is just slightly worse than the familiar “cube-root” rate (Kim and Pollard, 1990).
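For orientation, here is a back-of-the-envelope version of this calculation. It assumes only that the leading stochastic term of the expansion has the usual kernel-estimation order \sqrt{\log(1/h_n)/(n h_n)} of Giné and Guillou (2002), and it ignores the exact logarithmic factors in (13)-(14): if the bandwidth shrinks at the rate h_n \asymp n^{-1/3}, then
\sqrt{\frac{\log(1/h_n)}{n h_n}} \asymp \sqrt{\frac{\log n}{n^{2/3}}} = n^{-1/3} \sqrt{\log n},
that is, the cube-root rate inflated by a logarithmic factor, consistent with the statement above.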
4 Uniform confidence bands
Suppose we had access to valid approximations to the (1 - \alpha)-quantiles of the random variables
(15) | ||||
(16) |
respectively, in the sense that
(17) | ||||
(18) |
Then the following confidence bands for q would be asymptotically valid at the confidence level 1 - \alpha:
1. the one-sided CB (19);
2. the one-sided CB (20);
3. the two-sided CB (21).
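To spell out why the quantile approximations above deliver coverage, note the standard duality between the supremum statistics and the bands. In illustrative notation, assuming the two-sided statistic in (16) is the supremum of an absolute studentized deviation T_n(u) and that the band (21) collects the values consistent with that supremum not exceeding the critical value \hat{c}^{(2)}_{1-\alpha},
\Pr\big( q(u) \in \mathrm{CB}(u) \text{ for all } u \in [0, 1] \big) = \Pr\Big( \sup_{u \in [0, 1]} |T_n(u)| \le \hat{c}^{(2)}_{1-\alpha} \Big) \to 1 - \alpha,
where the convergence is exactly the requirement (18) on the critical value.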
I propose two ways of obtaining such approximate critical values, both making use of the pivotality of the studentized bias-corrected KQD; see Theorem 1. I focus on the one-sided critical value for simplicity; the proofs for the two-sided critical value are analogous.
The first approach is to let the critical value be the (1 - \alpha)-quantile of the random variable
(22) |
Since this is a known process, the critical value can be obtained easily by simulation. In principle, it can be tabulated for different choices of the kernel K and values of the sample size n and the bandwidth h.
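A minimal simulation sketch of this first approach. It assumes, consistently with the discussion of Theorem 1, that the pivotal process is the centered and boundary-normalized kernel density estimator of an i.i.d. Uniform(0, 1) sample; the exact normalization of the supremum in (22) is not reproduced here, so the resulting critical value is on the raw scale of that process and must be paired with bands built on the same scale.

```python
import numpy as np
from scipy.stats import norm

_C = norm.cdf(1.0) - norm.cdf(-1.0)

def kernel(v):
    return np.where(np.abs(v) <= 1.0, norm.pdf(v) / _C, 0.0)

def kernel_cdf(t):
    return (norm.cdf(np.clip(t, -1.0, 1.0)) - norm.cdf(-1.0)) / _C

def bias_factor(u, h):
    return kernel_cdf(u / h) - kernel_cdf((u - 1.0) / h)

def critical_value_pivotal(n, h, grid, alpha, n_sim=1000, seed=None):
    """Simulated (1 - alpha)-quantile of the supremum of the pivotal influence process.

    Each draw builds the kernel density estimator of n i.i.d. Uniform(0, 1) variables,
    centers it at its exact mean kappa_h(u) = E K_h(u - U), normalizes by kappa_h(u),
    and records the supremum of the absolute value over the grid.
    """
    rng = np.random.default_rng(seed)
    grid = np.asarray(grid, dtype=float)
    kappa = bias_factor(grid, h)
    sups = np.empty(n_sim)
    for b in range(n_sim):
        u_sample = rng.uniform(size=n)
        kde = (kernel((grid[:, None] - u_sample[None, :]) / h) / h).mean(axis=1)
        sups[b] = np.max(np.abs(kde - kappa) / kappa)
    return np.quantile(sups, 1.0 - alpha)

# Example: a critical value for n = 1000, h = 0.1 on a grid of 101 points.
crit = critical_value_pivotal(n=1000, h=0.1, grid=np.linspace(0.0, 1.0, 101), alpha=0.05, seed=0)
print(crit)
```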
The other approach is to let the critical value be the (1 - \alpha)-quantile of the random variable
(23) |
where the statistic is evaluated at a pseudo-sample drawn from the uniform distribution in place of the original sample. For the uniform distribution, q(u) = 1 for all u \in [0, 1], and hence
(24) |
where the remaining quantity is the (non-bias-corrected) KQD calculated using the pseudo-sample, i.e.
(25) |
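A corresponding sketch of the pseudo-sample approach: because the quantile density of the Uniform(0, 1) distribution is identically one, the BC-KQD computed on a simulated uniform pseudo-sample has a known distribution, and its supremum deviation from one can be simulated directly. Again, the normalization of the supremum in (23)-(25) is not reproduced, so the scale of the critical value below is illustrative.

```python
import numpy as np
from scipy.stats import norm

_C = norm.cdf(1.0) - norm.cdf(-1.0)

def kernel(v):
    return np.where(np.abs(v) <= 1.0, norm.pdf(v) / _C, 0.0)

def kernel_cdf(t):
    return (norm.cdf(np.clip(t, -1.0, 1.0)) - norm.cdf(-1.0)) / _C

def bc_kqd(u, x, h):
    """Bias-corrected KQD of the sample x at points u, as in Section 3."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    u = np.atleast_1d(np.asarray(u, dtype=float))
    raw = (kernel((u[:, None] - (np.arange(1, n) / n)[None, :]) / h) / h) @ np.diff(x)
    kappa = kernel_cdf(u / h) - kernel_cdf((u - 1.0) / h)
    return raw / kappa

def critical_value_pseudo(n, h, grid, alpha, n_sim=1000, seed=None):
    """Simulated (1 - alpha)-quantile of sup_u |BC-KQD*(u) - 1| over uniform pseudo-samples.

    The pseudo true quantile density is 1, so the deviation of the pseudo-sample
    BC-KQD from 1 is feasible to simulate and does not depend on the observed data.
    """
    rng = np.random.default_rng(seed)
    grid = np.asarray(grid, dtype=float)
    sups = np.empty(n_sim)
    for b in range(n_sim):
        pseudo = rng.uniform(size=n)
        sups[b] = np.max(np.abs(bc_kqd(grid, pseudo, h) - 1.0))
    return np.quantile(sups, 1.0 - alpha)
```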
The following theorem establishes that the two aforementioned approximations to the critical values are valid, implying the asymptotic validity of the confidence bands. These confidence bands are centered at an AMSE-suboptimal estimator and are expected to shrink at a rate slightly slower than the minimax-optimal rate, as noted by Chernozhukov et al. (2014a, p. 1795). This is compensated for by the fact that the coverage of the bands is asymptotically exact.
5 Monte Carlo study
In this section I study the finite-sample behavior of the proposed confidence bands in a set of Monte Carlo simulations.
I consider the following distributions of the data, all supported on the interval [0, 1]: (i) the uniform[0,1] distribution; (ii) the normal distribution truncated to [0, 1]; and (iii) the linear distribution, whose PDF is linear on [0, 1]. I set the nominal confidence level to be and the sample size . The critical values are obtained by simulating the pivotal process and calculating the quantiles of its supremum on a grid of points in [0, 1], with the number of simulations set to (simulation results for the critical values based on the pseudo-sample statistic are very similar, so I do not report them here). I use the kernel corresponding to the standard normal distribution truncated to [-1, 1] and the nearly-optimal bandwidth of Section 3, where I set the constant to one since the natural scale of the bandwidth is one.
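A compact sketch of one cell of this coverage experiment for the uniform design, reusing bc_kqd and critical_value_pseudo from the sketches in Section 4. The grid size, number of replications, bandwidth constant, and band construction (estimate plus or minus the raw-scale critical value) are illustrative choices rather than the exact settings behind Table 1.

```python
import numpy as np

# bc_kqd and critical_value_pseudo are as defined in the Section 4 sketches.

def coverage_uniform(n=1000, alpha=0.05, n_rep=200, seed=0):
    """Fraction of replications in which the two-sided band covers q(u) = 1 at every grid point."""
    rng = np.random.default_rng(seed)
    h = n ** (-1.0 / 3.0)                               # illustrative bandwidth rate; see Section 3
    grid = np.linspace(0.0, 1.0, 101)
    crit = critical_value_pseudo(n, h, grid, alpha)     # pivotal: does not depend on the observed data
    hits = 0
    for _ in range(n_rep):
        x = rng.uniform(size=n)                         # uniform design: the true quantile density is 1
        if np.max(np.abs(bc_kqd(grid, x, h) - 1.0)) <= crit:
            hits += 1
    return hits / n_rep

print(coverage_uniform())                               # should be close to 1 - alpha = 0.95
```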
In Figure 1, included for illustration, I plot 100 independent realizations of the confidence bands for the linear distribution, along with the true quantile density (in blue). Table 1 contains simulated coverage values for the two-sided confidence bands. The coverage is almost invariant to the distribution of the data, but the size distortion tends to be smaller for higher nominal confidence levels.
Confidence level | ||||
Uniform distribution | ||||
0.891 | 0.936 | 0.962 | 0.986 | |
0.881 | 0.943 | 0.966 | 0.990 | |
0.898 | 0.947 | 0.970 | 0.993 | |
0.907 | 0.949 | 0.976 | 0.996 | |
Linear distribution | ||||
0.891 | 0.929 | 0.956 | 0.987 | |
0.878 | 0.936 | 0.961 | 0.989 | |
0.890 | 0.944 | 0.970 | 0.991 | |
0.914 | 0.949 | 0.976 | 0.996 | |
Truncated normal distribution | ||||
0.898 | 0.942 | 0.964 | 0.988 | |
0.887 | 0.944 | 0.967 | 0.992 | |
0.905 | 0.950 | 0.972 | 0.993 | |
0.911 | 0.952 | 0.978 | 0.997 |
6 Conclusion
To the best of my knowledge, no boundary bias correction or uniform inference procedures have been developed for the quantile density (sparsity) function. In this paper, I develop such procedures, establish their validity and show in a set of Monte Carlo simulations that they perform reasonably well in finite samples. I hope that, even when the quantile density itself is not the main inference target, these results may be employed for improving the quality of inference for other statistical objects, including the quantile function.
References
- Andreyanov and Franguridi (2022) Andreyanov, P. and G. Franguridi (2022): “Nonparametric inference on counterfactuals in first-price auctions,” Available at https://arxiv.org/pdf/2106.13856.pdf.
- Bahadur (1966) Bahadur, R. R. (1966): “A note on quantiles in large samples,” The Annals of Mathematical Statistics, 37, 577–580.
- Bloch and Gastwirth (1968) Bloch, D. A. and J. L. Gastwirth (1968): “On a simple estimate of the reciprocal of the density function,” The Annals of Mathematical Statistics, 39, 1083–1085.
- Chernozhukov et al. (2014a) Chernozhukov, V., D. Chetverikov, and K. Kato (2014a): “Anti-concentration and honest, adaptive confidence bands,” The Annals of Statistics, 42, 1787–1818.
- Chernozhukov et al. (2014b) ——— (2014b): “Gaussian approximation of suprema of empirical processes,” The Annals of Statistics, 42, 1564–1597.
- Csörgő et al. (1991) Csörgő, M., L. Horváth, and P. Deheuvels (1991): “Estimating the quantile-density function,” in Nonparametric Functional Estimation and Related Topics, Springer, 213–223.
- Csörgő and Révész (1981a) Csörgő, M. and P. Révész (1981a): Strong approximations in probability and statistics, Academic Press.
- Csörgő and Révész (1981b) ——— (1981b): Two approaches to constructing simultaneous confidence bounds for quantiles, 176, Carleton University. Department of Mathematics and Statistics.
- Falk (1986) Falk, M. (1986): “On the estimation of the quantile density function,” Statistics & Probability Letters, 4, 69–73.
- Giné and Guillou (2002) Giné, E. and A. Guillou (2002): “Rates of strong uniform consistency for multivariate kernel density estimators,” in Annales de l’Institut Henri Poincare (B) Probability and Statistics, Elsevier, vol. 38, 907–921.
- Jones (1992) Jones, M. C. (1992): “Estimating densities, quantiles, quantile densities and density quantiles,” Annals of the Institute of Statistical Mathematics, 44, 721–727.
- Kiefer (1967) Kiefer, J. (1967): “On Bahadur’s representation of sample quantiles,” The Annals of Mathematical Statistics, 38, 1323–1342.
- Kim and Pollard (1990) Kim, J. and D. Pollard (1990): “Cube root asymptotics,” The Annals of Statistics, 191–219.
- Koenker (2005) Koenker, R. (2005): Quantile Regression, Econometric Society Monographs, Cambridge University Press.
- Nolan and Pollard (1987) Nolan, D. and D. Pollard (1987): “U-processes: rates of convergence,” The Annals of Statistics, 780–799.
- Siddiqui (1960) Siddiqui, M. M. (1960): “Distribution of quantiles in samples from a bivariate population,” Journal of Research of the National Bureau of Standards, 64, 145–150.
- Stroock (1998) Stroock, D. W. (1998): A concise introduction to the theory of integration, Springer Science & Business Media.
- Tukey (1965) Tukey, J. W. (1965): “Which part of the sample contains the information?” Proceedings of the National Academy of Sciences, 53, 127–134.
- Welsh (1988) Welsh, A. (1988): “Asymptotically efficient estimation of the sparsity function at a point,” Statistics & Probability Letters, 6, 427–432.
Appendix
Appendix A Proof of Theorem 1 and Corollary 1
First, note that
(28) | ||||
(29) |
where uniformly in since is continuously differentiable on .
Therefore,
(30) |
where
(31) | ||||
(32) |
The result now follows from the asymptotically linear expansion of the process ,
(33) |
This expansion is implied by the proof of Andreyanov and Franguridi (2022, Theorem 1). I reproduce this proof here for completeness.
A.1 Proof of the representation (33)
First, we need the following two lemmas concerning expressions that appear further in the proof.
Proof.
Denote and note that is a function of bounded variation a.s. Using integration by parts for the Riemann-Stieltjes integral (see e.g. Stroock, 1998, Theorem 1.2.7), we have
(35) |
To complete the proof, note that , , and . ∎
Proof.
Using integration by parts for the Riemann-Stieltjes integral (see e.g. Stroock, 1998, Theorem 1.2.7), we have
(37) | ||||
(38) |
where we used the fact that a.s. and a.s. We further write
(39) | ||||
(40) | ||||
(41) | ||||
(42) |
where in the second equality we used the change of variables . ∎
We now proceed with the proof of representation (33).
Recall the classical Bahadur-Kiefer expansion (Bahadur, 1966; Kiefer, 1967),
(43) | ||||
(44) |
and . Combine this expansion with Lemma 1 to obtain
(45) | ||||
(46) | ||||
(47) |
First term in (47).
Since is bounded away from zero, for some constant , and hence . The first term in (47) can then be rewritten as
(48) |
where
(49) | ||||
(50) |
the last equality using Lemma 2. The process has the strong uniform convergence rate (see, e.g., Giné and Guillou, 2002), and hence
(51) |
Applying Lemma 2 to the first term in (48) allows us to rewrite
(52) |
Second term in (47).
This term can be upper bounded as follows,
(53) | |||
(54) |
where we used the properties of total variation in the first inequality and in the second equality.
Note that we disregarded the term , since it has the uniform order , which is smaller than . Dividing by , which is bounded away from zero for due to Assumption 1, finishes the proof. ∎
A.2 Proof of Corollary 1
Let us check that the conditions of Giné and Guillou (2002, Proposition 3.1) hold. Indeed, Assumption 2 implies their condition , while Assumption 3 implies their conditions (2.11) and . By Giné and Guillou (2002, Remark 3.5), their condition can be replaced by the conditions satisfied by the uniform distribution. To complete the proof, divide the expansion in Theorem 1 by and note that the first term converges to by Giné and Guillou (2002, Proposition 3.1), while the remainder converges to zero a.s. due to Assumption 3. ∎
Appendix B Proof of Theorem 2
A key ingredient of the proof is to note that Lemmas 2.3 and 2.4 of Chernozhukov et al. (2014b) continue to hold even if their random variable does not have the form for the standard empirical process , but instead is a generic random variable admitting a strong sup-Gaussian approximation with a sufficiently small remainder.
For completeness, we provide the aforementioned trivial extensions of the two lemmas here, taken directly from Andreyanov and Franguridi (2022).
Let be a random variable with distribution taking values in a measurable space . Let be a class of real-valued functions on . We say that a function is an envelope of if is measurable and for all and .
We impose the following assumptions (A1)-(A3) of Chernozhukov et al. (2014b).
(A1) The class is pointwise measurable, i.e. it contains a countable subset such that for every there exists a sequence with for every .
(A2) For some , an envelope of satisfies .
(A3) The class is -pre-Gaussian, i.e. there exists a tight Gaussian random variable in with mean zero and covariance function
(56)
Lemma 3 (A trivial extension of Lemma 2.3 of Chernozhukov et al. (2014b)).
Suppose that Assumptions (A1)-(A3) are satisfied and that there exist constants , such that for all . Moreover, suppose there exist constants and a random variable such that . Then
(57) |
where is a constant depending only on and .
Proof.
For every , we have
(58) | ||||
(59) | ||||
(60) |
where Lemma A.1 of Chernozhukov et al. (2014b) (an anti-concentration inequality for ) is used to deduce the last inequality. A similar argument leads to the reverse inequality, which completes the proof. ∎
Lemma 4 (A trivial extension of Lemma 2.4 of Chernozhukov et al. (2014b)).
Suppose that there exists a sequence of -centered classes of measurable functions satisfying assumptions (A1)-(A3) with for each , where in the assumption (A3) the constants and do not depend on . Denote by the Brownian bridge on , i.e. a tight Gaussian random variable in with mean zero and covariance function
(61) |
Moreover, suppose that there exists a sequence of random variables and a sequence of constants such that and . Then
(62) |
Proof.
Take sufficiently slowly such that . Then since , by Lemma 3, we have
(63) |
This completes the proof. ∎
I now go back to the proof of Theorem 2. Chernozhukov et al. (2014b, Proposition 3.1) establish a sup-Gaussian approximation of ; namely, there exists a tight centered Gaussian random variable in with the covariance function
(64) |
where , such that, for , we have the approximation
(65) |
Lemma 4 and Chernozhukov et al. (2014b, Remark 3.2) then imply
(66) |
On the other hand, from Theorem 1 it follows that
(67) |
where we define . Substituting (65) into (67) yields
(68) |
Assumption 4 implies that and . Therefore,
(69) |
It now follows from Chernozhukov et al. (2014b, Remark 3.2) that
(70) |
Applying the triangle inequality to equations (66) and (70) yields
(71) |
On the other hand, considering the sample , we have
(72) |
A similar argument yields
(73) |
which completes the proof. ∎