
Matthew Norton, Naval Postgraduate School, Operations Research Department. Email: mnorton@nps.edu
Valentyn Khokhlov. Email: vkhokhlov.embals2016@london.edu
Stan Uryasev, University of Florida, Department of Industrial and Systems Engineering, Risk Management and Financial Engineering Laboratory. Email: uryasev@ufl.edu

Calculating CVaR and bPOE for Common Probability Distributions With Application to Portfolio Optimization and Density Estimation

Matthew Norton     Valentyn Khokhlov     Stan Uryasev
Abstract

Conditional Value-at-Risk (CVaR) and Value-at-Risk (VaR), also called the superquantile and quantile, are frequently used to characterize the tails of a probability distribution and are popular measures of risk in applications where the distribution represents the magnitude of a potential loss. Buffered Probability of Exceedance (bPOE) is a recently introduced characterization of the tail which is the inverse of CVaR, much like the CDF is the inverse of the quantile. These quantities can prove very useful as the basis for a variety of risk-averse parametric engineering approaches. Their use, however, is often made difficult by the lack of well-known closed-form equations for calculating these quantities for commonly used probability distributions. In this paper, we derive formulas for the superquantile and bPOE for a variety of common univariate probability distributions. Besides providing a useful collection within a single reference, we use these formulas to incorporate the superquantile and bPOE into parametric procedures. In particular, we consider two: portfolio optimization and density estimation. First, when portfolio returns are assumed to follow particular distribution families, we show that finding the optimal portfolio via minimization of bPOE has advantages over superquantile minimization. We show that, given a fixed threshold, a single portfolio is the minimal bPOE portfolio for an entire class of distributions simultaneously. Second, we apply our formulas to parametric density estimation and propose the method of superquantiles (MOS), a simple variation of the method of moments (MM) where moments are replaced by superquantiles at different confidence levels. With the freedom to select various combinations of confidence levels, MOS allows the user to focus the fitting procedure on different portions of the distribution, such as the tail when fitting heavy-tailed asymmetric data.

Keywords:
Conditional Value-at-Risk · Buffered Probability of Exceedance · Superquantile · Density Estimation · Portfolio Optimization

1 Introduction

When faced with randomness and uncertainty, some of the most popular techniques for dealing with such randomness are parametric in nature. Given a real-valued random variable $X$, analysis can be greatly simplified if one assumes that $X$ belongs to a specific parametric family of distributions. For example, the Method of Moments (MM) is one of the simplest and most widely used methods for parametric density estimation. These techniques, however, often require that certain characteristics of the distribution family be representable by a simple, ideally closed-form, expression. For example, traditional MM uses closed-form expressions for the moments of the parametric distribution family. Similarly, the Matching of Quantiles (MOQ) procedure (see e.g., Sgouropoulos et al. (2015); Karian and Dudewicz (1999)) uses expressions for the quantile function. In portfolio optimization, the availability of simple expressions for the mean and variance of portfolio returns yields a tractable Markowitz portfolio optimization problem (see Section 4 for specifics). For a variety of problems, application of a parametric method relies upon the availability of a closed-form expression for a specific characteristic of the parametric family of interest.

Luckily, for a variety of distributions, closed-form expressions are available for commonly utilized characteristics. These include characteristics such as the moments, the quantile, and the CDF. Over the past two decades, however, new fundamental characteristics like the superquantile have emerged from the field of quantitative risk management, with important applications across engineering fields like financial, civil, and environmental engineering (see e.g., Rockafellar and Royset (2010); Rockafellar and Uryasev (2000); Davis and Uryasev (2016)). Closed-form expressions for these characteristics, for a large variety of common parametric distribution families, have not been widely disseminated. While emerging from specific engineering applications, some of these characteristics are very general and can be viewed as fundamental aspects of a random variable, just like the mean or quantile. Thus, utilization of these characteristics within parametric methods is a natural consideration. To facilitate their use, however, we must develop closed-form expressions. (When closed-form expressions are not available, we provide simple calculation methods that can still be utilized within parametric methods.)

We focus on developing these expressions for the superquantile and Buffered Probability of Exceedance (bPOE) for a variety of distribution families. Developments in financial risk theory over the last two decades have heavily emphasized measurement of tail risk. After Artzner et al. (1999) introduced the concept of a coherent risk measure, Rockafellar and Uryasev (2000) introduced the superquantile, also called Conditional Value-at-Risk (CVaR) in the financial literature, which has come to be considered a preferable characterization of tail risk compared to the quantile, or Value-at-Risk (VaR). While some closed-form expressions are available for using the superquantile within parametric procedures, see e.g., Rockafellar and Uryasev (2000); Landsman and Valdez (2003); Andreev et al. (2005), the variety of distributions discussed within each of these sources is limited.

We illustrate that for a variety of common distributions, straightforward techniques such as integration of the quantile function yield a closed-form expression for the superquantile that is easy to use within subsequent parametric methods. We attempt to include a variety, providing superquantile formulas for the Exponential, Pareto/Generalized Pareto (GPD), Laplace, Normal, LogNormal, Logistic, LogLogistic, Generalized Student-t, Weibull, and Generalized Extreme Value (GEV) distributions. These provide examples varying from the exponentially tailed (Exponential, Pareto/GPD, Laplace), to the symmetric (Normal, Laplace, Logistic, Student-t), to the asymmetric heavier-tailed (Weibull, LogLogistic, GEV) distributions. While some of these formulas may exist elsewhere, we hope that this paper serves as a good resource for practitioners in search of superquantile formulas.

While the superquantile has risen in popularity over the past decade, a related characteristic called Buffered Probability of Exceedance (bPOE) has recently been introduced, first by Rockafellar and Royset (2010) in the context of Buffered Failure Probability and then generalized by Mafusalov and Uryasev (2018). This concept has grown in popularity within the risk management community, with applications in finance, logistics, analysis of natural disasters, statistics, stochastic programming, and machine learning (Shang et al. (2018); Uryasev (2014); Davis and Uryasev (2016); Mafusalov et al. (2018); Norton et al. (2017); Norton and Uryasev (2016)). Specifically, bPOE is the inverse of the superquantile in the same way that the CDF is the inverse of the quantile. However, much like the superquantile when compared against the quantile, bPOE has many mathematically advantageous properties over the traditionally used Probability of Exceedance (POE). Direct optimization of bPOE often reduces to convex or linear programming, it can be calculated via a one-dimensional convex optimization problem, and it provides a risk-averse probabilistic assessment of the risk of experiencing outcomes larger than some fixed upper threshold. Thus, the second aim of this paper is to provide closed-form expressions for bPOE and, when unable to do so, show that calculation of bPOE is still simple, reducing to a one-dimensional convex optimization problem or a one-dimensional root-finding problem. For the parametric portfolio application, in particular, we will see that when closed-form bPOE is unavailable but the superquantile is available, finding the optimal bPOE portfolio is no more difficult, computationally, than finding the optimal superquantile (CVaR) portfolio.

Motivating us to derive closed-form expressions (or simple calculation formulas) for the superquantile and bPOE for common distributions is the inclusion of these risk-averse tail measurements within parametric methods. In particular, we explore the use of the superquantile and bPOE within parametric portfolio optimization and density estimation. First, we consider parametric portfolio optimization, where returns are assumed to follow a specific distribution and, using these assumptions, a tractable portfolio optimization problem is formulated and solved. We begin by narrowing our choices of distribution to only those that both fit the pattern of portfolio returns and generate tractable portfolio optimization problems. Then, we consider two companion problems: solving for portfolios that minimize the superquantile (CVaR) of the distribution of potential losses (i.e., the average of the worst-case $100(1-\alpha)\%$ scenarios) and portfolios that minimize bPOE of the loss distribution (i.e., the buffered probability that losses will exceed a fixed upper threshold $x$). In comparing these problems, we discover that bPOE optimization can often be highly preferable to superquantile (CVaR) optimization in the parametric context. Specifically, for fixed $\alpha$, the portfolio that minimizes the superquantile depends upon the distributional assumption (i.e., even if $\alpha$ is fixed, changing the assumed parametric distribution for returns will change the contents of the optimal portfolio). However, for fixed threshold $x$, the portfolio that minimizes bPOE does not depend upon the distributional assumption (at least for the specific class of distributions we consider, which includes the Logistic, Laplace, Normal, Student-t, and GEV). In other words, no matter which of these distributions we choose, we will always achieve the same optimal portfolio for a fixed value of threshold $x$.
Thus, bPOE-based portfolio optimization can provide additional consistency with respect to parameter choices, eliminating one source of additional variability for the decision maker.

Finally, we consider parametric density estimation, proposing a variant of MM where moments are replaced by superquantiles. This can also be seen as a natural variation of the MOQ procedure where quantiles are replaced by superquantiles. Made possible by the closed-form superquantile expressions, this framework allows one to perform density estimation flexibly, letting the user focus the fitting procedure on specific portions of the distribution. For example, we illustrate by fitting a Weibull distribution with additional emphasis placed on estimating the right tail. Compared against traditional MM and maximum likelihood (ML), we obtain a better fit in such asymmetric, heavy-tailed situations.

1.1 Organization of Paper

We first provide a brief introduction to superquantiles and bPOE in Section 1.2. In Section 2, we give formulas for both the superquantile and bPOE for the Exponential, Pareto, Generalized Pareto, and Laplace distributions. Along the way, we highlight some simple relationships between POE, bPOE, the quantile, and the superquantile. In Section 3, we treat distributions for which a closed-form superquantile formula exists, but for which we are unable to derive a simple closed-form bPOE formula. In order of appearance, we consider the Normal, LogNormal, Logistic, Generalized Student-t, Weibull, LogLogistic, and Generalized Extreme Value distributions. We point out that, because a formula for the superquantile is known in these cases, bPOE can be obtained via a simple root-finding problem; we also illustrate for some cases that the one-dimensional convex optimization formula for bPOE can be used. In Section 4, we illustrate the use of these formulas in portfolio optimization and parametric distribution approximation.

1.2 Background and Notation

When working with optimization of tail probabilities, one frequently works with constraints or objectives involving the probability of exceedance (POE), $p_{x}(X)=P(X>x)$, or its associated quantile $q_{\alpha}(X)=\min\{x\,|\,P(X\leq x)\geq\alpha\}$, where $\alpha\in[0,1]$ is a probability level. The quantile is a popular measure of tail risk in financial engineering, but when included in optimization problems via constraints or objectives, it is quite difficult to treat with continuous (linear or non-linear) optimization techniques.

A significant advancement was made in Rockafellar and Uryasev (2000, 2002) with the development of a replacement called the superquantile or CVaR. The superquantile is a measure of uncertainty similar to the quantile, but with superior mathematical properties. Formally, the superquantile (CVaR) for a continuously distributed $X$ is defined as,

\bar{q}_{\alpha}(X)=E\left[X\,|\,X>q_{\alpha}(X)\right]=\frac{1}{1-\alpha}\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx=\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp.

Similar to $q_{\alpha}(X)$, the superquantile can be used to assess the tail of the distribution. The superquantile, though, is far easier to handle in optimization contexts. It also has the important property that it considers the magnitude of events within the tail. Therefore, in situations where a distribution may have a heavy tail, the superquantile accounts for the magnitudes of low-probability large-loss tail events, while the quantile does not account for this information.

The notion of buffered probability was originally introduced by Rockafellar and Royset (2010) in the context of the design and optimization of structures as the Buffered Probability of Failure (bPOF). Working to extend this concept, bPOE was developed as the inverse of the superquantile by Mafusalov and Uryasev (2018), in the same way that POE is the inverse of the quantile. Specifically, for continuously distributed $X$, bPOE at threshold $x$ is defined in the following way, where $\sup X$ denotes the essential supremum of the random variable $X$ and the threshold $x\in[E[X],\sup X]$.

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x\}\;.

In words, bPOE calculates one minus the probability level at which the superquantile, the tail expectation, equals the threshold $x$. Roughly speaking, bPOE calculates the proportion of worst-case outcomes which average to $x$. Figure 1 presents an illustration of bPOE for a Lognormal random variable $X$. We note that there exist two slightly different variants of bPOE, called Upper and Lower bPOE, which are identical for continuous random variables. In this paper, we utilize only continuous random variables. For the interested reader, details regarding the difference between Upper and Lower bPOE can be found in Mafusalov and Uryasev (2018).

Figure 1: The Probability Density Function (PDF) of $X\sim\mathrm{Lognormal}(\sigma=1,\mu=0)$. Given threshold $z\in\mathbb{R}$, POE equals $P(X>z)$, the cumulative density in red. For the same threshold $z$, bPOE equals $\bar{p}_{z}(X)$, the combined cumulative density in red and blue. By definition, the expectation of the worst-case $1-\alpha=\bar{p}_{z}(X)$ outcomes equals $z=\bar{q}_{\alpha}(X)$. These worst-case outcomes are those larger than the quantile $q_{\alpha}(X)$.

Similar to the superquantile, bPOE is a more robust measure of tail risk, as it considers not only the probability that events/losses will exceed the threshold $x$, but also the magnitude of these potential events. Also, much like the superquantile, bPOE can be represented as the unique minimum of a one-dimensional convex optimization problem, with the formulas given by Norton and Uryasev (2016); Mafusalov and Uryasev (2018) as follows, where $[\cdot]^{+}=\max\{\cdot,0\}$.

\bar{p}_{x}(X)=\min_{a\geq 0}E\left[a(X-x)+1\right]^{+}=\min_{\gamma<x}\frac{E[X-\gamma]^{+}}{x-\gamma}\,,
\bar{q}_{\alpha}(X)=\min_{\gamma}\;\gamma+\frac{E[X-\gamma]^{+}}{1-\alpha}\;.

Note that these formulas are valid for general real-valued random variables, not only continuously distributed random variables. It is also useful to note that the argmin of both the bPOE and superquantile optimization formulas gives the quantile. For the bPOE calculation formula, the argmin is $\gamma^{*}=q_{\alpha}(X)$ where $\alpha=1-\bar{p}_{x}(X)$, with $a^{*}=\frac{1}{x-\gamma^{*}}$ for the other representation. For the superquantile calculation formula, the argmin is $\gamma^{*}=q_{\alpha}(X)$ where $\alpha$ is the probability level at which the superquantile is calculated.
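These one-dimensional minimization formulas are straightforward to evaluate on a sample. The following sketch (the function names are ours, and a simple grid search stands in for an exact one-dimensional convex solver) estimates bPOE and the superquantile empirically:

```python
import numpy as np

def bpoe_empirical(sample, x, num_grid=20001):
    """bPOE via  min_{gamma < x}  E[X - gamma]^+ / (x - gamma)."""
    s = np.asarray(sample, dtype=float)
    gammas = np.linspace(s.min(), x - 1e-9, num_grid)
    # E[X - gamma]^+ for every candidate gamma
    excess = np.maximum(s[None, :] - gammas[:, None], 0.0).mean(axis=1)
    return float(min(1.0, (excess / (x - gammas)).min()))

def superquantile_empirical(sample, alpha, num_grid=20001):
    """Superquantile via  min_gamma  gamma + E[X - gamma]^+ / (1 - alpha)."""
    s = np.asarray(sample, dtype=float)
    gammas = np.linspace(s.min(), s.max(), num_grid)
    excess = np.maximum(s[None, :] - gammas[:, None], 0.0).mean(axis=1)
    return float((gammas + excess / (1.0 - alpha)).min())
```

For the sample $\{0,1,2,3\}$, the worst $50\%$ of outcomes average to $2.5$, so the empirical bPOE at threshold $2.5$ is $0.5$; the argmin of either formula recovers the corresponding quantile, as noted above.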

The bPOE concept is also closely related to the concept of a superdistribution function $\bar{F}(x)$, introduced by Rockafellar and Royset (2014). For the CDF, POE equals $P(X>x)=1-F(x)$ and the inverse CDF is given by $F^{-1}(\alpha)=q_{\alpha}(X)$. The superdistribution function $\bar{F}(x)$ is motivated by the inverse relation $\bar{F}^{-1}(\alpha)=\bar{q}_{\alpha}(X)$. Thus, bPOE equals $1-\bar{F}(x)$. The superdistribution function of a random variable $X$ can also be understood as the CDF of an auxiliary random variable $\bar{X}=\bar{q}_{u}(X)$, where $u\sim U(0,1)$ is a uniformly distributed random variable. In this case, $\bar{F}_{X}(x)=F_{\bar{X}}(x)$, where the subscript indicates the distribution function associated with a particular random variable.
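This characterization of $\bar{F}$ suggests a simple Monte Carlo sanity check (a sketch; it borrows the closed-form exponential expressions derived in Section 2.1): sample $u\sim U(0,1)$, form $\bar{X}=\bar{q}_{u}(X)$, and compare the empirical tail probability of $\bar{X}$ with closed-form bPOE.

```python
import numpy as np

# X ~ Exp(lam); Section 2.1 gives q-bar_a(X) = (-ln(1-a) + 1)/lam and
# p-bar_x(X) = e^{1 - lam*x}.  The superdistribution of X is the CDF of
# X-bar = q-bar_u(X) with u ~ U(0,1), so P(X-bar > x) should match bPOE.
rng = np.random.default_rng(0)
lam, x = 1.0, 3.0
u = rng.uniform(size=200_000)
xbar = (-np.log(1.0 - u) + 1.0) / lam   # samples of X-bar
bpoe_mc = float(np.mean(xbar > x))      # Monte Carlo estimate of 1 - F-bar(x)
bpoe_exact = float(np.exp(1.0 - lam * x))
```

With 200,000 draws, the two quantities agree to within ordinary Monte Carlo error.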

2 Distributions With Closed Form Superquantile and bPOE

In this section, we derive closed-form expressions for both the superquantile and bPOE for the Exponential, Pareto, Generalized Pareto, and Laplace distributions. These distributions exhibit a reproducing type of property, where the formula for POE is identical to bPOE up to a constant. The Laplace distribution presents an interesting case in which only the right tail exhibits this reproducing property. Along the way, for completeness, we also highlight relationships between the expressions for bPOE, POE, the superquantile, and the quantile.

2.1 Exponential

For this section, let $X\sim Exp(\lambda)$ be an exponential random variable. Recall that the Exponential parameter has range $\lambda>0$ with $E[X]=\sigma(X)=\frac{1}{\lambda}$, and that the Exponential CDF, PDF, and quantile are given by,

F(x)=\begin{cases}1-e^{-\lambda x}&x\geq 0,\\ 0&x<0,\end{cases}\quad f(x)=\begin{cases}\lambda e^{-\lambda x}&x\geq 0,\\ 0&x<0,\end{cases}\quad q_{\alpha}(X)=\frac{-\ln(1-\alpha)}{\lambda}\;.
Proposition 1

Let $X\sim Exp(\lambda)$. Then,

\bar{q}_{\alpha}(X)=\frac{-\ln(1-\alpha)+1}{\lambda},\quad\bar{p}_{x}(X)=e^{1-\lambda x}\;.
Proof

First, note that $q_{\alpha}(X)=\frac{-\ln(1-\alpha)}{\lambda}$ for an exponential random variable with rate parameter $\lambda$. We then have,

\bar{q}_{\alpha}(X)=\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{-1}{\lambda(1-\alpha)}\int_{\alpha}^{1}\ln(1-p)\,dp
=\frac{-1}{\lambda(1-\alpha)}\int_{1-\alpha}^{0}-\ln(y)\,dy=\frac{-1}{\lambda(1-\alpha)}\int_{0}^{1-\alpha}\ln(y)\,dy

Since $\int\ln(y)\,dy=y\ln(y)-y+C$, we have,

\bar{q}_{\alpha}(X)=\frac{-1}{\lambda(1-\alpha)}\int_{0}^{1-\alpha}\ln(y)\,dy
=\frac{-1}{\lambda(1-\alpha)}\left[(1-\alpha)\ln(1-\alpha)-(1-\alpha)\right]=\frac{-\ln(1-\alpha)+1}{\lambda}

We can then see that,

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x\}
=\{1-\alpha\,|\,\tfrac{-\ln(1-\alpha)+1}{\lambda}=x\}
=\{1-\alpha\,|\,\ln(1-\alpha)=1-\lambda x\}
=\{1-\alpha\,|\,e^{\ln(1-\alpha)}=e^{1-\lambda x}\}=\{1-\alpha\,|\,1-\alpha=e^{1-\lambda x}\}=e^{1-\lambda x}

Next, we relate bPOE and POE as well as the superquantile and quantile.

Corollary 1

Let $X\sim Exp(\lambda)$, with mean $\mu=\frac{1}{\lambda}$. Then, $\bar{p}_{x}(X)=P(X>x-\mu)$ and $\bar{q}_{\alpha}(X)=q_{\alpha}(X)+\mu$.

Proof

We know that $X$, being exponential, has CDF given by $P(X\leq x)=1-e^{-\lambda x}$. From Proposition 1, we know that

\bar{p}_{x}(X)=e^{(1-\lambda x)}=e^{-\lambda\left(\frac{-1}{\lambda}+x\right)}\;.

Then, since $\mu=\frac{1}{\lambda}$, it follows that $\bar{p}_{x}(X)=e^{-\lambda(x-\mu)}=1-P(X\leq x-\mu)=P(X>x-\mu)$. The equality for CVaR follows easily from Proposition 1, since $q_{\alpha}(X)=\frac{-\ln(1-\alpha)}{\lambda}$. ∎
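Proposition 1 and Corollary 1 translate directly into code. The sketch below (the function names are ours) can be used to check both the inverse relationship between the superquantile and bPOE and the shift-by-$\mu$ identity:

```python
import math

def exp_superquantile(alpha, lam):
    # Proposition 1:  q-bar_alpha(X) = (-ln(1-alpha) + 1) / lam
    return (-math.log(1.0 - alpha) + 1.0) / lam

def exp_bpoe(x, lam):
    # Proposition 1:  p-bar_x(X) = e^{1 - lam*x}
    return math.exp(1.0 - lam * x)
```

By construction, `exp_bpoe(exp_superquantile(alpha, lam), lam)` returns `1 - alpha`, and by Corollary 1 the bPOE curve is the POE curve $e^{-\lambda x}$ shifted right by $\mu=1/\lambda$.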

2.2 Pareto

Assume $X\sim Pareto(a,x_{m})$. Recall that the Pareto parameters have range $a>0,\,x_{m}>0$ with $E[X]=\begin{cases}\infty,&a\leq 1,\\ \frac{ax_{m}}{a-1},&a>1,\end{cases}$ and $\sigma^{2}(X)=\begin{cases}\infty,&a\in(0,2],\\ \frac{ax_{m}^{2}}{(a-1)^{2}(a-2)},&a>2,\end{cases}$ and that the Pareto CDF, PDF, and quantile are given by,

F(x)=\begin{cases}1-\left(\frac{x_{m}}{x}\right)^{a}&x\geq x_{m},\\ 0&x<x_{m},\end{cases}\quad f(x)=\begin{cases}\frac{ax_{m}^{a}}{x^{a+1}}&x\geq x_{m},\\ 0&x<x_{m},\end{cases}\quad q_{\alpha}(X)=\frac{x_{m}}{(1-\alpha)^{\frac{1}{a}}}\;.
Proposition 2

Assume $X\sim Pareto(a,x_{m})$ with $a>1$. Then, for $\alpha\in[0,1]$ and $x\geq E[X]$,

\bar{q}_{\alpha}(X)=\frac{x_{m}a}{(1-\alpha)^{\frac{1}{a}}(a-1)}\;,\quad\bar{p}_{x}(X)=\left(\frac{x_{m}a}{x(a-1)}\right)^{a}\;.

Note that if $a\in(0,1]$, then $E[X]=\infty$, implying that $\bar{q}_{\alpha}(X)=\infty$ and $\bar{p}_{x}(X)=1$ for all $\alpha\in[0,1]$ and $x\in\mathbb{R}$.

Proof

First, note that the conditional distribution of a Pareto random variable, conditioned on the event that it is larger than some $\gamma\geq x_{m}$, is simply another Pareto with parameters $a,\gamma$. This implies that $E[X|X>\gamma]=\frac{a\gamma}{a-1}$ if $a>1$; otherwise the expectation is $\infty$. Also, $1-F(\gamma)=\left(\frac{x_{m}}{\gamma}\right)^{a}$. Since,

E[X-\gamma]^{+}=\left(E[X|X>\gamma]-\gamma\right)\left(1-F(\gamma)\right)\;,

we will have that,

E[X-\gamma]^{+}=\left(\frac{a\gamma}{a-1}-\gamma\right)\left(\frac{x_{m}}{\gamma}\right)^{a}\;.

This gives us the bPOE formula,

\bar{p}_{x}(X)=\min_{x_{m}\leq\gamma<x}\frac{\left(\frac{a\gamma}{a-1}-\gamma\right)x_{m}^{a}}{\gamma^{a}(x-\gamma)}
=\min_{x_{m}\leq\gamma<x}\frac{\left(\frac{a}{a-1}-1\right)x_{m}^{a}}{\gamma^{a-1}(x-\gamma)}=\left(\max_{x_{m}\leq\gamma<x}\frac{\gamma^{a-1}(x-\gamma)(a-1)}{x_{m}^{a}}\right)^{-1}

Since $a>1$, the maximization objective is concave over the range $\gamma\in(0,\infty)$, which contains the range $(x_{m},x)$, so we need only take the derivative of the function $g(\gamma)=\frac{\gamma^{a-1}(x-\gamma)(a-1)}{x_{m}^{a}}$ and set it to zero to find the optimal $\gamma$ as follows:

\frac{\partial g}{\partial\gamma}=\frac{x(a-1)^{2}\gamma^{a-2}-(a-1)a\gamma^{a-1}}{x_{m}^{a}}=0\implies x(a-1)^{2}\gamma^{a-2}=(a-1)a\gamma^{a-1}
\implies\frac{x(a-1)}{a}=\gamma

Plugging this value of $\gamma$ into the objective of our bPOE formula yields,

\bar{p}_{x}(X)=\frac{\left(\frac{\frac{ax(a-1)}{a}}{a-1}-\frac{x(a-1)}{a}\right)x_{m}^{a}}{\left(\frac{x(a-1)}{a}\right)^{a}\left(x-\frac{x(a-1)}{a}\right)}
=\left(\frac{x_{m}a}{x(a-1)}\right)^{a}

The superquantile (CVaR) is then equal to the value of $x$ which solves the equation $1-\alpha=\bar{p}_{x}(X)$, or,

1-\alpha=\left(\frac{x_{m}a}{x(a-1)}\right)^{a}\;,

which has solution,

\bar{q}_{\alpha}(X)=\frac{x_{m}a}{(1-\alpha)^{\frac{1}{a}}(a-1)}\;.

Corollary 2

Relating bPOE and POE, as well as the quantile and superquantile, we can say that,

\bar{p}_{x}(X)=P\left(X>\frac{x(a-1)}{a}\right)=P(X>x)\left(\frac{a}{a-1}\right)^{a}\quad\text{and}\quad\bar{q}_{\alpha}(X)=q_{\alpha}(X)\,\frac{a}{a-1}\;.
Proof

This follows from Proposition 2 and the known formulas for POE and the quantile. ∎
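As with the exponential case, Proposition 2 and Corollary 2 can be checked numerically. The sketch below (the function names are ours; valid for $a>1$ and $x\geq E[X]$) implements the Pareto formulas:

```python
def pareto_superquantile(alpha, a, xm):
    # Proposition 2 (a > 1):  q-bar_alpha = xm*a / ((1-alpha)^{1/a} (a-1))
    return xm * a / ((1.0 - alpha) ** (1.0 / a) * (a - 1.0))

def pareto_bpoe(x, a, xm):
    # Proposition 2 (x >= E[X]):  p-bar_x = (xm*a / (x*(a-1)))^a
    return (xm * a / (x * (a - 1.0))) ** a

def pareto_poe(x, a, xm):
    # POE: 1 - F(x) = (xm/x)^a for x >= xm
    return (xm / x) ** a
```

Evaluating `pareto_bpoe` at `pareto_superquantile(alpha, a, xm)` recovers `1 - alpha`, and Corollary 2's proportionality `pareto_bpoe(x) == pareto_poe(x) * (a/(a-1))**a` holds for thresholds above the mean.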

2.3 Generalized Pareto Distribution (GPD)

Assume $X\sim GPD(\mu,s,\xi)$. Recall that the GPD parameters have range $\mu\in\mathbb{R}$, $s>0$, $\xi\in\mathbb{R}$ with $E[X]=\mu+\frac{s}{1-\xi}$ if $\xi<1$ and $\sigma^{2}(X)=\frac{s^{2}}{(1-\xi)^{2}(1-2\xi)}$ if $\xi<0.5$, and that the GPD CDF and PDF are given by,

F(x)=\begin{cases}1-\left(1+\frac{\xi(x-\mu)}{s}\right)^{-1/\xi}&\text{for }\xi\neq 0,\\ 1-\exp\left(-\frac{x-\mu}{s}\right)&\text{for }\xi=0,\end{cases}\qquad f(x)=\frac{1}{s}\left(1+\frac{\xi(x-\mu)}{s}\right)^{\left(-\frac{1}{\xi}-1\right)}\;,

for $x\geq\mu$ when $\xi\geq 0$ and $\mu\leq x\leq\mu-\frac{s}{\xi}$ when $\xi<0$. Furthermore, the quantiles are given by,

q_{\alpha}(X)=\begin{cases}\mu+\frac{s\left((1-\alpha)^{-\xi}-1\right)}{\xi}&\text{for }\xi\neq 0,\\ \mu-s\ln(1-\alpha)&\text{for }\xi=0.\end{cases}
Proposition 3

Assume $X\sim GPD(\mu,s,\xi)$ with $-1<\xi<1$. Then,

\bar{q}_{\alpha}(X)=\begin{cases}\mu+s\left[\frac{(1-\alpha)^{-\xi}}{1-\xi}+\frac{(1-\alpha)^{-\xi}-1}{\xi}\right]&\text{for }\xi\neq 0,\\ \mu+s\left[1-\ln(1-\alpha)\right]&\text{for }\xi=0,\end{cases}
\bar{p}_{x}(X)=\begin{cases}\frac{\left(1+\frac{\xi(x-\mu)}{s}\right)^{-\frac{1}{\xi}}}{(1-\xi)^{\frac{1}{\xi}}}&\text{for }\xi\neq 0,\\ e^{1-\left(\frac{x-\mu}{s}\right)}&\text{for }\xi=0.\end{cases}
Proof

For these results, we rely on the fact that if $X\sim GPD(\mu,s,\xi)$, then $X-\gamma\,|\,X>\gamma\sim GPD(0,\,s+\xi(\gamma-\mu),\,\xi)$, meaning that the excess distribution of a GPD random variable is also GPD. Note also that if $\xi<1$, then $E[X]=\mu+\frac{s}{1-\xi}$. This gives us,

E[X-\gamma\,|\,X>\gamma]=E\left[GPD(0,\,s+\xi(\gamma-\mu),\,\xi)\right]=\frac{s+\xi(\gamma-\mu)}{1-\xi}

which further implies that,

\bar{q}_{\alpha}(X)=E[X-q_{\alpha}(X)\,|\,X>q_{\alpha}(X)]+q_{\alpha}(X)
=\frac{s+\xi(q_{\alpha}(X)-\mu)}{1-\xi}+q_{\alpha}(X)\;.

Plugging in the values of the quantile function yields the final formulas. Using the formulas we just found for $\bar{q}_{\alpha}(X)$, it is an elementary exercise to solve for $\bar{p}_{x}(X)$, which equals $1-\alpha$ such that $\alpha$ solves the equation $x=\bar{q}_{\alpha}(X)$. ∎
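Proposition 3 can be sketched in code as follows (the function names are ours; valid for $-1<\xi<1$), with the $\xi=0$ case handled separately:

```python
import math

def gpd_superquantile(alpha, mu, s, xi):
    # Proposition 3; requires -1 < xi < 1
    if xi == 0.0:
        return mu + s * (1.0 - math.log(1.0 - alpha))
    t = (1.0 - alpha) ** (-xi)
    return mu + s * (t / (1.0 - xi) + (t - 1.0) / xi)

def gpd_bpoe(x, mu, s, xi):
    # Proposition 3; bPOE as the inverse of the superquantile
    if xi == 0.0:
        return math.exp(1.0 - (x - mu) / s)
    return (1.0 + xi * (x - mu) / s) ** (-1.0 / xi) / (1.0 - xi) ** (1.0 / xi)
```

The inverse relationship `gpd_bpoe(gpd_superquantile(alpha, ...), ...) == 1 - alpha` holds across positive, zero, and negative $\xi$, and at $\xi=0$ the formulas reduce to the shifted-exponential expressions.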

2.4 Laplace

Assume $X\sim Laplace(\mu,b)$. Recall that the Laplace parameters have range $\mu\in\mathbb{R}$, $b>0$ with $E[X]=\mu$ and $\sigma^{2}(X)=2b^{2}$, and that the Laplace CDF, PDF, and quantile function are given by,

F(x)=\begin{cases}1-\frac{1}{2}e^{-\frac{x-\mu}{b}}&x\geq\mu,\\ \frac{1}{2}e^{\frac{x-\mu}{b}}&x<\mu,\end{cases}\qquad f(x)=\frac{1}{2b}e^{\frac{-|x-\mu|}{b}}\;,
q_{\alpha}(X)=\mu-b\,\mathrm{sign}(\alpha-0.5)\,\ln(1-2|\alpha-0.5|)\;.
Proposition 4

If $X\sim Laplace(\mu,b)$, then

\bar{q}_{\alpha}(X)=\begin{cases}\mu+b\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))&\alpha<0.5,\\ \mu+b\left(1-\ln\left(2(1-\alpha)\right)\right)&\alpha\geq 0.5,\end{cases}
\bar{p}_{x}(X)=\begin{cases}\frac{1}{2}e^{1-\left(\frac{x-\mu}{b}\right)}&x\geq\mu+b,\\ 1+\frac{z}{\mathcal{W}(-2e^{-z-1}z)}&x<\mu+b,\end{cases}

where $z=\frac{x-\mu}{b}$ and $\mathcal{W}$ is the Lambert-$\mathcal{W}$ function (also called the product logarithm or omega function).

Proof

To get the superquantile, we begin with the integral representation:

\bar{q}_{\alpha}(X)=\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu-b\,\mathrm{sign}(p-0.5)\,\ln(1-2|p-0.5|)\,dp
=\mu-\frac{b}{1-\alpha}\int_{\alpha}^{1}\mathrm{sign}(p-0.5)\,\ln(1-2|p-0.5|)\,dp
=\mu-\frac{b}{1-\alpha}\left(\int_{\min\{\alpha,0.5\}}^{0.5}-\ln(2p)\,dp+\int_{\max\{\alpha,0.5\}}^{1}\ln(2(1-p))\,dp\right)\;.

To evaluate the integral, we utilize a simple substitution as well as the identity $\int\ln(y)\,dy=y\ln(y)-y+C$. After simplifying, we see that for $\alpha<0.5$ the integral evaluates to,

\bar{q}_{\alpha}(X)=\mu+b\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))\;.

Similarly, we find that for $\alpha\geq 0.5$ the integral evaluates to,

\bar{q}_{\alpha}(X)=\mu+b\left(1-\ln\left(2(1-\alpha)\right)\right)\;.

For bPOE, first assume that the threshold $x\geq\mu+b$. Using our formula for CVaR, we see that $\bar{q}_{0.5}(X)=\mu+b$. Thus, $x\geq\mu+b$ implies that $1-\bar{p}_{x}(X)\geq 0.5$, implying that,

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x,\;\alpha\geq 0.5\}
=\{1-\alpha\,|\,\mu+b\left(1-\ln\left(2(1-\alpha)\right)\right)=x\}
=\frac{1}{2}e^{1-\left(\frac{x-\mu}{b}\right)}\;.

Assume, on the contrary, that $x<\mu+b$. Since $\bar{q}_{0.5}(X)=\mu+b$, we have that $1-\bar{p}_{x}(X)<0.5$, which implies that,

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x,\;\alpha<0.5\}
=\{1-\alpha\,|\,\mu+b\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))=x\}\;.

Letting $z=\frac{x-\mu}{b}$, we must now find the $\alpha$ which solves the equation $\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))=z$. We do so via the following:

\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))=z\implies\frac{-z}{\alpha}=\frac{\ln(2\alpha)-1}{1-\alpha}
\implies e^{\frac{-z}{\alpha}}=e^{\frac{\ln(2\alpha)-1}{1-\alpha}}=\left(\frac{2\alpha}{e}\right)^{\frac{1}{1-\alpha}}
\implies e^{\frac{-z(1-\alpha)}{\alpha}}=\frac{2\alpha}{e}
\implies\frac{-z}{\alpha}e^{-z\left(\frac{1}{\alpha}-1\right)}=-2ze^{-1}
\implies\frac{-z}{\alpha}e^{\frac{-z}{\alpha}}=-2ze^{-z-1}
\implies\frac{-z}{\alpha}=\mathcal{W}(-2ze^{-z-1})\;,

where the final step follows from the definition of the Lambert-\mathcal{W} function, which is given by the relation xe^{x}=y\iff\mathcal{W}(y)=x (on the appropriate branch, since \mathcal{W} is multivalued on (-1/e,0)). Thus, \frac{-z}{\alpha}=\mathcal{W}(-2ze^{-z-1})\implies\bar{p}_{x}(X)=1-\alpha=1+\frac{z}{\mathcal{W}(-2ze^{-z-1})}. ∎
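The formula just derived is straightforward to evaluate numerically. The following sketch (our own code, not from the paper; scipy assumed) does so. Note that for \mu<x<\mu+b the relation \frac{-z}{\alpha}=\frac{\ln(2\alpha)-1}{1-\alpha} forces \frac{-z}{\alpha}<-1, so the k=-1 branch of the Lambert-\mathcal{W} function must be taken:

```python
import math
from scipy.special import lambertw

def bpoe_laplace(x, mu=0.0, b=1.0):
    """bPOE of a Laplace(mu, b) random variable at threshold x > mu."""
    z = (x - mu) / b
    if z >= 1.0:                                   # x >= mu + b
        return 0.5 * math.exp(1.0 - z)
    # mu < x < mu + b: bPOE = 1 + z / W(-2 z e^{-z-1}); since
    # -z/alpha < -1 here, take the k = -1 branch of Lambert-W
    w = lambertw(-2.0 * z * math.exp(-z - 1.0), k=-1).real
    return 1.0 + z / w
```

The branch choice matters: the principal branch W_0 takes values in [-1,\infty) and would return a spurious root.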

3 Distributions With Closed Form Superquantile

In this section, we derive closed-form expressions for the superquantile of the Normal, LogNormal, Logistic, Student-t, Weibull, LogLogistic, and GEV distributions. The Normal, Logistic, and Student-t provide us with examples of symmetric distributions with varying tail heaviness. The LogNormal, Weibull, LogLogistic, and GEV provide us with examples of asymmetric distributions that have heavy right tails. In particular, we will utilize the Weibull formula for density estimation in Section 5.

For these distributions, we are not able to reduce calculation of bPOE to closed form. However, we highlight for the case of the Normal and Logistic that bPOE can be calculated by solving a one-dimensional convex optimization problem or a one-dimensional root-finding problem. In general, we note that for continuous X, bPOE at x equals 1-\alpha, where \alpha solves \bar{q}_{\alpha}(X)=x. Thus, if the superquantile is known in closed form, this reduces to a simple one-dimensional root-finding problem in \alpha.

3.1 Normal

Let X\sim\mathcal{N}(0,1) be a standard normal random variable. Recall that

F(x)=\frac{1}{2}\left[1+\text{erf}\left(\frac{x}{\sqrt{2}}\right)\right],\quad f(x)=\frac{1}{\sqrt{2\pi}}e^{\frac{-x^{2}}{2}},\quad q_{\alpha}(X)=\sqrt{2}\,\text{erf}^{-1}(2\alpha-1)\;,

where \text{erf}(\cdot) is the commonly known error function, with \text{erf}^{-1}(\cdot) denoting its inverse.

We show that the superquantile can be calculated by utilizing the quantile function and PDF, which is a well-known result (see e.g., Rockafellar and Uryasev (2000)). We also show that bPOE can be calculated in two ways: by solving a simple root-finding problem involving only the PDF and CDF, or by solving a convex optimization problem with gradients calculated via the commonly used error function. Some results are presented only for the standard normal \mathcal{N}(0,1), but they can easily be applied to the non-standard case \mathcal{N}(\mu,\sigma) with appropriate shifting and scaling.

Proposition 5

If X\sim\mathcal{N}(\mu,\sigma), then

\bar{q}_{\alpha}(X)=\mu+\sigma\frac{f\left(q_{\alpha}\left(\frac{X-\mu}{\sigma}\right)\right)}{1-\alpha}\;.
Proof

It is well known that if X\sim\mathcal{N}(0,1), then the conditional tail expectation is given by the inverse Mills ratio, E[X|X>\gamma]=\frac{f(\gamma)}{1-F(\gamma)}. It follows then that \bar{q}_{\alpha}(X)=E[X|X>q_{\alpha}(X)]=\frac{f(q_{\alpha}(X))}{1-F(q_{\alpha}(X))}=\frac{f(q_{\alpha}(X))}{1-\alpha}. The general case follows by writing X=\mu+\sigma Z with Z\sim\mathcal{N}(0,1), since the superquantile is translation invariant and positively homogeneous. ∎

Proposition 6

If X𝒩(0,1)X\sim\mathcal{N}(0,1), then

\bar{p}_{x}(X)=\min_{\gamma<x}\frac{f(\gamma)-\gamma(1-F(\gamma))}{x-\gamma}\;.

Furthermore, if \gamma\in\operatorname{argmin}, then \gamma equals the quantile of X at probability level 1-\bar{p}_{x}(X).

Proof

Note that for a standard normal random variable, the tail expectation beyond any threshold \gamma is given by the inverse Mills ratio,

E[X|X>\gamma]=\frac{f(\gamma)}{1-F(\gamma)}\;.

Note also that for any threshold \gamma and any random variable we have,

E[X-\gamma]^{+}=(E[X|X>\gamma]-\gamma)(1-F(\gamma))\;.

Using the Mills ratio gives us,

E[X-\gamma]^{+}=\left(\frac{f(\gamma)}{1-F(\gamma)}-\gamma\right)(1-F(\gamma))=f(\gamma)-\gamma(1-F(\gamma))\;.

Plugging this result into the minimization formula for bPOE yields the final formula. ∎

Proposition 7

Let X\sim\mathcal{N}(0,1) with x\in\mathbb{R} given. If \gamma is the solution to the equation

\frac{f(\gamma)}{1-F(\gamma)}=x\;,

then \bar{p}_{x}(X)=\frac{f(\gamma)-\gamma(1-F(\gamma))}{x-\gamma}. Additionally, we will have that q_{\alpha}(X)=\gamma and \bar{q}_{\alpha}(X)=x at probability level \alpha=1-\bar{p}_{x}(X).

Proof

This follows from the fact that \bar{q}_{\alpha}(X)=E[X|X>q_{\alpha}(X)]=\frac{f(q_{\alpha}(X))}{1-F(q_{\alpha}(X))} and the optimization formula for bPOE given in the previous proposition for normally distributed variables. ∎
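Proposition 7 translates directly into a one-dimensional root-finding routine. A minimal sketch (our code, not from the paper; scipy assumed), valid for thresholds x>0:

```python
from scipy.stats import norm
from scipy.optimize import brentq

def bpoe_std_normal(x):
    """bPOE of N(0,1) at threshold x > 0 via the Mills-ratio equation."""
    # the Mills ratio f(g)/(1-F(g)) increases from 0 toward infinity and
    # always exceeds g, so for x > 0 a sign change brackets the root in (-40, x)
    g = brentq(lambda g: norm.pdf(g) / norm.sf(g) - x, -40.0, x)
    return (norm.pdf(g) - g * norm.sf(g)) / (x - g)
```

At the root \gamma the returned ratio simplifies to 1-F(\gamma), so the quantile at probability level 1-\bar{p}_{x}(X) is obtained for free, as the proposition states.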

The following proposition provides the gradient calculation for solving the bPOE minimization problem.

Proposition 8

For X\sim\mathcal{N}(0,1), we have that the bPOE minimization formula has the following integral representation,

\displaystyle\bar{p}_{x}(X) =\min_{\gamma<x}\frac{f(\gamma)-\gamma(1-F(\gamma))}{x-\gamma}
=\min_{\gamma<x}\frac{1}{\sqrt{2\pi}}\int_{0}^{\infty}\frac{ue^{\frac{-(\gamma+u)^{2}}{2}}}{x-\gamma}\,du
=\min_{\gamma<x}\frac{e^{\frac{-\gamma^{2}}{2}}-\gamma\sqrt{\frac{\pi}{2}}\,\text{erfc}(\frac{\gamma}{\sqrt{2}})}{\sqrt{2\pi}(x-\gamma)}

Furthermore, the function g(\gamma;x)=\frac{1}{\sqrt{2\pi}}\int_{0}^{\infty}\frac{ue^{\frac{-(\gamma+u)^{2}}{2}}}{x-\gamma}\,du is convex w.r.t. \gamma over the range \gamma\in(-\infty,x). Additionally, g has gradient given by,

\displaystyle\frac{\partial g}{\partial\gamma} =\frac{1}{\sqrt{2\pi}}\int_{0}^{\infty}\frac{\partial}{\partial\gamma}\left(\frac{ue^{\frac{-(\gamma+u)^{2}}{2}}}{x-\gamma}\right)du
=\frac{e^{\frac{-\gamma^{2}}{2}}-x\sqrt{\frac{\pi}{2}}\,\text{erfc}(\frac{\gamma}{\sqrt{2}})}{\sqrt{2\pi}(x-\gamma)^{2}}

where \text{erfc}(\cdot) denotes the complementary error function.

Proof

To derive the integral representation, simply plug in the formula for E[X-\gamma]^{+}, then utilize the definitions of the PDF and CDF. The gradient calculation is a standard calculus exercise. ∎

3.2 LogNormal

Assume X\sim LogNormal(\mu,s). Recall that the LogNormal parameters have range \mu\in\mathbb{R}, s>0, with E[X]=e^{\mu+\frac{s^{2}}{2}} and \sigma^{2}(X)=(e^{s^{2}}-1)e^{2\mu+s^{2}}, and that the LogNormal CDF, PDF, and quantile function are given by,

F(x)=\frac{1}{2}\left[1+\text{erf}\left(\frac{\ln x-\mu}{s\sqrt{2}}\right)\right],\quad f(x)=\frac{1}{xs\sqrt{2\pi}}e^{\frac{-(\ln x-\mu)^{2}}{2s^{2}}},
\quad q_{\alpha}(X)=e^{\mu+s\sqrt{2}\,\text{erf}^{-1}(2\alpha-1)}\;.
Proposition 9

If X\sim LogNormal(\mu,s), then

\bar{q}_{\alpha}(X)=\frac{1}{2}e^{\mu+\frac{s^{2}}{2}}\frac{\left[1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2\alpha-1)\right)\right]}{1-\alpha}.
Proof

We simply evaluate the integral of the quantile function as follows.

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}e^{\mu+s\sqrt{2}\,\text{erf}^{-1}(2p-1)}\,dp
=\frac{e^{\mu}}{1-\alpha}\int_{\alpha}^{1}e^{s\sqrt{2}\,\text{erf}^{-1}(2p-1)}\,dp
=\frac{e^{\mu}}{1-\alpha}\left[-\frac{1}{2}e^{\frac{s^{2}}{2}}\left(1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2p-1)\right)\right)\right]_{p=\alpha}^{1}
=\frac{e^{\mu}}{1-\alpha}\cdot\frac{1}{2}e^{\frac{s^{2}}{2}}\left(1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2\alpha-1)\right)\right)\qquad\text{(the upper limit vanishes since }\text{erf}(-\infty)=-1)
=\frac{1}{2}e^{\mu+\frac{s^{2}}{2}}\frac{\left[1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2\alpha-1)\right)\right]}{1-\alpha}.
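As a quick numerical sanity check (our own code, not from the paper; scipy assumed), the closed form of Proposition 9 can be compared against a direct numerical average of the quantile function:

```python
import math
from scipy.special import erf, erfinv
from scipy.integrate import quad

def cvar_lognormal(alpha, mu=0.0, s=1.0):
    """Closed-form LogNormal(mu, s) superquantile from Proposition 9."""
    return (0.5 * math.exp(mu + 0.5 * s * s)
            * (1.0 + erf(s / math.sqrt(2.0) - erfinv(2.0 * alpha - 1.0)))
            / (1.0 - alpha))

def cvar_lognormal_numeric(alpha, mu=0.0, s=1.0):
    """Average of the quantile function over [alpha, 1], truncated near 1
    to avoid the (integrable) endpoint blow-up."""
    q = lambda p: math.exp(mu + s * math.sqrt(2.0) * erfinv(2.0 * p - 1.0))
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```

The truncation at 1-10^{-12} introduces an error far below typical quadrature tolerances because the LogNormal quantile grows only subexponentially in the confidence level.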

3.3 Logistic

Assume X\sim Logistic(\mu,s). Recall that the Logistic parameters have range \mu\in\mathbb{R}, s>0, with E[X]=\mu and \sigma^{2}(X)=\frac{s^{2}\pi^{2}}{3}, and that the Logistic CDF, PDF, and quantile function are given by,

F(x)=\frac{1}{1+e^{-\frac{x-\mu}{s}}}\;,\quad f(x)=\frac{e^{-\frac{x-\mu}{s}}}{s\left(1+e^{-\frac{x-\mu}{s}}\right)^{2}}\;,\quad q_{\alpha}(X)=\mu+s\ln\left(\frac{\alpha}{1-\alpha}\right)\;.

Here, we derive a closed-form expression for the superquantile for the logistic distribution and derive a simple root finding problem for calculating bPOE. We also find that these quantities have a correspondence with the binary entropy function.

Proposition 10

If X\sim Logistic(\mu,s), then

\bar{q}_{\alpha}(X)=\mu+\frac{sH(\alpha)}{1-\alpha}

where H(\alpha) is the binary entropy function H(\alpha)=-\alpha\ln(\alpha)-(1-\alpha)\ln(1-\alpha). Furthermore, for any x\geq\mu, if \alpha solves the equation,

\frac{H(\alpha)}{1-\alpha}=\frac{x-\mu}{s}\;,

then \bar{p}_{x}(X)=1-\alpha. Additionally, \bar{p}_{x}(X)=1-\alpha if \alpha is the solution to the transformed system,

(1-\alpha)\alpha^{\frac{\alpha}{1-\alpha}}=e^{-\left(\frac{x-\mu}{s}\right)}\;.

Note that both functions \frac{H(\alpha)}{1-\alpha} and (1-\alpha)\alpha^{\frac{\alpha}{1-\alpha}} are one-dimensional, convex, and monotonic over the range \alpha\in[0,1], and thus unique solutions exist and can easily be found via root-finding methods.

Proof

To obtain the superquantile, we have

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu+s\ln\left(\frac{p}{1-p}\right)dp
=\mu+\frac{s}{1-\alpha}\int_{\alpha}^{1}\ln(p)-\ln(1-p)\,dp
=\mu+\frac{s}{1-\alpha}\left(\int_{\alpha}^{1}\ln(p)\,dp+\int_{\alpha}^{1}-\ln(1-p)\,dp\right)

Utilizing simple substitution as well as the identity \int\ln(y)\,dy=y\ln(y)-y+C, we get

\displaystyle\bar{q}_{\alpha}(X) =\mu+\frac{s}{1-\alpha}\left(-1-\alpha\ln\alpha+\alpha-(1-\alpha)\ln(1-\alpha)+(1-\alpha)\right)
=\mu+\frac{s}{1-\alpha}\left(-\alpha\ln\alpha-(1-\alpha)\ln(1-\alpha)\right)
=\mu+\frac{s}{1-\alpha}H(\alpha)\;.

To get bPOE, we simply follow the bPOE definition and find \alpha which solves \mu+\frac{s}{1-\alpha}H(\alpha)=x. The transformed system arises from combining logarithms within the superquantile formula and applying exponential transformations. ∎
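A minimal sketch of the root-finding characterization in Proposition 10 (our own code, not from the paper; scipy assumed), valid for thresholds x>\mu:

```python
import math
from scipy.optimize import brentq

def bpoe_logistic(x, mu=0.0, s=1.0):
    """bPOE of Logistic(mu, s) at threshold x > mu by solving
    H(a)/(1-a) = (x-mu)/s, which is monotonic on (0, 1)."""
    t = (x - mu) / s
    H = lambda a: -a * math.log(a) - (1.0 - a) * math.log(1.0 - a)
    a = brentq(lambda a: H(a) / (1.0 - a) - t, 1e-12, 1.0 - 1e-12)
    return 1.0 - a
```

Since H(a)/(1-a) increases from 0 to infinity on (0,1), the bracket (10^{-12},\,1-10^{-12}) contains the root for any t>0.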

We can also utilize the minimization formula to calculate bPOE. Calculating bPOE in this way has the added benefit of simultaneously calculating the quantile q_{1-\bar{p}_{x}(X)}(X).

Proposition 11

If X\sim Logistic(\mu,s), then

\bar{p}_{x}(X)=\min_{\gamma<x}\frac{s\ln(1+e^{-(\frac{\gamma-\mu}{s})})}{x-\gamma}\;,

which is a convex optimization problem over the range \gamma\in(-\infty,x). Furthermore, the minimum occurs at \gamma such that,

\frac{s\ln(1+e^{-(\frac{\gamma-\mu}{s})})}{x-\gamma}=1-F(\gamma)\;.
Proof

This follows from the fact that E[X-\gamma]^{+}=\int_{\gamma}^{\infty}(1-F(t))\,dt. Evaluating this integral for X\sim Logistic(\mu,s) yields E[X-\gamma]^{+}=s\ln(1+e^{-(\frac{\gamma-\mu}{s})}), which can then be plugged into the minimization formula for bPOE. The second part of the proposition follows from the fact that the derivative of the objective function w.r.t. \gamma is given by,

\frac{s\ln(1+e^{-(\frac{\gamma-\mu}{s})})}{(x-\gamma)^{2}}-\frac{e^{-(\frac{\gamma-\mu}{s})}}{(x-\gamma)\left(1+e^{-(\frac{\gamma-\mu}{s})}\right)}\;.

Setting this derivative to zero and simplifying yields the stated optimality condition. ∎

3.4 Student-t

Assume X\sim Student\text{-}t(\nu,s,\mu). Recall that the Student-t parameters have range \nu>0, s>0, \mu\in\mathbb{R}, with E[X]=\mu (for \nu>1) and \sigma^{2}(X)=\frac{s^{2}\nu}{\nu-2} (for \nu>2), and that the Student-t CDF and PDF are given by,

F(x)=1-\frac{1}{2}\mathcal{I}_{\nu(x)}\left(\frac{\nu}{2},\frac{1}{2}\right)\;,\quad f(x)=\frac{\Gamma(\frac{\nu+1}{2})}{\Gamma(\frac{\nu}{2})\sqrt{\nu\pi}\,s}\left(1+\frac{(x-\mu)^{2}}{\nu s^{2}}\right)^{\frac{-(\nu+1)}{2}}\;,

where \nu(x)=\frac{\nu}{\left(\frac{x-\mu}{s}\right)^{2}+\nu} (for x\geq\mu), \mathcal{I}_{t}(a,b) is the regularized incomplete Beta function, and \Gamma(a) is the Gamma function. Note that a general closed-form expression for q_{\alpha}(X) is not known, but it is a readily available function within common software packages like Excel.

Proposition 12

If X\sim Student\text{-}t(\nu,s,\mu), then

\bar{q}_{\alpha}(X)=\mu+s\left(\frac{\nu+T^{-1}(\alpha)^{2}}{(\nu-1)(1-\alpha)}\right)\tau(T^{-1}(\alpha))

where T^{-1}(\alpha) is the inverse of the standardized Student-t CDF and \tau(x) is the standardized Student-t PDF.

Proof

Since there is no closed-form expression for the quantile, we utilize the representation of the superquantile given by \frac{1}{1-\alpha}\int_{q_{\alpha}(X)}^{\infty}tf(t)\,dt. To evaluate this integral, we first take the derivative of the PDF, giving

\frac{df(x)}{dx}=\frac{-f(x)(x-\mu)(\nu+1)}{\nu s^{2}+(x-\mu)^{2}}.

Rearranging yields,

xf(x)\,dx=\frac{-\nu s^{2}\,df(x)}{\nu+1}-\frac{(x-\mu)^{2}\,df(x)}{\nu+1}+\mu f(x)\,dx.

We can then integrate both sides,

\int xf(x)\,dx=\frac{-\nu s^{2}f(x)}{\nu+1}-\frac{1}{\nu+1}\int(x-\mu)^{2}\,df(x)+\mu F(x).

Integrating by parts gives us the following form of the middle term,

\int(x-\mu)^{2}\,df(x)=(x-\mu)^{2}f(x)-2\int xf(x)\,dx+2\mu F(x)\;.

Then, finally, after substituting this new expression for the middle term and simplifying, we get

\int xf(x)\,dx=-\frac{(\nu s^{2}+(x-\mu)^{2})}{\nu-1}f(x)+\mu F(x).

Taking the definite integral yields,

\displaystyle\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx =\left(-\lim_{x\rightarrow\infty}\frac{(\nu s^{2}+(x-\mu)^{2})}{\nu-1}f(x)+\lim_{x\rightarrow\infty}\mu F(x)\right)
\qquad\qquad\quad-\left(-\frac{(\nu s^{2}+(q_{\alpha}(X)-\mu)^{2})}{\nu-1}f(q_{\alpha}(X))+\mu F(q_{\alpha}(X))\right).

It is easy to see that the second limit goes to \mu (since F(x)\rightarrow 1) and, after applying L'Hopital where necessary, that the first limit goes to zero. This leaves

\displaystyle\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx =\mu-\left(-\frac{(\nu s^{2}+(q_{\alpha}(X)-\mu)^{2})}{\nu-1}f(q_{\alpha}(X))+\mu F(q_{\alpha}(X))\right)
=\mu(1-\alpha)+\left(\frac{\nu s^{2}+(q_{\alpha}(X)-\mu)^{2}}{\nu-1}\right)f(q_{\alpha}(X))
=\mu(1-\alpha)+s\left(\frac{\nu+T^{-1}(\alpha)^{2}}{\nu-1}\right)\tau(T^{-1}(\alpha)),

where the final step comes from writing the non-standardized quantile q_{\alpha}(X) and PDF f(x) in their standardized forms. Then, finally, dividing by 1-\alpha yields the formula,

\bar{q}_{\alpha}(X)=\frac{1}{1-\alpha}\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx=\mu+s\left(\frac{\nu+T^{-1}(\alpha)^{2}}{(\nu-1)(1-\alpha)}\right)\tau(T^{-1}(\alpha)).
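A numerical check of Proposition 12 (our own code, not from the paper; scipy assumed, and \nu>1 so the superquantile is finite):

```python
from scipy.stats import t as student_t
from scipy.integrate import quad

def cvar_student_t(alpha, nu, mu=0.0, s=1.0):
    """Proposition 12: Student-t(nu, s, mu) superquantile, nu > 1."""
    q = student_t.ppf(alpha, nu)            # T^{-1}(alpha), standardized
    return mu + s * (nu + q * q) / ((nu - 1.0) * (1.0 - alpha)) * student_t.pdf(q, nu)

def cvar_student_t_numeric(alpha, nu, mu=0.0, s=1.0):
    """Average the quantile function over [alpha, 1], truncated near 1."""
    val, _ = quad(lambda p: mu + s * student_t.ppf(p, nu),
                  alpha, 1.0 - 1e-10, limit=200)
    return val / (1.0 - alpha)
```

The truncation at 1-10^{-10} is negligible for moderate \nu because the omitted tail carries probability mass 10^{-10}.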

3.5 Weibull

Assume X\sim Weibull(\lambda,k). Recall that the Weibull parameters have range \lambda>0, k>0, with E[X]=\lambda\Gamma(1+\frac{1}{k}) and \sigma^{2}(X)=\lambda^{2}\left[\Gamma(1+\frac{2}{k})-\Gamma(1+\frac{1}{k})^{2}\right], and that the Weibull CDF, PDF, and quantile function are given by,

F(x)=1-e^{-(x/\lambda)^{k}}\;,\quad f(x)=\begin{cases}\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^{k}}&x\geq 0,\\ 0&x<0,\end{cases}
\quad q_{\alpha}(X)=\lambda(-\ln(1-\alpha))^{1/k}\;,

where \Gamma(a)=\int_{0}^{\infty}p^{a-1}e^{-p}\,dp is the gamma function.

Proposition 13

If X\sim Weibull(\lambda,k), then

\bar{q}_{\alpha}(X)=\frac{\lambda}{1-\alpha}\Gamma_{U}\left(1+\frac{1}{k},-\ln(1-\alpha)\right)

where \Gamma_{U}(a,b)=\int_{b}^{\infty}p^{a-1}e^{-p}\,dp is the upper incomplete gamma function.

Proof

To calculate the superquantile, we utilize the integral representation, which is

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}\lambda(-\ln(1-p))^{1/k}\,dp\;.

To put this integral into the form of the upper incomplete gamma function, make the change of variable y=-\ln(1-p). This gives e^{y}=\frac{1}{1-p} and dp=(1-p)\,dy=e^{-y}\,dy, with new lower limit of integration \alpha\rightarrow-\ln(1-\alpha) and upper limit of integration 1\rightarrow\infty. Applying this to the integral yields

\displaystyle\bar{q}_{\alpha}(X) =\frac{\lambda}{1-\alpha}\int_{-\ln(1-\alpha)}^{\infty}y^{1/k}e^{-y}\,dy
=\frac{\lambda}{1-\alpha}\Gamma_{U}\left(1+\frac{1}{k},-\ln(1-\alpha)\right)\;.
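Proposition 13 maps directly onto scipy, whose gammaincc is the regularized upper incomplete gamma function, so \Gamma_{U}(a,b)=\Gamma(a)\cdot\text{gammaincc}(a,b). A sketch with a numerical cross-check (our own code, not from the paper):

```python
import math
from scipy.special import gamma, gammaincc
from scipy.integrate import quad

def cvar_weibull(alpha, lam, k):
    """Proposition 13: Weibull(lam, k) superquantile."""
    a = 1.0 + 1.0 / k
    b = -math.log(1.0 - alpha)
    # Gamma_U(a, b) = gamma(a) * gammaincc(a, b)  (gammaincc is regularized)
    return lam * gamma(a) * gammaincc(a, b) / (1.0 - alpha)

def cvar_weibull_numeric(alpha, lam, k):
    """Average of the quantile function over [alpha, 1], truncated near 1."""
    q = lambda p: lam * (-math.log(1.0 - p)) ** (1.0 / k)
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```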

3.6 Log-Logistic

Assume X\sim LogLogistic(a,b). Recall that the Log-Logistic parameters have range a>0, b>0, with E[X]=a\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right) when b>1 and
\sigma^{2}(X)=a^{2}\left(\frac{2\pi}{b}\csc\left(\frac{2\pi}{b}\right)-\left(\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)\right)^{2}\right) when b>2, and that the Log-Logistic CDF, PDF, and quantile function are given by,

F(x)=\frac{1}{1+\left(\frac{x}{a}\right)^{-b}}\;,\quad f(x)=\frac{(b/a)(x/a)^{b-1}}{\left(1+(x/a)^{b}\right)^{2}}\;,\quad q_{\alpha}(X)=a\left(\frac{\alpha}{1-\alpha}\right)^{\frac{1}{b}}\;,

where \csc(\cdot) is the cosecant function.

Proposition 14

If X\sim LogLogistic(a,b), then

\bar{q}_{\alpha}(X)=\frac{a}{1-\alpha}\left(\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)-B_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)\right)

where B_{y}(A_{1},A_{2})=\int_{0}^{y}p^{A_{1}-1}(1-p)^{A_{2}-1}\,dp is the incomplete beta function.

Proof

To calculate the superquantile, we utilize the integral representation as follows:

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\left(\int_{0}^{1}q_{p}(X)\,dp-\int_{0}^{\alpha}q_{p}(X)\,dp\right)
=\frac{1}{1-\alpha}\left(E[X]-\int_{0}^{\alpha}q_{p}(X)\,dp\right)
=\frac{1}{1-\alpha}\left(E[X]-a\int_{0}^{\alpha}\left(\frac{p}{1-p}\right)^{\frac{1}{b}}dp\right)\;.

Now, note first that for X\sim LogLogistic(a,b) we have E[X]=a\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right). Next, for the incomplete beta function, letting A_{1}=\frac{1}{b}+1 and A_{2}=1-\frac{1}{b}, we can see that

B_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)=\int_{0}^{\alpha}p^{\frac{1}{b}}(1-p)^{-\frac{1}{b}}\,dp\;.

Using these two facts, we have,

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\left(E[X]-a\int_{0}^{\alpha}\left(\frac{p}{1-p}\right)^{\frac{1}{b}}dp\right)
=\frac{1}{1-\alpha}\left(a\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)-aB_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)\right)
=\frac{a}{1-\alpha}\left(\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)-B_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)\right)\;.
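Proposition 14 can likewise be evaluated with scipy, whose betainc is the regularized incomplete beta function, so B_{y}(A_{1},A_{2})=\text{betainc}(A_{1},A_{2},y)\cdot B(A_{1},A_{2}). A sketch with a numerical cross-check (our own code, not from the paper; b>1 is required so the mean exists):

```python
import math
from scipy.special import betainc, beta as beta_fn
from scipy.integrate import quad

def cvar_loglogistic(alpha, a, b):
    """Proposition 14: LogLogistic(a, b) superquantile, b > 1."""
    mean = a * (math.pi / b) / math.sin(math.pi / b)
    A1, A2 = 1.0 + 1.0 / b, 1.0 - 1.0 / b
    # non-regularized incomplete beta from scipy's regularized betainc
    B_alpha = betainc(A1, A2, alpha) * beta_fn(A1, A2)
    return (mean - a * B_alpha) / (1.0 - alpha)

def cvar_loglogistic_numeric(alpha, a, b):
    """Average of the quantile function over [alpha, 1], truncated near 1."""
    q = lambda p: a * (p / (1.0 - p)) ** (1.0 / b)
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```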

3.7 Generalized Extreme Value Distribution

Assume X follows a Generalized Extreme Value (GEV) distribution, which we denote as X\sim GEV(\mu,s,\xi). Recall that the GEV parameters have range \mu\in\mathbb{R}, s>0, \xi\in\mathbb{R}, with E[X]=\begin{cases}\mu+s(g_{1}-1)/\xi&\text{if}\ \xi\neq 0,\ \xi<1,\\ \mu+s\,y&\text{if}\ \xi=0,\\ \infty&\text{if}\ \xi\geq 1,\end{cases} and
\sigma^{2}(X)=\begin{cases}s^{2}(g_{2}-g_{1}^{2})/\xi^{2}&\text{if}\ \xi\neq 0,\ \xi<\frac{1}{2},\\ s^{2}\frac{\pi^{2}}{6}&\text{if}\ \xi=0,\\ \infty&\text{if}\ \xi\geq\frac{1}{2},\end{cases} where g_{k}=\Gamma(1-k\xi) and y is the Euler-Mascheroni constant.

Additionally, recall that the GEV has CDF, PDF, and quantile function given by,

F(x)=\begin{cases}e^{-\left(1+\frac{\xi(x-\mu)}{s}\right)^{\frac{-1}{\xi}}}&\xi\neq 0,\\ e^{-e^{-\left(\frac{x-\mu}{s}\right)}}&\xi=0,\end{cases}\;

\quad f(x)=\begin{cases}\frac{1}{s}\left(1+\frac{\xi(x-\mu)}{s}\right)^{\frac{-1}{\xi}-1}e^{-\left(1+\frac{\xi(x-\mu)}{s}\right)^{\frac{-1}{\xi}}}&\xi\neq 0,\\ \frac{1}{s}e^{-\left(\frac{x-\mu}{s}\right)}e^{-e^{-\left(\frac{x-\mu}{s}\right)}}&\xi=0,\end{cases}\;
\quad q_{\alpha}(X)=\begin{cases}\mu+\frac{s}{\xi}\left(\left(\ln\frac{1}{\alpha}\right)^{-\xi}-1\right)&\xi\neq 0,\\ \mu-s\ln(-\ln(\alpha))&\xi=0.\end{cases}\;
Proposition 15

If X\sim GEV(\mu,s,\xi), then

\bar{q}_{\alpha}(X)=\begin{cases}\mu+\frac{s}{\xi(1-\alpha)}\left[\Gamma_{L}\left(1-\xi,\ln\frac{1}{\alpha}\right)-(1-\alpha)\right]&\xi\neq 0,\\ \mu+\frac{s}{1-\alpha}\left(y+\alpha\ln(-\ln(\alpha))-li(\alpha)\right)&\xi=0,\end{cases}

where \Gamma_{L}(a,b)=\int_{0}^{b}p^{a-1}e^{-p}\,dp is the lower incomplete gamma function, li(x)=\int_{0}^{x}\frac{1}{\ln p}\,dp is the logarithmic integral function, and y is the Euler-Mascheroni constant.

Proof

Assume we have \xi=0. Then, using \int_{0}^{1}\ln(-\ln(p))\,dp=-y and \int_{0}^{\alpha}\ln(-\ln(p))\,dp=\alpha\ln(-\ln(\alpha))-li(\alpha), we have

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu-s\ln(-\ln(p))\,dp
=\mu-\frac{s}{1-\alpha}\left(\int_{0}^{1}\ln(-\ln(p))\,dp-\int_{0}^{\alpha}\ln(-\ln(p))\,dp\right)
=\mu-\frac{s}{1-\alpha}\left(-y-\alpha\ln(-\ln(\alpha))+li(\alpha)\right)
=\mu+\frac{s}{1-\alpha}\left(y+\alpha\ln(-\ln(\alpha))-li(\alpha)\right)

Assume now that \xi\neq 0. Then, we have that,

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu+\frac{s}{\xi}\left(\left(\ln\frac{1}{p}\right)^{-\xi}-1\right)dp
=\mu+\frac{s}{\xi(1-\alpha)}\int_{\alpha}^{1}\left(\left(\ln\frac{1}{p}\right)^{-\xi}-1\right)dp
=\mu+\frac{s}{\xi(1-\alpha)}\left[\Gamma_{L}\left(1-\xi,\ln\frac{1}{\alpha}\right)-(1-\alpha)\right]
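For the \xi\neq 0 branch of Proposition 15, scipy's regularized lower incomplete gamma function gammainc gives \Gamma_{L}(a,b)=\Gamma(a)\cdot\text{gammainc}(a,b). A sketch with a numerical cross-check (our own code, not from the paper; \xi<1, \xi\neq 0 so the mean is finite):

```python
import math
from scipy.special import gamma, gammainc
from scipy.integrate import quad

def cvar_gev(alpha, mu, s, xi):
    """Proposition 15, xi != 0 branch: GEV(mu, s, xi) superquantile."""
    # Gamma_L(a, b) = gamma(a) * gammainc(a, b)  (gammainc is regularized)
    gl = gamma(1.0 - xi) * gammainc(1.0 - xi, math.log(1.0 / alpha))
    return mu + s * (gl - (1.0 - alpha)) / (xi * (1.0 - alpha))

def cvar_gev_numeric(alpha, mu, s, xi):
    """Average of the quantile function over [alpha, 1], truncated near 1."""
    q = lambda p: mu + (s / xi) * (math.log(1.0 / p) ** (-xi) - 1.0)
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```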

4 Portfolio Optimization

A common parametric approach to portfolio optimization is to assume that portfolio returns follow some specified distribution. In this context, particularly when taking a risk averse approach, closed-form representations of the superquantile and bPOE for the specified distribution allow one to formulate a tractable portfolio optimization problem. In this section, we show that our derived formulas for the superquantile and bPOE reveal important properties about portfolio optimization problems formulated with particular distributional assumptions placed upon portfolio returns.

Portfolio optimization with the superquantile is common, so we begin by simply pointing out which of the closed-form superquantile formulas yield tractable portfolio optimization problems. Portfolio optimization with bPOE, however, is not common, and we show that it can be advantageous compared to the superquantile approach. In particular, superquantile optimization requires that one set the probability level \alpha. One can then observe that for fixed \alpha, the optimal superquantile portfolio may change based upon the distribution used to model returns. We show that if portfolio returns are assumed to follow a Laplace, Logistic, Normal, or Student-t distribution, the minimal bPOE portfolios for fixed threshold x are the same regardless of the distribution chosen, meaning that there exists a single portfolio that is x-bPOE optimal for multiple choices of distribution.

Note that in this section we will be dealing with asset returns R, as is typical for financial problems, and the loss is the negative of the return: X=-R, with q_{\alpha}(X)=-q_{1-\alpha}(R).

The portfolio optimization problem consists of finding a vector of asset weights w\in\mathbb{R}^{n} for a set of n assets with unknown random returns R=[R_{1},R_{2},...,R_{n}] that solves the following optimization problem,

\underset{w\in\mathbb{R}^{n}}{\text{max }} L(w,R) \qquad (1)
s.t.\quad g_{i}(w,R)\leq 0,\ i=1,\ldots,I
\quad h_{j}(w,R)=0,\ j=1,\ldots,J
\quad w^{T}1=1
\quad l\leq w\leq u

where L(w,R) is some function to be maximized (or minimized, if we consider its negative), the functions g_{i}(w,R) and h_{j}(w,R) enforce inequality and equality constraints respectively, and the vectors l,u enforce lower and upper bounds on the individual asset weights. A simple example is the standard Markowitz optimization problem, where we maximize the expected utility, a weighted combination of the expected return and the variance via a positive trade-off parameter \lambda\geq 0:

\underset{w\in\mathbb{R}^{n}}{\text{max }} w^{T}\eta-\lambda w^{T}\Sigma w \qquad (2)
s.t.\quad w^{T}1=1
\quad l\leq w\leq u

An important aspect of the random portfolio return w^{T}R, which can be seen within the Markowitz problem and will be used later in this section, is the fact that the expectation E[w^{T}R] and variance \sigma^{2}(w^{T}R) are given by w^{T}\eta and w^{T}\Sigma w respectively, where \eta\in\mathbb{R}^{n} is the vector of expected returns for the n assets and \Sigma\in\mathbb{R}^{n\times n} is the covariance matrix for the n assets. This allows us to represent the expected value and variance of the portfolio return in terms of w, and consequently to formulate an optimization problem with decision vector w.

4.1 Superquantile and bPOE Optimization with Qualified Distributions

As we are dealing with asset returns, and not losses, we need to define the superquantile using that notation. The superquantile is the expected loss above the quantile (conditional expected value of losses in the right tail), so in terms of returns it would be the conditional expected value of returns in the left tail, which can be described by the left superquantile:

\tilde{q}_{1-\alpha}(R)=\frac{1}{1-\alpha}\int_{0}^{1-\alpha}q_{p}(R)\,dp.

We can use the closed-form superquantile formulas derived in the previous sections for the right superquantile \bar{q}_{\alpha}(R) to calculate the left superquantile \tilde{q}_{\alpha}(R), as

\alpha\tilde{q}_{\alpha}(R)+(1-\alpha)\bar{q}_{\alpha}(R)=\int_{0}^{1}q_{p}(R)\,dp=E[R],

so

-\tilde{q}_{1-\alpha}(R)=-\frac{1}{1-\alpha}(E[R]-\alpha\bar{q}_{1-\alpha}(R)).

Since -\tilde{q}_{1-\alpha}(R)=\bar{q}_{\alpha}(X), bPOE is defined as

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x\}=\{1-\alpha\,|\,\tilde{q}_{1-\alpha}(R)=-x\}.

4.1.1 Qualified Distributions for Portfolio Optimization

The superquantile or bPOE portfolio optimization problem has its objective function or one of its constraints defined in terms of \tilde{q}_{1-\alpha}(w^{T}R) or \bar{p}_{x}(w^{T}R). To formulate such a problem using a given distribution, we begin by defining a set of qualified distributions which we will consider. These qualified distributions satisfy the following set of conditions, which allow us to verify that they make sense in terms of portfolio theory and admit a superquantile/bPOE expression that can be represented in terms of the decision variable w:

Definition 1 (Qualified Distribution)

A qualified distribution \mathcal{D} satisfies the following conditions:
(C1) w^{T}R\sim\mathcal{D}\implies\tilde{q}_{1-\alpha}(w^{T}R)=w^{T}\eta-\sqrt{w^{T}\Sigma w}\,\zeta(\alpha,\Theta), where \zeta(\alpha,\Theta) is a function depending only upon \alpha and possibly a set of fixed parameters \Theta that do not depend on w, \eta is the vector of expected asset returns, and \Sigma is the covariance matrix of asset returns.
(C2) The statistical parameters of the distribution \mathcal{D} must be consistent with the descriptive statistics of real-life asset returns.
(C3) The shape of the PDF of the distribution \mathcal{D} must conform to the shape of the empirical PDF of typical real-life asset returns.

Why should we enforce these preconditions? Condition (C1) guarantees that the superquantile can be expressed in terms of w. This is necessary to express the superquantile optimization problem. For example, if we assume that w^{T}R\sim Logistic(\mu,s), we need to be able to express \mu and s in terms of w. Since \mu=E[w^{T}R]=w^{T}\eta and w^{T}\Sigma w=\sigma^{2}(w^{T}R)=\frac{s^{2}\pi^{2}}{3}, we have

\displaystyle\tilde{q}_{1-\alpha}(R) =\frac{1}{1-\alpha}(E[R]-\alpha\bar{q}_{1-\alpha}(R))=\frac{1}{1-\alpha}\left(\mu-\alpha\left[\mu+\frac{s}{\alpha}(-(1-\alpha)\ln(1-\alpha)-\alpha\ln(\alpha))\right]\right)
=\mu-\frac{s}{1-\alpha}(-\alpha\ln(\alpha)-(1-\alpha)\ln(1-\alpha))
=w^{T}\eta-\sqrt{w^{T}\Sigma w}\,\frac{\sqrt{3}(-\alpha\ln(\alpha)-(1-\alpha)\ln(1-\alpha))}{\pi(1-\alpha)}\;,

which satisfies (C1). Other examples that satisfy this condition are the Laplace, Normal, Exponential, Student-t, Pareto, GPD, and GEV. Note that for the Student-t we assume that the parameter \nu is fixed and the same for all assets, i.e. \Theta=\{\nu\}, and for the GPD/GEV distributions \Theta=\{\xi\}.

Conditions (C2) and (C3) are simple sanity checks on our model assumptions. For example, for the Exponential distribution E[R]=\frac{1}{\lambda}=\sigma(R); however, for real-life asset returns the sample mean is not generally equal to the sample standard deviation. So, the Exponential and Pareto distributions make no sense in portfolio optimization problems even if they satisfy (C1). As for (C3), a distribution is not practical if there is an obvious discrepancy between the shape of its PDF and the shape of the empirical PDF observed for real-life asset returns. The latter is generally bell-shaped or, more likely, inverse-V shaped, and is never shaped like the PDF of an Exponential, Pareto/GPD, or Weibull with k<1.

This leaves us with a set of four elliptical distributions that satisfy all three conditions: Logistic, Laplace, Normal, and Student-t, as well as the non-elliptical GEV distribution. For the latter, with \xi \neq 0, the left superquantile can be expressed as

\tilde{q}_{1-\alpha}(R) = w^T\nu - \sqrt{w^T\Sigma w}\,\frac{\alpha\Gamma(1-\xi) - \Gamma_U\!\left(1-\xi,\, \ln\frac{1}{1-\alpha}\right)}{(1-\alpha)\sqrt{g_2 - g_1^2}},

where \Gamma_U(a,b) = \int_b^\infty p^{a-1} e^{-p}\, dp is the upper incomplete Gamma function and g_k = \Gamma(1 - k\xi).
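Evaluating this GEV expression requires \Gamma_U, which is not available in the Python standard library. As a hedged sketch, it can be approximated by simple midpoint quadrature, truncating the integral at an assumed upper limit (60 here, which suffices for moderate arguments); for a = 1 the result should match the exact value \Gamma_U(1, b) = e^{-b}:

```python
import math

def upper_incomplete_gamma(a, b, n=100_000, cutoff=60.0):
    """Approximate Gamma_U(a, b) = integral_b^inf p^(a-1) e^(-p) dp by the
    midpoint rule on [b, cutoff]; the tail beyond `cutoff` is negligible
    for the moderate arguments used here (an assumption of this sketch)."""
    h = (cutoff - b) / n
    total = 0.0
    for i in range(n):
        p = b + (i + 0.5) * h
        total += p ** (a - 1) * math.exp(-p)
    return total * h

# For a = 1 the integral is exactly e^(-b), giving an easy correctness check.
assert abs(upper_incomplete_gamma(1.0, 0.5) - math.exp(-0.5)) < 1e-3
# And Gamma_U(a, 0) recovers the complete Gamma function.
assert abs(upper_incomplete_gamma(1.5, 0.0) - math.gamma(1.5)) < 1e-3
```

In practice one would use a library routine (e.g. a dedicated special-functions package) rather than this quadrature.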

4.1.2 Superquantile and bPOE Optimization

An alternative to the Markowitz problem is to find the portfolio with the minimal superquantile, problem (3), or minimal bPOE, problem (4):

\min_{w\in\mathbb{R}^n}\; -\tilde{q}_{1-\alpha}(w^T R) \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (3)

\min_{w\in\mathbb{R}^n}\; \bar{p}_x(w^T R) \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (4)

For qualified distributions, however, these problems can be greatly simplified. First, we see that (3) reduces to (5):

\max_{w\in\mathbb{R}^n}\; w^T\eta - \sqrt{w^T\Sigma w}\,\zeta(\alpha,\Theta) \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (5)

Khokhlov (2016) shows that the optimal solution to (5) is the same as the optimal solution to the Markowitz optimization problem (2) with \lambda = \frac{\zeta(\alpha,\Theta)}{2\sigma(w^T R)}. Thus, the superquantile-optimal portfolio is also mean-variance optimal in the Markowitz sense.

Now, for bPOE we see that the picture is actually much simpler. Specifically, we have the following proposition.

Proposition 16

If we assume that w^T R \sim \mathcal{D} and \mathcal{D} is a qualified distribution, then (4) reduces to (6).

\max_{w\in\mathbb{R}^n}\; \frac{w^T\eta + x}{\sqrt{w^T\Sigma w}} \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (6)
Proof

First, note that by the definition of the superquantile, \zeta(\alpha,\Theta) must be an increasing function of \alpha \in [0,1]. Second, since \bar{p}_x(w^T R) = \{1-\alpha \;|\; \tilde{q}_{1-\alpha}(w^T R) = -x\} and \tilde{q}_{1-\alpha}(w^T R) = w^T\eta - \sqrt{w^T\Sigma w}\,\zeta(\alpha,\Theta) for qualified distributions, problem (4) can be rewritten as:

\min_{w\in\mathbb{R}^n,\,\alpha}\; 1-\alpha \quad \text{s.t.}\;\; -w^T\eta + \sqrt{w^T\Sigma w}\,\zeta(\alpha,\Theta) = x,\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (7)

which can then be written as:

\max_{w\in\mathbb{R}^n,\,\alpha}\; \alpha \quad \text{s.t.}\;\; \zeta(\alpha,\Theta) = \frac{w^T\eta + x}{\sqrt{w^T\Sigma w}},\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (8)

Finally, since \zeta(\alpha,\Theta) is an increasing function of \alpha and \Theta does not depend upon w, maximizing \alpha is equivalent to maximizing \frac{w^T\eta + x}{\sqrt{w^T\Sigma w}}, so we can formulate the maximization as (6) without changing the optimal w. ∎

This proposition has an important implication for portfolio theory: given a fixed threshold x, the bPOE-optimal portfolio is the same for every qualified distribution; the same portfolio simultaneously has the lowest bPOE under all of these distributional assumptions. The fact that bPOE optimization is, in this sense, independent of distributional assumptions makes it preferable to superquantile optimization.
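To illustrate Proposition 16, the following sketch solves (6) on a hypothetical two-asset example (the numbers are illustrative, not from Table 1) by a simple grid scan over the long-only simplex; a nonlinear solver would be used in practice. Note that the objective contains no \zeta(\alpha,\Theta) term, so the resulting portfolio is optimal for every qualified distribution simultaneously:

```python
import math

# Hypothetical two-asset example (illustrative numbers only).
eta = [0.10, 0.14]                      # expected returns
cov = [[0.019, 0.012], [0.012, 0.030]]  # covariance matrix
x = 0.16                                # bPOE loss threshold

def objective(w1, x):
    """Ratio (w^T eta + x) / sqrt(w^T Sigma w) from problem (6)."""
    w = [w1, 1.0 - w1]
    mean = w[0] * eta[0] + w[1] * eta[1]
    var = sum(w[i] * cov[i][j] * w[j] for i in range(2) for j in range(2))
    return (mean + x) / math.sqrt(var)

# Grid scan over w1 in [0, 1] with w2 = 1 - w1 (long-only, fully invested).
# The argmax does not involve any distribution-dependent quantity, so it is
# the same portfolio under Normal, Student-t, Laplace, or Logistic assumptions.
best_w1 = max((i / 1000 for i in range(1001)), key=lambda w1: objective(w1, x))
```

Changing the assumed distribution would only change the optimal bPOE value (via \zeta), not the optimal weights.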

4.1.3 Numerical Demonstration

In this example, we consider a global equity portfolio that consists of 6 market portfolios - U.S., Japan, U.K., Germany, France, and Switzerland, represented by the corresponding MSCI indices - MXUS, MXJP, MXGB, MXDE, MXFR, MXCH. Parameters of returns for portfolio components are provided in Table 1 (source: Capital IQ sample of monthly returns from April 1987 to April 1996, annualized).

Table 1: Portfolio Return Data
Asset ticker | Expected return | Standard deviation | Correlations (MXUS, MXJP, MXGB, MXDE, MXFR, MXCH)
MXUS 10.25% 13.79% 1 0.190041 0.639133 0.481857 0.499406 0.605384
MXJP 6.90% 26.05% 0.190041 1 0.450337 0.251601 0.378753 0.373964
MXGB 8.81% 19.16% 0.639133 0.450337 1 0.579918 0.584215 0.654687
MXDE 9.15% 20.31% 0.481857 0.251601 0.579918 1 0.753072 0.628426
MXFR 8.83% 20.40% 0.499406 0.378753 0.584215 0.753072 1 0.580626
MXCH 13.85% 17.45% 0.605384 0.373964 0.654687 0.628426 0.580626 1

This problem was solved using a non-linear programming algorithm; the results are provided in Table 2. The corresponding values of \lambda are also provided, which allows deriving the same portfolios with a standard MVO solver based on quadratic programming.

Table 2: Optimal Superquantile (CVaR) Portfolios
Asset ticker | Min-risk portfolio | CVaR 99% optimal portfolios: normal, t (df=3), Laplace, logistic | CVaR 95% optimal portfolios: normal, t (df=3), Laplace, logistic
MXUS 70.99% 65.80% 67.59% 67.03% 66.53% 64.23% 64.78% 65.05% 64.64%
MXJP 13.98% 9.61% 11.11% 10.64% 10.21% 8.28% 8.74% 8.97% 8.62%
MXGB 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXDE 9.24% 2.87% 5.07% 4.37% 3.76% 0.95% 1.61% 1.94% 1.44%
MXFR 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXCH 5.79% 21.72% 16.22% 17.96% 19.50% 26.54% 24.87% 24.04% 25.30%
Return 9.89% 10.68% 10.40% 10.49% 10.57% 10.91% 10.83% 10.79% 10.85%
St.dev. 12.86% 13.01% 12.93% 12.95% 12.97% 13.11% 13.08% 13.06% 13.09%
λ 20.48 31.28 26.82 23.80 15.73 17.11 17.88 16.73

Table 2 shows that the superquantile-optimal portfolios are not the same as the global minimum variance portfolio (the min-risk portfolio), but are quite close to it. Distributional assumptions play a role in the composition of the optimal portfolios, with the Student-t distribution producing the most conservative allocation for CVaR 99%. The differences between the optimal portfolios for CVaR 95%, however, are insignificant.

We can also note from (5) that if the portfolio return is constrained from below, then unless this constraint is very close to the return of the global minimum variance portfolio, superquantile optimization becomes essentially equivalent to variance minimization. Similarly, if risk is constrained from above, superquantile optimization becomes equivalent to return maximization.

Using the same set of assets, we also solved the bPOE optimization problem (4) with thresholds x = 0.16 and x = 0.25 (i.e., losses exceeding 16% and 25% of the initial portfolio value, respectively), l = 0, and u = 1. Table 3 shows the results: the minimal bPOE achieved, the optimal portfolio composition and parameters, and the CVaR of the optimal portfolios under each distribution.

Table 3: Optimal bPOE Portfolios
Assumed distribution normal t (df=3) Laplace logistic normal t (df=3) Laplace logistic
bPOE threshold, x 16% 25%
bPOE value, \bar{p}_x(X) 5.13% 6.21% 7.46% 6.36% 0.80% 2.93% 2.81% 1.86%
Asset ticker bPOE-optimal portfolio composition
MXUS 64.20% 64.19% 64.20% 64.20% 65.95% 65.95% 65.95% 65.95%
MXJP 8.26% 8.27% 8.25% 8.25% 9.73% 9.73% 9.73% 9.73%
MXGB 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXDE 0.90% 0.91% 0.90% 0.90% 3.05% 3.05% 3.06% 3.05%
MXFR 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXCH 26.64% 26.63% 26.64% 26.65% 21.27% 21.27% 21.27% 21.27%
Return 10.92% 10.92% 10.92% 10.92% 10.65% 10.65% 10.65% 10.65%
St.dev. 13.12% 13.12% 13.12% 13.12% 13.00% 13.00% 13.00% 13.00%
Test distribution CVaR for the test distributions
normal 16.00% 14.93% 13.87% 14.79% 25.00% 18.95% 19.16% 21.16%
t (df=3) 18.14% 16.00% 14.05% 15.74% 46.31% 25.00% 25.56% 31.46%
Laplace 19.48% 17.70% 16.00% 17.48% 36.62% 24.61% 25.00% 28.79%
logistic 17.61% 16.18% 14.81% 16.00% 31.14% 21.71% 22.01% 25.00%

All optimal solutions were generated by the non-linear programming algorithm under different distributional assumptions, and the results support the conclusion that the optimal portfolio composition does not depend on the distribution (the small discrepancies are due to the accuracy of the optimization algorithm).

5 Parametric Density Estimation with Superquantiles

One of the motivations for providing closed-form superquantile formulas is that they can be used within common parametric estimation frameworks. The Exponential, Pareto/GPD, Laplace, Normal, LogNormal, Logistic, Student-t, Weibull, LogLogistic, and GEV represent a wide range of distributions that can now be utilized within these parametric procedures, but with superquantiles incorporated into the fitting criteria. We illustrate this idea by proposing a simple variation of the Method of Moments (MM), which we call the Method of Superquantiles (MOS), where superquantiles at varying levels of \alpha take the place of moments. Our numerical example utilizes a heavy-tailed Weibull to illustrate MOS, since it is particularly well-suited for asymmetric heavy-tailed data. However, any of the listed distributions could be used as well.

5.1 Method of Superquantiles

The MM is a well-known tool for estimating the parameters of a distribution when moments are available in parametric form and the target moments are either assumed to be known or measured from empirical observations. It looks for the distribution f_\Theta(x), parameterized by \Theta, whose moments equal the known or, if unknown, the empirical moments. With n moments used, the problem reduces to solving a system of n equations w.r.t. the parameter set \Theta of the distribution family.

This method can be generalized by replacing moments with other distributional characteristics, such as the superquantile and quantile. We utilize superquantiles in this context. The method provides flexibility through the choice of different \alpha values, allowing the user to focus the fitting procedure on particular portions of the distribution. This flexibility is advantageous compared to methods such as MM or maximum likelihood (ML), which treat each portion of the distribution equally. When fitting the tail is important, for example, and there are many samples around the mean but few in the tail, it can be desirable to focus the fitting procedure on carefully fitting the tail samples. As will be shown, one can focus MOS by the choice of \alpha. One will see that this procedure is similar to fitting with Probability Weighted Moments (PWM), also sometimes called L-moments, but MOS is much more straightforward, with superquantiles being far easier to interpret than PWMs.

We formulate the following problem, where \hat{\bar{q}}_\alpha(X) denotes either a known superquantile or an empirical estimate from a sample of X, and \bar{q}_\alpha(X_{f_\Theta}) denotes the parametric superquantile formula when X has density function f_\Theta with parameter set \Theta:

Method of Superquantiles: Fix \alpha_1, \dots, \alpha_k \in [0,1] and choose a parametric distribution family f_\Theta with parameters \Theta. Solve for \Theta such that

\bar{q}_{\alpha_i}(X_{f_\Theta}) = \hat{\bar{q}}_{\alpha_i}(X) \text{ for all } i = 1, \dots, k,

which is a system of k equations in |\Theta| unknowns.
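As a minimal illustration of MOS with k = 1, consider the Exponential distribution, whose right-tail superquantile is \bar{q}_\alpha(X) = (1 - \ln(1-\alpha))/\lambda; matching a single empirical superquantile then yields \lambda in closed form. The sketch below (function names are ours) recovers the rate from simulated data:

```python
import math
import random

def empirical_superquantile(sample, alpha):
    """Average of the largest (1 - alpha)-fraction of the sample."""
    tail = sorted(sample)[int(alpha * len(sample)):]
    return sum(tail) / len(tail)

# MOS with k = 1 for Exponential(lam): the parametric superquantile is
# qbar_alpha = (1 - ln(1 - alpha)) / lam, so matching one empirical
# superquantile gives lam directly.
random.seed(0)
true_lam, alpha = 2.0, 0.9
sample = [random.expovariate(true_lam) for _ in range(20_000)]
lam_hat = (1 - math.log(1 - alpha)) / empirical_superquantile(sample, alpha)
assert abs(lam_hat - true_lam) < 0.1
```

With more parameters (|\Theta| > 1), one would match superquantiles at several levels \alpha_1, \dots, \alpha_k, as in the definition above.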

This system, however, may not have a solution; for example, when k = 2 and the parametric family has only a single parameter (i.e., |\Theta| = 1). In this case, one can solve the following surrogate least-squares minimization problem:

LS Method of Superquantiles (LS-MOS): Fix \alpha_1, \dots, \alpha_k \in [0,1] and choose a parametric distribution family f_\Theta with parameters \Theta. Choose weights c_1, \dots, c_k > 0 and solve

\Theta \in \underset{\Theta}{\operatorname{argmin}}\ \sum_i c_i \left( \bar{q}_{\alpha_i}(X_{f_\Theta}) - \hat{\bar{q}}_{\alpha_i}(X) \right)^2.

This procedure finds the distribution whose superquantiles are close to the empirical superquantiles. The freedom to select the \alpha_i as well as the weights c_i gives the user considerable flexibility over which portions of the distribution should match the empirical superquantiles most closely.

5.1.1 Example Customization: Conservative Tail Fitting

When the sample size is small and the tail of the distribution at hand is long, the tail will likely be difficult to characterize from empirical data, since (with high probability) few observations will fall in the tail. The proposed method of superquantiles, however, can easily be made more conservative relative to the empirical data in an intuitive way. For example, one could impose the following condition, where \epsilon_i is a pre-specified constant such that 0 < \epsilon_i \le \alpha_i:

\bar{q}_{\alpha_i - \epsilon_i}(X_{f_\Theta}) = \hat{\bar{q}}_{\alpha_i}(X).

Or, for the least-squares variant, one can solve the problem

\min_\Theta \sum_i c_i \left( \bar{q}_{\alpha_i - \epsilon_i}(X_{f_\Theta}) - \hat{\bar{q}}_{\alpha_i}(X) \right)^2.

Notice that these conditions effectively assume that the empirical superquantile has underestimated the true tail expectation, which is often the case with heavy-tailed distributions.

5.1.2 Example: Weibull Distribution Fitting

We illustrate the basic method by fitting a Weibull distribution, with \Theta = (\lambda, k), from a small sample of 50 observations. We took two independent samples, denoted S_1 and S_2, of size 50 from a Weibull with \lambda = .5, k = 1.4. We then estimated the Weibull parameters using MM, ML, and LS-MOS. The MM was solved using the first two moments. The LS-MOS was solved twice. It was first solved with \alpha_1 = .15, \alpha_2 = .75, c_1 = c_2 = 1, a choice made to mimic the behavior of MM and ML, where the fit emphasizes most of the observed data. To put more emphasis on the tail observations, it was also solved with \alpha_1 = .5, \alpha_2 = .75, \alpha_3 = .95, c_1 = c_2 = c_3 = 1. We denote these solutions as LS1 and LS2, respectively. The ML solution is available in closed form, and we solved MM, LS1, and LS2 using Scipy's optimization library (specifically, the leastsq function, which implements MINPACK's lmdif routine; this routine requires function values and approximates the Jacobian by forward differences).
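The following is a hedged, stdlib-only sketch of the LS2 variant of LS-MOS (our illustration; the actual experiments used Scipy's leastsq). Here the parametric Weibull superquantile is computed by numerically averaging the quantile q_p = \lambda(-\ln(1-p))^{1/k} over the tail, and the least-squares problem is solved by a coarse grid search:

```python
import math
import random

def weibull_superquantile(lam, k, alpha, n=500):
    """Right-tail superquantile (CVaR) of Weibull(lam, k): numerically average
    the quantile q_p = lam * (-ln(1 - p))^(1/k) over p in (alpha, 1)."""
    total = 0.0
    for i in range(n):
        p = alpha + (i + 0.5) / n * (1 - alpha)
        total += lam * (-math.log(1 - p)) ** (1 / k)
    return total / n

def empirical_superquantile(sample, alpha):
    tail = sorted(sample)[int(alpha * len(sample)):]
    return sum(tail) / len(tail)

# Sample mimicking the experiment: 50 draws from Weibull(lam=.5, k=1.4).
random.seed(1)
sample = [random.weibullvariate(0.5, 1.4) for _ in range(50)]

# LS2 setup: alpha_1=.5, alpha_2=.75, alpha_3=.95 with unit weights.
alphas, c = [0.5, 0.75, 0.95], [1.0, 1.0, 1.0]
targets = [empirical_superquantile(sample, a) for a in alphas]

def loss(lam, k):
    return sum(ci * (weibull_superquantile(lam, k, a) - t) ** 2
               for ci, a, t in zip(c, alphas, targets))

# Coarse grid search in place of a least-squares solver.
lam_hat, k_hat = min(((l / 100, kk / 100)
                      for l in range(20, 101, 2)      # lam in [0.20, 1.00]
                      for kk in range(60, 301, 5)),   # k in [0.60, 3.00]
                     key=lambda th: loss(*th))
```

With only 50 observations the estimates vary from sample to sample, which is exactly the behavior the figures below examine.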

Looking at Figure 2 for S_1 and Figure 3 for S_2, we see that the LS1 fit is, indeed, much like the MM and ML fits for both data sets. However, the LS2 fit is the best in both cases. The ML, MM, and LS1 methods put too much emphasis on the observations around the mode, while the LS2 fit puts appropriate emphasis on the less frequent observations in the tail.

It is also important to notice how the differences between S_1 and S_2 affect the fit from each method. Comparing the two samples, we see that they differ in the observed density over the lower portion of the range. This is directly reflected in the fits given by MM, ML, and LS1: compared to their fits on S_1, they more heavily favor the left side of the distribution. The LS2 fit, however, is robust to these differences between data sets and, by focusing on the tail, remains mostly unchanged from the fit on S_1. This is the intended effect of selecting larger values of \alpha in LS2.

We duplicated this procedure on a heavier-tailed Weibull. We took 50 samples from a Weibull with true parameters k = 1, \lambda = .5 and fit MM, ML, LS1, and LS2 using the empirical data. Figures 4 and 5 highlight different aspects of the resulting fits. LS2 clearly provides the best fit, with Figure 5 in particular showing that MM, ML, and LS1 underestimate the tail densities; these methods put more emphasis on fitting the observations around the mode. As intended, LS2 focuses more on fitting the right-tail observations and arrives at a better fit.

Figure 2: Fits using sample S_1. PDFs displayed with the normalized histogram of the S_1 sample in the background.
Figure 3: Fits using sample S_2. PDFs displayed with the normalized histogram of the S_2 sample in the background.
Figure 4: Left side of the distribution for fits on the sample from the Weibull with true k = 1, \lambda = .5.
Figure 5: Right side of the distribution for fits on the sample from the Weibull with true k = 1, \lambda = .5.

5.1.3 Constrained Likelihood and Entropy Maximization

While we focused primarily on a variant of the method of moments, the formulas provided for superquantiles and bPOE can be used in other parametric procedures. For example, one could consider a constrained variant of the maximum likelihood or maximum entropy method, where superquantile constraints are introduced. Letting H(f_\Theta) denote the entropy of the random variable with density function f_\Theta, and y_i denote an observation, constrained maximum likelihood and entropy maximization can be set up as follows:

ML: \max_{f_\Theta \in \mathcal{F}} \sum_i \log(f_\Theta(y_i)), \qquad ME: \max_{f_\Theta \in \mathcal{F}} H(f_\Theta)

where \mathcal{F} = \{ f_\Theta \;|\; \bar{q}_{\alpha_i}(X_{f_\Theta}) \le \hat{\bar{q}}_{\alpha_i}(X) \;\; \forall i = 1, \dots, k \}.

While we leave full exploration of this framework for future work, this simple formulation illustrates another potential use for the provided superquantile and bPOE formulas within traditional parametric frameworks.

6 Conclusion

In this paper, we first derived closed-form formulas for the superquantile and bPOE, then utilized them within parametric portfolio optimization and density estimation problems. We derived superquantile formulas for a variety of distributions, including ones with exponential tails (Exponential, Pareto/GPD, Laplace), symmetric distributions (Normal, Laplace, Logistic, Student-t), and asymmetric distributions with heavy tails (LogNormal, Weibull, LogLogistic, GEV). For bPOE, while we had less success deriving truly closed-form expressions, we saw that it can still be calculated by solving a one-dimensional convex optimization problem or a one-dimensional root-finding problem.

We then utilized these formulas to develop two parametric procedures, one in portfolio optimization and one in density estimation. We first found that the Normal, Laplace, Student-t, Logistic, and GEV distributions all yield tractable superquantile and bPOE portfolio optimization problems. Furthermore, we found that bPOE-optimal portfolios are more robust to changing distributional assumptions than superquantile-optimal portfolios; specifically, bPOE-optimal portfolios are optimal, simultaneously, for an entire class of distributions. Finally, we presented a variation on the method of moments where moments are replaced by superquantiles. This parametric procedure is made possible by our closed-form formulas, and we illustrated its use on heavy-tailed asymmetric data, where additional emphasis on fitting the tail via superquantile conditions can be highly desirable. We find that this method makes it easy to direct the focus of the fitting procedure toward tail samples.

References

  • Andreev et al. (2005) Andreev A, Kanto A, Malo P (2005) On closed-form calculation of CVaR. Helsinki School of Economics Working Paper W-389
  • Artzner et al. (1999) Artzner P, Delbaen F, Eber JM, Heath D (1999) Coherent measures of risk. Mathematical Finance 9:203–228
  • Davis and Uryasev (2016) Davis JR, Uryasev S (2016) Analysis of tropical storm damage using buffered probability of exceedance. Natural Hazards 83(1):465–483
  • Karian and Dudewicz (1999) Karian ZA, Dudewicz EJ (1999) Fitting the generalized lambda distribution to data: a method based on percentiles. Communications in Statistics-Simulation and Computation 28(3):793–819
  • Khokhlov (2016) Khokhlov V (2016) Portfolio value-at-risk optimization. Wschodnioeuropejskie Czasopismo Naukowe 13(2):107–113
  • Landsman and Valdez (2003) Landsman ZM, Valdez EA (2003) Tail conditional expectation for elliptical distributions. North American Actuarial Journal 7(4):55–71
  • Mafusalov and Uryasev (2018) Mafusalov A, Uryasev S (2018) Buffered probability of exceedance: Mathematical properties and optimization. SIAM Journal on Optimization 28(2):1077–1103
  • Mafusalov et al. (2018) Mafusalov A, Shapiro A, Uryasev S (2018) Estimation and asymptotics for buffered probability of exceedance. European Journal of Operational Research 270(3):826–836
  • Norton and Uryasev (2016) Norton M, Uryasev S (2016) Maximization of AUC and buffered AUC in binary classification. Mathematical Programming pp 1–38
  • Norton et al. (2017) Norton M, Mafusalov A, Uryasev S (2017) Soft margin support vector classification as buffered probability minimization. The Journal of Machine Learning Research 18(1):2285–2327
  • Rockafellar and Royset (2010) Rockafellar R, Royset J (2010) On buffered failure probability in design and optimization of structures. Reliability Engineering & System Safety 95:499–510
  • Rockafellar and Uryasev (2000) Rockafellar R, Uryasev S (2000) Optimization of conditional value-at-risk. The Journal of Risk 2(3):21–41
  • Rockafellar and Royset (2014) Rockafellar RT, Royset JO (2014) Random variables, monotone relations, and convex analysis. Mathematical Programming 148(1-2):297–331
  • Rockafellar and Uryasev (2002) Rockafellar RT, Uryasev S (2002) Conditional value-at-risk for general loss distributions. Journal of banking & finance 26(7):1443–1471
  • Sgouropoulos et al. (2015) Sgouropoulos N, Yao Q, Yastremiz C (2015) Matching a distribution by matching quantiles estimation. Journal of the American Statistical Association 110(510):742–759
  • Shang et al. (2018) Shang D, Kuzmenko V, Uryasev S (2018) Cash flow matching with risks controlled by buffered probability of exceedance and conditional value-at-risk. Annals of Operations Research 260(1-2):501–514
  • Uryasev (2014) Uryasev S (2014) Buffered probability of exceedance and buffered service level: Definitions and properties. Department of Industrial and Systems Engineering, University of Florida, Research Report 3