
Matthew Norton, Naval Postgraduate School, Operations Research Department. Email: mnorton@nps.edu
Valentyn Khokhlov. Email: vkhokhlov.embals2016@london.edu
Stan Uryasev, University of Florida, Department of Industrial and Systems Engineering, Risk Management and Financial Engineering Laboratory. Email: uryasev@ufl.edu

Calculating CVaR and bPOE for Common Probability Distributions With Application to Portfolio Optimization and Density Estimation

Matthew Norton     Valentyn Khokhlov     Stan Uryasev
Abstract

Conditional Value-at-Risk (CVaR) and Value-at-Risk (VaR), also called the superquantile and quantile, are frequently used to characterize the tails of a probability distribution and are popular measures of risk in applications where the distribution represents the magnitude of a potential loss. Buffered Probability of Exceedance (bPOE) is a recently introduced characterization of the tail which is the inverse of CVaR, much like the CDF is the inverse of the quantile. These quantities can prove very useful as the basis for a variety of risk-averse parametric engineering approaches. Their use, however, is often made difficult by the lack of well-known closed-form equations for calculating these quantities for commonly used probability distributions. In this paper, we derive formulas for the superquantile and bPOE for a variety of common univariate probability distributions. Besides providing a useful collection within a single reference, we use these formulas to incorporate the superquantile and bPOE into parametric procedures. In particular, we consider two: portfolio optimization and density estimation. First, when portfolio returns are assumed to follow particular distribution families, we show that finding the optimal portfolio via minimization of bPOE has advantages over superquantile minimization. We show that, given a fixed threshold, a single portfolio is the minimal bPOE portfolio for an entire class of distributions simultaneously. Second, we apply our formulas to parametric density estimation and propose the method of superquantiles (MOS), a simple variation of the method of moments (MM) where moments are replaced by superquantiles at different confidence levels. With the freedom to select various combinations of confidence levels, MOS allows the user to focus the fitting procedure on different portions of the distribution, such as the tail when fitting heavy-tailed asymmetric data.

Keywords:
Conditional Value-at-Risk · Buffered Probability of Exceedance · Superquantile · Density Estimation · Portfolio Optimization

1 Introduction

When faced with randomness and uncertainty, some of the most popular techniques for dealing with such randomness are parametric in nature. Given a real-valued random variable $X$, analysis can be greatly simplified if one assumes that $X$ belongs to a specific parametric family of distributions. For example, the Method of Moments (MM) is one of the simplest and most widely used methods for parametric density estimation. These techniques, however, often require that certain characteristics of the distribution family be representable by a simple, ideally closed-form, expression. For example, traditional MM uses closed-form expressions for the moments of the parametric distribution family. Similarly, the Matching of Quantiles (MOQ) procedure (see e.g., Sgouropoulos et al. (2015); Karian and Dudewicz (1999)) uses expressions for the quantile function. In portfolio optimization, the availability of simple expressions for the mean and variance of portfolio returns yields a tractable Markowitz portfolio optimization problem (see Section 4 for specifics). For a variety of problems, application of a parametric method relies upon the availability of a closed-form expression for a specific characteristic of the parametric family of interest.

Luckily, for a variety of distributions, closed-form expressions are available for commonly utilized characteristics. These include characteristics such as the moments, the quantile, and the CDF. Over the past two decades, however, new fundamental characteristics like the superquantile have emerged from the field of quantitative risk management, with important applications across engineering fields like financial, civil, and environmental engineering (see e.g., Rockafellar and Royset (2010); Rockafellar and Uryasev (2000); Davis and Uryasev (2016)). Closed-form expressions for these characteristics, for a large variety of common parametric distribution families, have not been widely disseminated. While emerging from specific engineering applications, some of these characteristics are very general and can be viewed as fundamental aspects of a random variable, just like the mean or quantile. Thus, utilization of these characteristics within parametric methods is a natural consideration. To facilitate their use, however, we must develop closed-form expressions. (When closed-form expressions are not available, we provide simple calculation methods that can still be utilized within parametric methods.)

We focus on developing these expressions for the superquantile and Buffered Probability of Exceedance (bPOE) for a variety of distribution families. Developments in financial risk theory over the last two decades have heavily emphasized measurement of tail risk. After Artzner et al. (1999) introduced the concept of a coherent risk measure, Rockafellar and Uryasev (2000) introduced the superquantile, also called Conditional Value-at-Risk (CVaR) in the financial literature, which has come to be considered a preferable characterization of tail risk compared to the quantile, or Value-at-Risk (VaR). While some closed-form expressions are available for using the superquantile within parametric procedures, see e.g., Rockafellar and Uryasev (2000); Landsman and Valdez (2003); Andreev et al. (2005), the variety of distributions discussed within each of these sources is limited.

We illustrate that for a variety of common distributions, straightforward techniques such as integration of the quantile function yield a closed-form expression for the superquantile that is easy to use within subsequent parametric methods. We attempt to include a variety, providing superquantile formulas for the Exponential, Pareto/Generalized Pareto (GPD), Laplace, Normal, LogNormal, Logistic, LogLogistic, Generalized Student-t, Weibull, and Generalized Extreme Value (GEV) distributions. These provide examples varying from the exponentially tailed (Exponential, Pareto/GPD, Laplace), to the symmetric (Normal, Laplace, Logistic, Student-t), to the asymmetric heavier-tailed (Weibull, LogLogistic, GEV) distributions. While some of these formulas may exist elsewhere, we hope that this paper serves as a good resource for practitioners in search of superquantile formulas.

While the superquantile has risen in popularity over the past decade, a related characteristic called Buffered Probability of Exceedance (bPOE) has recently been introduced, first by Rockafellar and Royset (2010) in the context of Buffered Failure Probability and then generalized by Mafusalov and Uryasev (2018). This concept has grown in popularity within the risk management community, with applications in finance, logistics, analysis of natural disasters, statistics, stochastic programming, and machine learning (Shang et al. (2018); Uryasev (2014); Davis and Uryasev (2016); Mafusalov et al. (2018); Norton et al. (2017); Norton and Uryasev (2016)). Specifically, bPOE is the inverse of the superquantile in the same way that the CDF is the inverse of the quantile. However, much like the superquantile when compared against the quantile, bPOE has many mathematically advantageous properties over the traditionally used Probability of Exceedance (POE). Direct optimization of bPOE often reduces to convex or linear programming, it can be calculated via a one-dimensional convex optimization problem, and it provides a risk-averse probabilistic assessment of the risk of experiencing outcomes larger than some fixed upper threshold. Thus, the second aim of this paper is to provide closed-form expressions for bPOE and, when unable to do so, show that calculation of bPOE is still simple, reducing to a one-dimensional convex optimization problem or a one-dimensional root-finding problem. For the parametric portfolio application, in particular, we will see that when closed-form bPOE is unavailable but the superquantile is available, finding the optimal bPOE portfolio is no more difficult, computationally, than finding the optimal superquantile (CVaR) portfolio.

Motivating us to derive closed-form expressions (or simple calculation formulas) for the superquantile and bPOE for common distributions is the inclusion of these risk-averse tail measurements within parametric methods. In particular, we explore the use of the superquantile and bPOE within parametric portfolio optimization and density estimation. First, we consider parametric portfolio optimization, where returns are assumed to follow a specific distribution and, using these assumptions, a tractable portfolio optimization problem is formulated and solved. We begin by narrowing our choices of distribution to only those that both fit the pattern of portfolio returns and generate tractable portfolio optimization problems. Then, we consider two companion problems: solving for portfolios that minimize the superquantile (CVaR) of the distribution of potential losses (i.e., the average of the worst-case $100(1-\alpha)\%$ scenarios) and portfolios that minimize bPOE of the loss distribution (i.e., the buffered probability that losses will exceed a fixed upper threshold $x$). In comparing these problems, we discover that bPOE optimization can often be highly preferable to superquantile (CVaR) optimization in the parametric context. Specifically, for fixed $\alpha$, the portfolio that minimizes the superquantile depends upon the distributional assumption (i.e., even if $\alpha$ is fixed, changing the assumed parametric distribution for returns will change the contents of the optimal portfolio). However, for fixed threshold $x$, the portfolio that minimizes bPOE does not depend upon the distributional assumption (at least for the specific class of distributions we consider, which includes the Logistic, Laplace, Normal, Student-t, and GEV). In other words, no matter which of these distributions we choose, we will always achieve the same optimal portfolio for a fixed value of threshold $x$.
Thus, bPOE-based portfolio optimization can provide additional consistency with respect to parameter choices, eliminating one source of additional variability for the decision maker.

Finally, we consider parametric density estimation, proposing a variant of MM where moments are replaced by superquantiles. This can also be seen as a natural variation of the MOQ procedure where quantiles are replaced by superquantiles. Made possible by the closed-form superquantile expressions, this framework allows one to perform density estimation flexibly, letting the user focus the fitting procedure on specific portions of the distribution. For example, we illustrate by fitting a Weibull distribution with additional emphasis placed on estimating the right tail. Compared against traditional MM and maximum likelihood (ML), we obtain a better fit in such asymmetric, heavy-tailed situations.

1.1 Organization of Paper

We first provide a brief introduction to superquantiles and bPOE in Section 1.2. In Section 2, we give formulas for both the superquantile and bPOE for the Exponential, Pareto, Generalized Pareto, and Laplace distributions. Along the way, we highlight some simple relationships between POE, bPOE, the quantile, and the superquantile. In Section 3, we treat distributions for which a closed-form superquantile formula exists, but for which we are unable to derive a simple closed-form bPOE formula. In order of appearance, we consider the Normal, LogNormal, Logistic, Generalized Student-t, Weibull, LogLogistic, and Generalized Extreme Value distributions. We point out that, because a formula for the superquantile is known in these cases, bPOE can be obtained via a simple root-finding problem; we also illustrate for some cases that the one-dimensional convex optimization formula for bPOE can be used. In Section 4, we illustrate the use of these formulas in portfolio optimization and parametric distribution approximation.

1.2 Background and Notation

When working with optimization of tail probabilities, one frequently works with constraints or objectives involving the probability of exceedance (POE), $p_{x}(X)=P(X>x)$, or its associated quantile $q_{\alpha}(X)=\min\{x\,|\,P(X\leq x)\geq\alpha\}$, where $\alpha\in[0,1]$ is a probability level. The quantile is a popular measure of tail risk in financial engineering, but when included in optimization problems via constraints or objectives, it is quite difficult to treat with continuous (linear or non-linear) optimization techniques.

A significant advancement was made in Rockafellar and Uryasev (2000, 2002) with the development of a replacement called the superquantile or CVaR. The superquantile is a measure of uncertainty similar to the quantile, but with superior mathematical properties. Formally, the superquantile (CVaR) for a continuously distributed $X$ is defined as,

\bar{q}_{\alpha}(X)=E\left[X\,|\,X>q_{\alpha}(X)\right]=\frac{1}{1-\alpha}\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx=\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp.

Similar to $q_{\alpha}(X)$, the superquantile can be used to assess the tail of the distribution. The superquantile, though, is far easier to handle in optimization contexts. It also has the important property that it considers the magnitude of events within the tail. Therefore, in situations where a distribution may have a heavy tail, the superquantile accounts for the magnitudes of low-probability large-loss tail events, while the quantile does not account for this information.

The notion of buffered probability was originally introduced by Rockafellar and Royset (2010) in the context of the design and optimization of structures as the Buffered Probability of Failure (bPOF). Working to extend this concept, bPOE was developed as the inverse of the superquantile by Mafusalov and Uryasev (2018), in the same way that POE is the inverse of the quantile. Specifically, for continuously distributed $X$, bPOE at threshold $x$ is defined in the following way, where $\sup X$ denotes the essential supremum of the random variable $X$ and the threshold $x\in[E[X],\sup X]$.

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x\}\;.

In words, bPOE calculates one minus the probability level at which the superquantile, the tail expectation, equals the threshold $x$. Roughly speaking, bPOE calculates the proportion of worst-case outcomes which average to $x$. Figure 1 presents an illustration of bPOE for a Lognormal random variable $X$. We note that there exist two slightly different variants of bPOE, called Upper and Lower bPOE, which are identical for continuous random variables. In this paper, we utilize only continuous random variables. For the interested reader, details regarding the difference between Upper and Lower bPOE can be found in Mafusalov and Uryasev (2018).

Figure 1: The Probability Density Function (PDF) of $X\sim\mathrm{Lognormal}(\sigma=1,\mu=0)$. Given threshold $z\in\mathbb{R}$, POE equals $P(X>z)$, the cumulative density in red. For the same threshold $z$, bPOE equals $\bar{p}_{z}(X)$, the combined cumulative density in red and blue. By definition, the expectation of the worst-case $1-\alpha=\bar{p}_{z}(X)$ outcomes equals $z=\bar{q}_{\alpha}(X)$. These worst-case outcomes are those larger than the quantile $q_{\alpha}(X)$.

Similar to the superquantile, bPOE is a more robust measure of tail risk, as it considers not only the probability that events/losses will exceed the threshold $x$, but also the magnitude of these potential events. Also, much like the superquantile, bPOE can be represented as the unique minimum of a one-dimensional convex optimization problem, with the formulas given by Norton and Uryasev (2016); Mafusalov and Uryasev (2018) as follows, where $[\cdot]^{+}=\max\{\cdot,0\}$.

\bar{p}_{x}(X)=\min_{a\geq 0}E\left[a(X-x)+1\right]^{+}=\min_{\gamma<x}\frac{E[X-\gamma]^{+}}{x-\gamma}\,,
\bar{q}_{\alpha}(X)=\min_{\gamma}\;\gamma+\frac{E[X-\gamma]^{+}}{1-\alpha}\;.

Note that these formulas are valid for general real-valued random variables, not only continuously distributed random variables. It is also useful to note that the argmin of both the bPOE and superquantile optimization formulas gives the quantile. For the bPOE calculation formula, the argmin is $\gamma^{*}=q_{\alpha}(X)$ where $\alpha=1-\bar{p}_{x}(X)$, with $a^{*}=\frac{1}{x-\gamma^{*}}$ for the other representation. For the superquantile calculation formula, the argmin is $\gamma^{*}=q_{\alpha}(X)$ where $\alpha$ is the probability level at which the superquantile is calculated.
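These one-dimensional minimization formulas are straightforward to evaluate on a sample. The following sketch (the function names are ours, and a simple grid search stands in for an exact one-dimensional convex solver) estimates bPOE and the superquantile empirically:

```python
import numpy as np

def bpoe_empirical(sample, x, num_grid=20001):
    """bPOE via  min_{gamma < x}  E[X - gamma]^+ / (x - gamma)."""
    s = np.asarray(sample, dtype=float)
    gammas = np.linspace(s.min(), x - 1e-9, num_grid)
    # E[X - gamma]^+ for every candidate gamma
    excess = np.maximum(s[None, :] - gammas[:, None], 0.0).mean(axis=1)
    return float(min(1.0, (excess / (x - gammas)).min()))

def superquantile_empirical(sample, alpha, num_grid=20001):
    """Superquantile via  min_gamma  gamma + E[X - gamma]^+ / (1 - alpha)."""
    s = np.asarray(sample, dtype=float)
    gammas = np.linspace(s.min(), s.max(), num_grid)
    excess = np.maximum(s[None, :] - gammas[:, None], 0.0).mean(axis=1)
    return float((gammas + excess / (1.0 - alpha)).min())
```

For the sample $\{0,1,2,3\}$, the worst $50\%$ of outcomes average to $2.5$, so the empirical bPOE at threshold $2.5$ is $0.5$; the argmin of either formula recovers the corresponding quantile, as noted above.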

The bPOE concept is also closely related to the concept of a superdistribution function $\bar{F}(x)$, introduced by Rockafellar and Royset (2014). For the CDF, POE equals $P(X>x)=1-F(x)$ and the inverse CDF is given by $F^{-1}(\alpha)=q_{\alpha}(X)$. The superdistribution function $\bar{F}(x)$ is motivated by the inverse relation $\bar{F}^{-1}(\alpha)=\bar{q}_{\alpha}(X)$. Thus, bPOE equals $1-\bar{F}(x)$. The superdistribution function of a random variable $X$ can also be understood as the CDF of an auxiliary random variable $\bar{X}=\bar{q}_{u}(X)$, where $u\sim U(0,1)$ is a uniformly distributed random variable. In this case, $\bar{F}_{X}(x)=F_{\bar{X}}(x)$, where the subscript indicates the distribution function associated with a particular random variable.
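This characterization of $\bar{F}$ suggests a simple Monte Carlo sanity check (a sketch; it borrows the closed-form exponential expressions derived in Section 2.1): sample $u\sim U(0,1)$, form $\bar{X}=\bar{q}_{u}(X)$, and compare the empirical tail probability of $\bar{X}$ with closed-form bPOE.

```python
import numpy as np

# X ~ Exp(lam); Section 2.1 gives q-bar_a(X) = (-ln(1-a) + 1)/lam and
# p-bar_x(X) = e^{1 - lam*x}.  The superdistribution of X is the CDF of
# X-bar = q-bar_u(X) with u ~ U(0,1), so P(X-bar > x) should match bPOE.
rng = np.random.default_rng(0)
lam, x = 1.0, 3.0
u = rng.uniform(size=200_000)
xbar = (-np.log(1.0 - u) + 1.0) / lam   # samples of X-bar
bpoe_mc = float(np.mean(xbar > x))      # Monte Carlo estimate of 1 - F-bar(x)
bpoe_exact = float(np.exp(1.0 - lam * x))
```

With 200,000 draws, the two quantities agree to within ordinary Monte Carlo error.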

2 Distributions With Closed Form Superquantile and bPOE

In this section, we derive closed-form expressions for both the superquantile and bPOE for the Exponential, Pareto, Generalized Pareto, and Laplace distributions. These distributions exhibit a reproducing type of property, where the formula for POE is identical to bPOE up to a constant. The Laplace distribution presents an interesting case in which only the right tail exhibits this reproducing property. Along the way, for completeness, we also highlight relationships between the expressions for bPOE, POE, the superquantile, and the quantile.

2.1 Exponential

For this section, let $X\sim Exp(\lambda)$ be an exponential random variable. Recall that the Exponential parameter has range $\lambda>0$ with $E[X]=\sigma(X)=\frac{1}{\lambda}$, and that the Exponential CDF, PDF, and quantile are given by,

F(x)=\begin{cases}1-e^{-\lambda x}&x\geq 0,\\ 0&x<0,\end{cases}\quad f(x)=\begin{cases}\lambda e^{-\lambda x}&x\geq 0,\\ 0&x<0,\end{cases}\quad q_{\alpha}(X)=\frac{-\ln(1-\alpha)}{\lambda}\;.
Proposition 1

Let $X\sim Exp(\lambda)$. Then,

\bar{q}_{\alpha}(X)=\frac{-\ln(1-\alpha)+1}{\lambda},\quad\bar{p}_{x}(X)=e^{1-\lambda x}\;.
Proof

First, note that $q_{\alpha}(X)=\frac{-\ln(1-\alpha)}{\lambda}$ for an exponential random variable with rate parameter $\lambda$. We then have,

\bar{q}_{\alpha}(X)=\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{-1}{\lambda(1-\alpha)}\int_{\alpha}^{1}\ln(1-p)\,dp
=\frac{-1}{\lambda(1-\alpha)}\int_{1-\alpha}^{0}-\ln(y)\,dy=\frac{-1}{\lambda(1-\alpha)}\int_{0}^{1-\alpha}\ln(y)\,dy

Since $\int\ln(y)\,dy=y\ln(y)-y+C$, we have,

\bar{q}_{\alpha}(X)=\frac{-1}{\lambda(1-\alpha)}\int_{0}^{1-\alpha}\ln(y)\,dy
=\frac{-1}{\lambda(1-\alpha)}\left[(1-\alpha)\ln(1-\alpha)-(1-\alpha)\right]=\frac{-\ln(1-\alpha)+1}{\lambda}

We can then see that,

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x\}
=\{1-\alpha\,|\,\tfrac{-\ln(1-\alpha)+1}{\lambda}=x\}
=\{1-\alpha\,|\,\ln(1-\alpha)=1-\lambda x\}
=\{1-\alpha\,|\,e^{\ln(1-\alpha)}=e^{1-\lambda x}\}=\{1-\alpha\,|\,1-\alpha=e^{1-\lambda x}\}=e^{1-\lambda x}

Next, we relate bPOE and POE as well as the superquantile and quantile.

Corollary 1

Let $X\sim Exp(\lambda)$, with mean $\mu=\frac{1}{\lambda}$. Then, $\bar{p}_{x}(X)=P(X>x-\mu)$ and $\bar{q}_{\alpha}(X)=q_{\alpha}(X)+\mu$.

Proof

We know that $X$, being exponential, has CDF given by $P(X\leq x)=1-e^{-\lambda x}$. From Proposition 1, we know that

\bar{p}_{x}(X)=e^{(1-\lambda x)}=e^{-\lambda\left(\frac{-1}{\lambda}+x\right)}\;.

Then, since $\mu=\frac{1}{\lambda}$, it follows that $\bar{p}_{x}(X)=e^{-\lambda(x-\mu)}=1-P(X\leq x-\mu)=P(X>x-\mu)$. The equality for CVaR follows easily from Proposition 1, since $q_{\alpha}(X)=\frac{-\ln(1-\alpha)}{\lambda}$. ∎
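Proposition 1 and Corollary 1 translate directly into code. The sketch below (the function names are ours) can be used to check both the inverse relationship between the superquantile and bPOE and the shift-by-$\mu$ identity:

```python
import math

def exp_superquantile(alpha, lam):
    # Proposition 1:  q-bar_alpha(X) = (-ln(1-alpha) + 1) / lam
    return (-math.log(1.0 - alpha) + 1.0) / lam

def exp_bpoe(x, lam):
    # Proposition 1:  p-bar_x(X) = e^{1 - lam*x}
    return math.exp(1.0 - lam * x)
```

By construction, `exp_bpoe(exp_superquantile(alpha, lam), lam)` returns `1 - alpha`, and by Corollary 1 the bPOE curve is the POE curve $e^{-\lambda x}$ shifted right by $\mu=1/\lambda$.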

2.2 Pareto

Assume $X\sim Pareto(a,x_{m})$. Recall that the Pareto parameters have range $a>0,\,x_{m}>0$ with $E[X]=\begin{cases}\infty,&a\leq 1,\\ \frac{ax_{m}}{a-1},&a>1,\end{cases}$ and $\sigma^{2}(X)=\begin{cases}\infty,&a\in(0,2],\\ \frac{ax_{m}^{2}}{(a-1)^{2}(a-2)},&a>2,\end{cases}$ and that the Pareto CDF, PDF, and quantile are given by,

F(x)=\begin{cases}1-\left(\frac{x_{m}}{x}\right)^{a}&x\geq x_{m},\\ 0&x<x_{m},\end{cases}\quad f(x)=\begin{cases}\frac{ax_{m}^{a}}{x^{a+1}}&x\geq x_{m},\\ 0&x<x_{m},\end{cases}\quad q_{\alpha}(X)=\frac{x_{m}}{(1-\alpha)^{\frac{1}{a}}}\;.
Proposition 2

Assume $X\sim Pareto(a,x_{m})$ with $a>1$. Then, for $\alpha\in[0,1]$ and $x\geq E[X]$,

\bar{q}_{\alpha}(X)=\frac{x_{m}a}{(1-\alpha)^{\frac{1}{a}}(a-1)}\;,\quad\bar{p}_{x}(X)=\left(\frac{x_{m}a}{x(a-1)}\right)^{a}\;.

Note that if $a\in(0,1]$, then $E[X]=\infty$, implying that $\bar{q}_{\alpha}(X)=\infty$ and $\bar{p}_{x}(X)=1$ for all $\alpha\in[0,1]$ and $x\in\mathbb{R}$.

Proof

First, note that the conditional distribution of a Pareto random variable, conditioned on the event that it is larger than some $\gamma\geq x_{m}$, is simply another Pareto with parameters $a,\gamma$. This implies that $E[X|X>\gamma]=\frac{a\gamma}{a-1}$ if $a>1$; otherwise the expectation is $\infty$. Also, $1-F(\gamma)=\left(\frac{x_{m}}{\gamma}\right)^{a}$. Since,

E[X-\gamma]^{+}=\left(E[X|X>\gamma]-\gamma\right)\left(1-F(\gamma)\right)\;,

we will have that,

E[X-\gamma]^{+}=\left(\frac{a\gamma}{a-1}-\gamma\right)\left(\frac{x_{m}}{\gamma}\right)^{a}\;.

This gives us the bPOE formula,

\bar{p}_{x}(X)=\min_{x_{m}\leq\gamma<x}\frac{\left(\frac{a\gamma}{a-1}-\gamma\right)x_{m}^{a}}{\gamma^{a}(x-\gamma)}
=\min_{x_{m}\leq\gamma<x}\frac{\left(\frac{a}{a-1}-1\right)x_{m}^{a}}{\gamma^{a-1}(x-\gamma)}=\left(\max_{x_{m}\leq\gamma<x}\frac{\gamma^{a-1}(x-\gamma)(a-1)}{x_{m}^{a}}\right)^{-1}

Since $a>1$, the maximization objective is concave over the range $\gamma\in(0,\infty)$, which contains the range $(x_{m},x)$, so we need only take the derivative of the function $g(\gamma)=\frac{\gamma^{a-1}(x-\gamma)(a-1)}{x_{m}^{a}}$ and set it to zero to find the optimal $\gamma$ as follows:

\frac{\partial g}{\partial\gamma}=\frac{x(a-1)^{2}\gamma^{a-2}-(a-1)a\gamma^{a-1}}{x_{m}^{a}}=0\implies x(a-1)^{2}\gamma^{a-2}=(a-1)a\gamma^{a-1}
\implies\frac{x(a-1)}{a}=\gamma

Plugging this value of $\gamma$ into the objective of our bPOE formula yields,

\bar{p}_{x}(X)=\frac{\left(\frac{\frac{ax(a-1)}{a}}{a-1}-\frac{x(a-1)}{a}\right)x_{m}^{a}}{\left(\frac{x(a-1)}{a}\right)^{a}\left(x-\frac{x(a-1)}{a}\right)}
=\left(\frac{x_{m}a}{x(a-1)}\right)^{a}

The superquantile (CVaR) is then equal to the value of $x$ which solves the equation $1-\alpha=\bar{p}_{x}(X)$, or,

1-\alpha=\left(\frac{x_{m}a}{x(a-1)}\right)^{a}\;,

which has solution,

\bar{q}_{\alpha}(X)=\frac{x_{m}a}{(1-\alpha)^{\frac{1}{a}}(a-1)}\;.

Corollary 2

Relating bPOE and POE, as well as the quantile and superquantile, we can say that,

\bar{p}_{x}(X)=P\left(X>\frac{x(a-1)}{a}\right)=P(X>x)\left(\frac{a}{a-1}\right)^{a}\quad\text{and}\quad\bar{q}_{\alpha}(X)=q_{\alpha}(X)\,\frac{a}{a-1}\;.
Proof

This follows from Proposition 2 and the known formulas for POE and the quantile. ∎
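As with the exponential case, Proposition 2 and Corollary 2 can be checked numerically. The sketch below (the function names are ours; valid for $a>1$ and $x\geq E[X]$) implements the Pareto formulas:

```python
def pareto_superquantile(alpha, a, xm):
    # Proposition 2 (a > 1):  q-bar_alpha = xm*a / ((1-alpha)^{1/a} (a-1))
    return xm * a / ((1.0 - alpha) ** (1.0 / a) * (a - 1.0))

def pareto_bpoe(x, a, xm):
    # Proposition 2 (x >= E[X]):  p-bar_x = (xm*a / (x*(a-1)))^a
    return (xm * a / (x * (a - 1.0))) ** a

def pareto_poe(x, a, xm):
    # POE: 1 - F(x) = (xm/x)^a for x >= xm
    return (xm / x) ** a
```

Evaluating `pareto_bpoe` at `pareto_superquantile(alpha, a, xm)` recovers `1 - alpha`, and Corollary 2's proportionality `pareto_bpoe(x) == pareto_poe(x) * (a/(a-1))**a` holds for thresholds above the mean.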

2.3 Generalized Pareto Distribution (GPD)

Assume $X\sim GPD(\mu,s,\xi)$. Recall that the GPD parameters have range $\mu\in\mathbb{R}$, $s>0$, $\xi\in\mathbb{R}$ with $E[X]=\mu+\frac{s}{1-\xi}$ if $\xi<1$ and $\sigma^{2}(X)=\frac{s^{2}}{(1-\xi)^{2}(1-2\xi)}$ if $\xi<0.5$, and that the GPD CDF and PDF are given by,

F(x)=\begin{cases}1-\left(1+\frac{\xi(x-\mu)}{s}\right)^{-1/\xi}&\text{for }\xi\neq 0,\\ 1-\exp\left(-\frac{x-\mu}{s}\right)&\text{for }\xi=0,\end{cases}\qquad f(x)=\frac{1}{s}\left(1+\frac{\xi(x-\mu)}{s}\right)^{\left(-\frac{1}{\xi}-1\right)}\;,

for $x\geq\mu$ when $\xi\geq 0$ and $\mu\leq x\leq\mu-\frac{s}{\xi}$ when $\xi<0$. Furthermore, the quantiles are given by,

q_{\alpha}(X)=\begin{cases}\mu+\frac{s\left((1-\alpha)^{-\xi}-1\right)}{\xi}&\text{for }\xi\neq 0,\\ \mu-s\ln(1-\alpha)&\text{for }\xi=0.\end{cases}
Proposition 3

Assume $X\sim GPD(\mu,s,\xi)$ with $-1<\xi<1$. Then,

\bar{q}_{\alpha}(X)=\begin{cases}\mu+s\left[\frac{(1-\alpha)^{-\xi}}{1-\xi}+\frac{(1-\alpha)^{-\xi}-1}{\xi}\right]&\text{for }\xi\neq 0,\\ \mu+s\left[1-\ln(1-\alpha)\right]&\text{for }\xi=0,\end{cases}
\bar{p}_{x}(X)=\begin{cases}\frac{\left(1+\frac{\xi(x-\mu)}{s}\right)^{-\frac{1}{\xi}}}{(1-\xi)^{\frac{1}{\xi}}}&\text{for }\xi\neq 0,\\ e^{1-\left(\frac{x-\mu}{s}\right)}&\text{for }\xi=0.\end{cases}
Proof

For these results, we rely on the fact that if $X\sim GPD(\mu,s,\xi)$, then $X-\gamma\,|\,X>\gamma\sim GPD(0,\,s+\xi(\gamma-\mu),\,\xi)$, meaning that the excess distribution of a GPD random variable is also GPD. Note also that if $\xi<1$, then $E[X]=\mu+\frac{s}{1-\xi}$. This gives us,

E[X-\gamma\,|\,X>\gamma]=E\left[GPD(0,\,s+\xi(\gamma-\mu),\,\xi)\right]=\frac{s+\xi(\gamma-\mu)}{1-\xi}

which further implies that,

\bar{q}_{\alpha}(X)=E[X-q_{\alpha}(X)\,|\,X>q_{\alpha}(X)]+q_{\alpha}(X)
=\frac{s+\xi(q_{\alpha}(X)-\mu)}{1-\xi}+q_{\alpha}(X)\;.

Plugging in the values of the quantile function yields the final formulas. Using the formulas we just found for $\bar{q}_{\alpha}(X)$, it is an elementary exercise to solve for $\bar{p}_{x}(X)$, which equals $1-\alpha$ such that $\alpha$ solves the equation $x=\bar{q}_{\alpha}(X)$. ∎
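Proposition 3 can be sketched in code as follows (the function names are ours; valid for $-1<\xi<1$), with the $\xi=0$ case handled separately:

```python
import math

def gpd_superquantile(alpha, mu, s, xi):
    # Proposition 3; requires -1 < xi < 1
    if xi == 0.0:
        return mu + s * (1.0 - math.log(1.0 - alpha))
    t = (1.0 - alpha) ** (-xi)
    return mu + s * (t / (1.0 - xi) + (t - 1.0) / xi)

def gpd_bpoe(x, mu, s, xi):
    # Proposition 3; bPOE as the inverse of the superquantile
    if xi == 0.0:
        return math.exp(1.0 - (x - mu) / s)
    return (1.0 + xi * (x - mu) / s) ** (-1.0 / xi) / (1.0 - xi) ** (1.0 / xi)
```

The inverse relationship `gpd_bpoe(gpd_superquantile(alpha, ...), ...) == 1 - alpha` holds across positive, zero, and negative $\xi$, and at $\xi=0$ the formulas reduce to the shifted-exponential expressions.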

2.4 Laplace

Assume $X\sim Laplace(\mu,b)$. Recall that the Laplace parameters have range $\mu\in\mathbb{R}$, $b>0$ with $E[X]=\mu$ and $\sigma^{2}(X)=2b^{2}$, and that the Laplace CDF, PDF, and quantile function are given by,

F(x)=\begin{cases}1-\frac{1}{2}e^{-\frac{x-\mu}{b}}&x\geq\mu,\\ \frac{1}{2}e^{\frac{x-\mu}{b}}&x<\mu,\end{cases}\qquad f(x)=\frac{1}{2b}e^{\frac{-|x-\mu|}{b}}\;,
q_{\alpha}(X)=\mu-b\,\mathrm{sign}(\alpha-0.5)\,\ln(1-2|\alpha-0.5|)\;.
Proposition 4

If $X\sim Laplace(\mu,b)$, then

\bar{q}_{\alpha}(X)=\begin{cases}\mu+b\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))&\alpha<0.5,\\ \mu+b\left(1-\ln\left(2(1-\alpha)\right)\right)&\alpha\geq 0.5,\end{cases}
\bar{p}_{x}(X)=\begin{cases}\frac{1}{2}e^{1-\left(\frac{x-\mu}{b}\right)}&x\geq\mu+b,\\ 1+\frac{z}{\mathcal{W}(-2e^{-z-1}z)}&x<\mu+b,\end{cases}

where $z=\frac{x-\mu}{b}$ and $\mathcal{W}$ is the Lambert-$\mathcal{W}$ function (also called the product logarithm or omega function).

Proof

To get the superquantile, we begin with the integral representation:

\bar{q}_{\alpha}(X)=\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu-b\,\mathrm{sign}(p-0.5)\,\ln(1-2|p-0.5|)\,dp
=\mu-\frac{b}{1-\alpha}\int_{\alpha}^{1}\mathrm{sign}(p-0.5)\,\ln(1-2|p-0.5|)\,dp
=\mu-\frac{b}{1-\alpha}\left(\int_{\min\{\alpha,0.5\}}^{0.5}-\ln(2p)\,dp+\int_{\max\{\alpha,0.5\}}^{1}\ln(2(1-p))\,dp\right)\;.

To evaluate the integral, we utilize a simple substitution as well as the identity $\int\ln(y)\,dy=y\ln(y)-y+C$. After simplifying, we see that for $\alpha<0.5$ the integral evaluates to,

\bar{q}_{\alpha}(X)=\mu+b\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))\;.

Similarly, we find that for $\alpha\geq 0.5$ the integral evaluates to,

\bar{q}_{\alpha}(X)=\mu+b\left(1-\ln\left(2(1-\alpha)\right)\right)\;.

For bPOE, first assume that the threshold $x\geq\mu+b$. Using our formula for CVaR, we see that $\bar{q}_{0.5}(X)=\mu+b$. Thus, $x\geq\mu+b$ implies that $1-\bar{p}_{x}(X)\geq 0.5$, implying that,

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x,\;\alpha\geq 0.5\}
=\{1-\alpha\,|\,\mu+b\left(1-\ln\left(2(1-\alpha)\right)\right)=x\}
=\frac{1}{2}e^{1-\left(\frac{x-\mu}{b}\right)}\;.

Assume, on the contrary, that $x<\mu+b$. Since $\bar{q}_{0.5}(X)=\mu+b$, we have that $1-\bar{p}_{x}(X)<0.5$, which implies that,

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x,\;\alpha<0.5\}
=\{1-\alpha\,|\,\mu+b\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))=x\}\;.

Letting $z=\frac{x-\mu}{b}$, we must now find the $\alpha$ which solves the equation $\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))=z$. We do so via the following:

\left(\frac{\alpha}{1-\alpha}\right)(1-\ln(2\alpha))=z\implies\frac{-z}{\alpha}=\frac{\ln(2\alpha)-1}{1-\alpha}
\implies e^{\frac{-z}{\alpha}}=e^{\frac{\ln(2\alpha)-1}{1-\alpha}}=\left(\frac{2\alpha}{e}\right)^{\frac{1}{1-\alpha}}
\implies e^{\frac{-z(1-\alpha)}{\alpha}}=\frac{2\alpha}{e}
\implies\frac{-z}{\alpha}e^{-z\left(\frac{1}{\alpha}-1\right)}=-2ze^{-1}
\implies\frac{-z}{\alpha}e^{\frac{-z}{\alpha}}=-2ze^{-z-1}
\implies\frac{-z}{\alpha}=\mathcal{W}(-2ze^{-z-1})\;,

where the final step follows from the definition of the Lambert-\mathcal{W} function, which is given by the relation xe^{x}=y\iff\mathcal{W}(y)=x (on the appropriate branch, since \mathcal{W} is multivalued on (-1/e,0)). Thus, \frac{-z}{\alpha}=\mathcal{W}(-2ze^{-z-1})\implies\bar{p}_{x}(X)=1-\alpha=1+\frac{z}{\mathcal{W}(-2ze^{-z-1})}. ∎
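The formula just derived is straightforward to evaluate numerically. The following sketch (our own code, not from the paper; scipy assumed) does so. Note that for \mu<x<\mu+b the relation \frac{-z}{\alpha}=\frac{\ln(2\alpha)-1}{1-\alpha} forces \frac{-z}{\alpha}<-1, so the k=-1 branch of the Lambert-\mathcal{W} function must be taken:

```python
import math
from scipy.special import lambertw

def bpoe_laplace(x, mu=0.0, b=1.0):
    """bPOE of a Laplace(mu, b) random variable at threshold x > mu."""
    z = (x - mu) / b
    if z >= 1.0:                                   # x >= mu + b
        return 0.5 * math.exp(1.0 - z)
    # mu < x < mu + b: bPOE = 1 + z / W(-2 z e^{-z-1}); since
    # -z/alpha < -1 here, take the k = -1 branch of Lambert-W
    w = lambertw(-2.0 * z * math.exp(-z - 1.0), k=-1).real
    return 1.0 + z / w
```

The branch choice matters: the principal branch W_0 takes values in [-1,\infty) and would return a spurious root.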

3 Distributions With Closed Form Superquantile

In this section, we derive closed-form expressions for the superquantile of the Normal, LogNormal, Logistic, Student-t, Weibull, LogLogistic, and GEV distributions. The Normal, Logistic, and Student-t provide us with examples of symmetric distributions with varying tail heaviness. The LogNormal, Weibull, LogLogistic, and GEV provide us with examples of asymmetric distributions that have heavy right tails. In particular, we will utilize the Weibull formula for density estimation in Section 5.

For these distributions, we are not able to reduce calculation of bPOE to closed form. However, we highlight for the case of the Normal and Logistic that bPOE can be calculated by solving a one-dimensional convex optimization problem or a one-dimensional root-finding problem. In general, we note that for continuous X, bPOE at x equals 1-\alpha, where \alpha solves \bar{q}_{\alpha}(X)=x. Thus, if the superquantile is known in closed form, this reduces to a simple one-dimensional root-finding problem in \alpha.

3.1 Normal

Let X\sim\mathcal{N}(0,1) be a standard normal random variable. Recall that

F(x)=\frac{1}{2}\left[1+\text{erf}\left(\frac{x}{\sqrt{2}}\right)\right],\quad f(x)=\frac{1}{\sqrt{2\pi}}e^{\frac{-x^{2}}{2}},\quad q_{\alpha}(X)=\sqrt{2}\,\text{erf}^{-1}(2\alpha-1)\;,

where \text{erf}(\cdot) is the commonly known error function, with \text{erf}^{-1}(\cdot) denoting its inverse.

We show that the superquantile can be calculated by utilizing the quantile function and PDF, which is a well-known result (see e.g., Rockafellar and Uryasev (2000)). We also show that bPOE can be calculated in two ways: by solving a simple root-finding problem involving only the PDF and CDF, or by solving a convex optimization problem with gradients calculated via the commonly used error function. Some results are presented only for the standard normal \mathcal{N}(0,1), but they can easily be applied to the non-standard case \mathcal{N}(\mu,\sigma) with appropriate shifting and scaling.

Proposition 5

If X\sim\mathcal{N}(\mu,\sigma), then

\bar{q}_{\alpha}(X)=\mu+\sigma\frac{f\left(q_{\alpha}\left(\frac{X-\mu}{\sigma}\right)\right)}{1-\alpha}\;.
Proof

It is well known that if X\sim\mathcal{N}(0,1), then the conditional tail expectation is given by the inverse Mills ratio, E[X|X>\gamma]=\frac{f(\gamma)}{1-F(\gamma)}. It follows then that \bar{q}_{\alpha}(X)=E[X|X>q_{\alpha}(X)]=\frac{f(q_{\alpha}(X))}{1-F(q_{\alpha}(X))}=\frac{f(q_{\alpha}(X))}{1-\alpha}. The general case follows by writing X=\mu+\sigma Z with Z\sim\mathcal{N}(0,1), since the superquantile is translation invariant and positively homogeneous. ∎

Proposition 6

If X𝒩(0,1)X\sim\mathcal{N}(0,1), then

\bar{p}_{x}(X)=\min_{\gamma<x}\frac{f(\gamma)-\gamma(1-F(\gamma))}{x-\gamma}\;.

Furthermore, if \gamma\in\operatorname{argmin}, then \gamma equals the quantile of X at probability level 1-\bar{p}_{x}(X).

Proof

Note that for a standard normal random variable, the tail expectation beyond any threshold \gamma is given by the inverse Mills ratio,

E[X|X>\gamma]=\frac{f(\gamma)}{1-F(\gamma)}\;.

Note also that for any threshold \gamma and any random variable we have,

E[X-\gamma]^{+}=(E[X|X>\gamma]-\gamma)(1-F(\gamma))\;.

Using the Mills ratio gives us,

E[X-\gamma]^{+}=\left(\frac{f(\gamma)}{1-F(\gamma)}-\gamma\right)(1-F(\gamma))=f(\gamma)-\gamma(1-F(\gamma))\;.

Plugging this result into the minimization formula for bPOE yields the final formula. ∎

Proposition 7

Let X\sim\mathcal{N}(0,1) with x\in\mathbb{R} given. If \gamma is the solution to the equation

\frac{f(\gamma)}{1-F(\gamma)}=x\;,

then \bar{p}_{x}(X)=\frac{f(\gamma)-\gamma(1-F(\gamma))}{x-\gamma}. Additionally, we will have that q_{\alpha}(X)=\gamma and \bar{q}_{\alpha}(X)=x at probability level \alpha=1-\bar{p}_{x}(X).

Proof

This follows from the fact that \bar{q}_{\alpha}(X)=E[X|X>q_{\alpha}(X)]=\frac{f(q_{\alpha}(X))}{1-F(q_{\alpha}(X))} and the optimization formula for bPOE given in the previous proposition for normally distributed variables. ∎
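Proposition 7 translates directly into a one-dimensional root-finding routine. A minimal sketch (our code, not from the paper; scipy assumed), valid for thresholds x>0:

```python
from scipy.stats import norm
from scipy.optimize import brentq

def bpoe_std_normal(x):
    """bPOE of N(0,1) at threshold x > 0 via the Mills-ratio equation."""
    # the Mills ratio f(g)/(1-F(g)) increases from 0 toward infinity and
    # always exceeds g, so for x > 0 a sign change brackets the root in (-40, x)
    g = brentq(lambda g: norm.pdf(g) / norm.sf(g) - x, -40.0, x)
    return (norm.pdf(g) - g * norm.sf(g)) / (x - g)
```

At the root \gamma the returned ratio simplifies to 1-F(\gamma), so the quantile at probability level 1-\bar{p}_{x}(X) is obtained for free, as the proposition states.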

The following proposition provides the gradient calculation for solving the bPOE minimization problem.

Proposition 8

For X\sim\mathcal{N}(0,1), we have that the bPOE minimization formula has the following integral representation,

\displaystyle\bar{p}_{x}(X) =\min_{\gamma<x}\frac{f(\gamma)-\gamma(1-F(\gamma))}{x-\gamma}
=\min_{\gamma<x}\frac{1}{\sqrt{2\pi}}\int_{0}^{\infty}\frac{ue^{\frac{-(\gamma+u)^{2}}{2}}}{x-\gamma}\,du
=\min_{\gamma<x}\frac{e^{\frac{-\gamma^{2}}{2}}-\gamma\sqrt{\frac{\pi}{2}}\,\text{erfc}(\frac{\gamma}{\sqrt{2}})}{\sqrt{2\pi}(x-\gamma)}

Furthermore, the function g(\gamma;x)=\frac{1}{\sqrt{2\pi}}\int_{0}^{\infty}\frac{ue^{\frac{-(\gamma+u)^{2}}{2}}}{x-\gamma}\,du is convex w.r.t. \gamma over the range \gamma\in(-\infty,x). Additionally, g has gradient given by,

\displaystyle\frac{\partial g}{\partial\gamma} =\frac{1}{\sqrt{2\pi}}\int_{0}^{\infty}\frac{\partial}{\partial\gamma}\left(\frac{ue^{\frac{-(\gamma+u)^{2}}{2}}}{x-\gamma}\right)du
=\frac{e^{\frac{-\gamma^{2}}{2}}-x\sqrt{\frac{\pi}{2}}\,\text{erfc}(\frac{\gamma}{\sqrt{2}})}{\sqrt{2\pi}(x-\gamma)^{2}}

where \text{erfc}(\cdot) denotes the complementary error function.

Proof

To derive the integral representation, simply plug in the formula for E[X-\gamma]^{+}, then utilize the definitions of the PDF and CDF. The gradient calculation is a standard calculus exercise. ∎

3.2 LogNormal

Assume X\sim LogNormal(\mu,s). Recall that the LogNormal parameters have range \mu\in\mathbb{R}, s>0, with E[X]=e^{\mu+\frac{s^{2}}{2}} and \sigma^{2}(X)=(e^{s^{2}}-1)e^{2\mu+s^{2}}, and that the LogNormal CDF, PDF, and quantile function are given by,

F(x)=\frac{1}{2}\left[1+\text{erf}\left(\frac{\ln x-\mu}{s\sqrt{2}}\right)\right],\quad f(x)=\frac{1}{xs\sqrt{2\pi}}e^{\frac{-(\ln x-\mu)^{2}}{2s^{2}}},
\quad q_{\alpha}(X)=e^{\mu+s\sqrt{2}\,\text{erf}^{-1}(2\alpha-1)}\;.
Proposition 9

If X\sim LogNormal(\mu,s), then

\bar{q}_{\alpha}(X)=\frac{1}{2}e^{\mu+\frac{s^{2}}{2}}\frac{\left[1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2\alpha-1)\right)\right]}{1-\alpha}.
Proof

We simply evaluate the integral of the quantile function as follows.

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}e^{\mu+s\sqrt{2}\,\text{erf}^{-1}(2p-1)}\,dp
=\frac{e^{\mu}}{1-\alpha}\int_{\alpha}^{1}e^{s\sqrt{2}\,\text{erf}^{-1}(2p-1)}\,dp
=\frac{e^{\mu}}{1-\alpha}\left[-\frac{1}{2}e^{\frac{s^{2}}{2}}\left(1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2p-1)\right)\right)\right]_{p=\alpha}^{1}
=\frac{e^{\mu}}{1-\alpha}\cdot\frac{1}{2}e^{\frac{s^{2}}{2}}\left(1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2\alpha-1)\right)\right)\qquad\text{(the upper limit vanishes since }\text{erf}(-\infty)=-1)
=\frac{1}{2}e^{\mu+\frac{s^{2}}{2}}\frac{\left[1+\text{erf}\left(\frac{s}{\sqrt{2}}-\text{erf}^{-1}(2\alpha-1)\right)\right]}{1-\alpha}.
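As a quick numerical sanity check (our own code, not from the paper; scipy assumed), the closed form of Proposition 9 can be compared against a direct numerical average of the quantile function:

```python
import math
from scipy.special import erf, erfinv
from scipy.integrate import quad

def cvar_lognormal(alpha, mu=0.0, s=1.0):
    """Closed-form LogNormal(mu, s) superquantile from Proposition 9."""
    return (0.5 * math.exp(mu + 0.5 * s * s)
            * (1.0 + erf(s / math.sqrt(2.0) - erfinv(2.0 * alpha - 1.0)))
            / (1.0 - alpha))

def cvar_lognormal_numeric(alpha, mu=0.0, s=1.0):
    """Average of the quantile function over [alpha, 1], truncated near 1
    to avoid the (integrable) endpoint blow-up."""
    q = lambda p: math.exp(mu + s * math.sqrt(2.0) * erfinv(2.0 * p - 1.0))
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```

The truncation at 1-10^{-12} introduces an error far below typical quadrature tolerances because the LogNormal quantile grows only subexponentially in the confidence level.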

3.3 Logistic

Assume X\sim Logistic(\mu,s). Recall that the Logistic parameters have range \mu\in\mathbb{R}, s>0, with E[X]=\mu and \sigma^{2}(X)=\frac{s^{2}\pi^{2}}{3}, and that the Logistic CDF, PDF, and quantile function are given by,

F(x)=\frac{1}{1+e^{-\frac{x-\mu}{s}}}\;,\quad f(x)=\frac{e^{-\frac{x-\mu}{s}}}{s\left(1+e^{-\frac{x-\mu}{s}}\right)^{2}}\;,\quad q_{\alpha}(X)=\mu+s\ln\left(\frac{\alpha}{1-\alpha}\right)\;.

Here, we derive a closed-form expression for the superquantile for the logistic distribution and derive a simple root finding problem for calculating bPOE. We also find that these quantities have a correspondence with the binary entropy function.

Proposition 10

If X\sim Logistic(\mu,s), then

\bar{q}_{\alpha}(X)=\mu+\frac{sH(\alpha)}{1-\alpha}

where H(\alpha) is the binary entropy function H(\alpha)=-\alpha\ln(\alpha)-(1-\alpha)\ln(1-\alpha). Furthermore, for any x\geq\mu, if \alpha solves the equation,

\frac{H(\alpha)}{1-\alpha}=\frac{x-\mu}{s}\;,

then \bar{p}_{x}(X)=1-\alpha. Additionally, \bar{p}_{x}(X)=1-\alpha if \alpha is the solution to the transformed system,

(1-\alpha)\alpha^{\frac{\alpha}{1-\alpha}}=e^{-\left(\frac{x-\mu}{s}\right)}\;.

Note that both functions \frac{H(\alpha)}{1-\alpha} and (1-\alpha)\alpha^{\frac{\alpha}{1-\alpha}} are one-dimensional, convex, and monotonic over the range \alpha\in[0,1], and thus unique solutions exist and can easily be found via root-finding methods.

Proof

To obtain the superquantile, we have

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu+s\ln\left(\frac{p}{1-p}\right)dp
=\mu+\frac{s}{1-\alpha}\int_{\alpha}^{1}\ln(p)-\ln(1-p)\,dp
=\mu+\frac{s}{1-\alpha}\left(\int_{\alpha}^{1}\ln(p)\,dp+\int_{\alpha}^{1}-\ln(1-p)\,dp\right)

Utilizing simple substitution as well as the identity \int\ln(y)\,dy=y\ln(y)-y+C, we get

\displaystyle\bar{q}_{\alpha}(X) =\mu+\frac{s}{1-\alpha}\left(-1-\alpha\ln\alpha+\alpha-(1-\alpha)\ln(1-\alpha)+(1-\alpha)\right)
=\mu+\frac{s}{1-\alpha}\left(-\alpha\ln\alpha-(1-\alpha)\ln(1-\alpha)\right)
=\mu+\frac{s}{1-\alpha}H(\alpha)\;.

To get bPOE, we simply follow the bPOE definition and find \alpha which solves \mu+\frac{s}{1-\alpha}H(\alpha)=x. The transformed system arises from combining logarithms within the superquantile formula and applying exponential transformations. ∎
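A minimal sketch of the root-finding characterization in Proposition 10 (our own code, not from the paper; scipy assumed), valid for thresholds x>\mu:

```python
import math
from scipy.optimize import brentq

def bpoe_logistic(x, mu=0.0, s=1.0):
    """bPOE of Logistic(mu, s) at threshold x > mu by solving
    H(a)/(1-a) = (x-mu)/s, which is monotonic on (0, 1)."""
    t = (x - mu) / s
    H = lambda a: -a * math.log(a) - (1.0 - a) * math.log(1.0 - a)
    a = brentq(lambda a: H(a) / (1.0 - a) - t, 1e-12, 1.0 - 1e-12)
    return 1.0 - a
```

Since H(a)/(1-a) increases from 0 to infinity on (0,1), the bracket (10^{-12},\,1-10^{-12}) contains the root for any t>0.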

We can also utilize the minimization formula to calculate bPOE. Calculating bPOE in this way has the added benefit of simultaneously calculating the quantile q_{1-\bar{p}_{x}(X)}(X).

Proposition 11

If X\sim Logistic(\mu,s), then

\bar{p}_{x}(X)=\min_{\gamma<x}\frac{s\ln(1+e^{-(\frac{\gamma-\mu}{s})})}{x-\gamma}\;,

which is a convex optimization problem over the range \gamma\in(-\infty,x). Furthermore, the minimum occurs at \gamma such that,

\frac{s\ln(1+e^{-(\frac{\gamma-\mu}{s})})}{x-\gamma}=1-F(\gamma)\;.
Proof

This follows from the fact that E[X-\gamma]^{+}=\int_{\gamma}^{\infty}(1-F(t))\,dt. Evaluating this integral for X\sim Logistic(\mu,s) yields E[X-\gamma]^{+}=s\ln(1+e^{-(\frac{\gamma-\mu}{s})}), which can then be plugged into the minimization formula for bPOE. The second part of the proposition follows from the fact that the derivative of the objective function w.r.t. \gamma is given by,

\frac{s\ln(1+e^{-(\frac{\gamma-\mu}{s})})}{(x-\gamma)^{2}}-\frac{e^{-(\frac{\gamma-\mu}{s})}}{(x-\gamma)\left(1+e^{-(\frac{\gamma-\mu}{s})}\right)}\;.

Setting this derivative to zero and simplifying yields the stated optimality condition. ∎

3.4 Student-t

Assume X\sim Student\text{-}t(\nu,s,\mu). Recall that the Student-t parameters have range \nu>0, s>0, \mu\in\mathbb{R}, with E[X]=\mu (for \nu>1) and \sigma^{2}(X)=\frac{s^{2}\nu}{\nu-2} (for \nu>2), and that the Student-t CDF and PDF are given by,

F(x)=1-\frac{1}{2}\mathcal{I}_{\nu(x)}\left(\frac{\nu}{2},\frac{1}{2}\right)\;,\quad f(x)=\frac{\Gamma(\frac{\nu+1}{2})}{\Gamma(\frac{\nu}{2})\sqrt{\nu\pi}\,s}\left(1+\frac{(x-\mu)^{2}}{\nu s^{2}}\right)^{\frac{-(\nu+1)}{2}}\;,

where \nu(x)=\frac{\nu}{\left(\frac{x-\mu}{s}\right)^{2}+\nu} (for x\geq\mu), \mathcal{I}_{t}(a,b) is the regularized incomplete Beta function, and \Gamma(a) is the Gamma function. Note that a general closed-form expression for q_{\alpha}(X) is not known, but it is a readily available function within common software packages like Excel.

Proposition 12

If X\sim Student\text{-}t(\nu,s,\mu), then

\bar{q}_{\alpha}(X)=\mu+s\left(\frac{\nu+T^{-1}(\alpha)^{2}}{(\nu-1)(1-\alpha)}\right)\tau(T^{-1}(\alpha))

where T^{-1}(\alpha) is the inverse of the standardized Student-t CDF and \tau(x) is the standardized Student-t PDF.

Proof

Since there is no closed-form expression for the quantile, we utilize the representation of the superquantile given by \frac{1}{1-\alpha}\int_{q_{\alpha}(X)}^{\infty}tf(t)\,dt. To evaluate this integral, we first take the derivative of the PDF, giving

\frac{df(x)}{dx}=\frac{-f(x)(x-\mu)(\nu+1)}{\nu s^{2}+(x-\mu)^{2}}.

Rearranging yields,

xf(x)\,dx=\frac{-\nu s^{2}\,df(x)}{\nu+1}-\frac{(x-\mu)^{2}\,df(x)}{\nu+1}+\mu f(x)\,dx.

We can then integrate both sides,

\int xf(x)\,dx=\frac{-\nu s^{2}f(x)}{\nu+1}-\frac{1}{\nu+1}\int(x-\mu)^{2}\,df(x)+\mu F(x).

Integrating by parts gives us the following form of the middle term,

\int(x-\mu)^{2}\,df(x)=(x-\mu)^{2}f(x)-2\int xf(x)\,dx+2\mu F(x)\;.

Then, finally, after substituting this new expression for the middle term and simplifying, we get

\int xf(x)\,dx=-\frac{(\nu s^{2}+(x-\mu)^{2})}{\nu-1}f(x)+\mu F(x).

Taking the definite integral yields,

\displaystyle\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx =\left(-\lim_{x\rightarrow\infty}\frac{(\nu s^{2}+(x-\mu)^{2})}{\nu-1}f(x)+\lim_{x\rightarrow\infty}\mu F(x)\right)
\qquad\qquad\quad-\left(-\frac{(\nu s^{2}+(q_{\alpha}(X)-\mu)^{2})}{\nu-1}f(q_{\alpha}(X))+\mu F(q_{\alpha}(X))\right).

It is easy to see that the second limit goes to \mu (since F(x)\rightarrow 1) and, after applying L'Hopital where necessary, that the first limit goes to zero. This leaves

\displaystyle\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx =\mu-\left(-\frac{(\nu s^{2}+(q_{\alpha}(X)-\mu)^{2})}{\nu-1}f(q_{\alpha}(X))+\mu F(q_{\alpha}(X))\right)
=\mu(1-\alpha)+\left(\frac{\nu s^{2}+(q_{\alpha}(X)-\mu)^{2}}{\nu-1}\right)f(q_{\alpha}(X))
=\mu(1-\alpha)+s\left(\frac{\nu+T^{-1}(\alpha)^{2}}{\nu-1}\right)\tau(T^{-1}(\alpha)),

where the final step comes from writing the non-standardized quantile q_{\alpha}(X) and PDF f(x) in their standardized forms. Then, finally, dividing by 1-\alpha yields the formula,

\bar{q}_{\alpha}(X)=\frac{1}{1-\alpha}\int_{q_{\alpha}(X)}^{\infty}xf(x)\,dx=\mu+s\left(\frac{\nu+T^{-1}(\alpha)^{2}}{(\nu-1)(1-\alpha)}\right)\tau(T^{-1}(\alpha)).
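A numerical check of Proposition 12 (our own code, not from the paper; scipy assumed, and \nu>1 so the superquantile is finite):

```python
from scipy.stats import t as student_t
from scipy.integrate import quad

def cvar_student_t(alpha, nu, mu=0.0, s=1.0):
    """Proposition 12: Student-t(nu, s, mu) superquantile, nu > 1."""
    q = student_t.ppf(alpha, nu)            # T^{-1}(alpha), standardized
    return mu + s * (nu + q * q) / ((nu - 1.0) * (1.0 - alpha)) * student_t.pdf(q, nu)

def cvar_student_t_numeric(alpha, nu, mu=0.0, s=1.0):
    """Average the quantile function over [alpha, 1], truncated near 1."""
    val, _ = quad(lambda p: mu + s * student_t.ppf(p, nu),
                  alpha, 1.0 - 1e-10, limit=200)
    return val / (1.0 - alpha)
```

The truncation at 1-10^{-10} is negligible for moderate \nu because the omitted tail carries probability mass 10^{-10}.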

3.5 Weibull

Assume X\sim Weibull(\lambda,k). Recall that the Weibull parameters have range \lambda>0, k>0, with E[X]=\lambda\Gamma(1+\frac{1}{k}) and \sigma^{2}(X)=\lambda^{2}\left[\Gamma(1+\frac{2}{k})-\Gamma(1+\frac{1}{k})^{2}\right], and that the Weibull CDF, PDF, and quantile function are given by,

F(x)=1-e^{-(x/\lambda)^{k}}\;,\quad f(x)=\begin{cases}\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^{k}}&x\geq 0,\\ 0&x<0,\end{cases}
\quad q_{\alpha}(X)=\lambda(-\ln(1-\alpha))^{1/k}\;,

where \Gamma(a)=\int_{0}^{\infty}p^{a-1}e^{-p}\,dp is the gamma function.

Proposition 13

If X\sim Weibull(\lambda,k), then

\bar{q}_{\alpha}(X)=\frac{\lambda}{1-\alpha}\Gamma_{U}\left(1+\frac{1}{k},-\ln(1-\alpha)\right)

where \Gamma_{U}(a,b)=\int_{b}^{\infty}p^{a-1}e^{-p}\,dp is the upper incomplete gamma function.

Proof

To calculate the superquantile, we utilize the integral representation, which is

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\int_{\alpha}^{1}\lambda(-\ln(1-p))^{1/k}\,dp\;.

To put this integral into the form of the upper incomplete gamma function, make the change of variable y=-\ln(1-p). This gives e^{y}=\frac{1}{1-p} and dp=(1-p)\,dy=e^{-y}\,dy, with new lower limit of integration \alpha\rightarrow-\ln(1-\alpha) and upper limit of integration 1\rightarrow\infty. Applying this to the integral yields

\displaystyle\bar{q}_{\alpha}(X) =\frac{\lambda}{1-\alpha}\int_{-\ln(1-\alpha)}^{\infty}y^{1/k}e^{-y}\,dy
=\frac{\lambda}{1-\alpha}\Gamma_{U}\left(1+\frac{1}{k},-\ln(1-\alpha)\right)\;.
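Proposition 13 maps directly onto scipy, whose gammaincc is the regularized upper incomplete gamma function, so \Gamma_{U}(a,b)=\Gamma(a)\cdot\text{gammaincc}(a,b). A sketch with a numerical cross-check (our own code, not from the paper):

```python
import math
from scipy.special import gamma, gammaincc
from scipy.integrate import quad

def cvar_weibull(alpha, lam, k):
    """Proposition 13: Weibull(lam, k) superquantile."""
    a = 1.0 + 1.0 / k
    b = -math.log(1.0 - alpha)
    # Gamma_U(a, b) = gamma(a) * gammaincc(a, b)  (gammaincc is regularized)
    return lam * gamma(a) * gammaincc(a, b) / (1.0 - alpha)

def cvar_weibull_numeric(alpha, lam, k):
    """Average of the quantile function over [alpha, 1], truncated near 1."""
    q = lambda p: lam * (-math.log(1.0 - p)) ** (1.0 / k)
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```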

3.6 Log-Logistic

Assume X\sim LogLogistic(a,b). Recall that the Log-Logistic parameters have range a>0, b>0, with E[X]=a\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right) when b>1 and
\sigma^{2}(X)=a^{2}\left(\frac{2\pi}{b}\csc\left(\frac{2\pi}{b}\right)-\left(\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)\right)^{2}\right) when b>2, and that the Log-Logistic CDF, PDF, and quantile function are given by,

F(x)=\frac{1}{1+\left(\frac{x}{a}\right)^{-b}}\;,\quad f(x)=\frac{(b/a)(x/a)^{b-1}}{\left(1+(x/a)^{b}\right)^{2}}\;,\quad q_{\alpha}(X)=a\left(\frac{\alpha}{1-\alpha}\right)^{\frac{1}{b}}\;,

where \csc(\cdot) is the cosecant function.

Proposition 14

If X\sim LogLogistic(a,b), then

\bar{q}_{\alpha}(X)=\frac{a}{1-\alpha}\left(\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)-B_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)\right)

where B_{y}(A_{1},A_{2})=\int_{0}^{y}p^{A_{1}-1}(1-p)^{A_{2}-1}\,dp is the incomplete beta function.

Proof

To calculate the superquantile, we utilize the integral representation as follows:

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}q_{p}(X)\,dp
=\frac{1}{1-\alpha}\left(\int_{0}^{1}q_{p}(X)\,dp-\int_{0}^{\alpha}q_{p}(X)\,dp\right)
=\frac{1}{1-\alpha}\left(E[X]-\int_{0}^{\alpha}q_{p}(X)\,dp\right)
=\frac{1}{1-\alpha}\left(E[X]-a\int_{0}^{\alpha}\left(\frac{p}{1-p}\right)^{\frac{1}{b}}dp\right)\;.

Now, note first that for X\sim LogLogistic(a,b) we have E[X]=a\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right). Next, for the incomplete beta function, letting A_{1}=\frac{1}{b}+1 and A_{2}=1-\frac{1}{b}, we can see that

B_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)=\int_{0}^{\alpha}p^{\frac{1}{b}}(1-p)^{-\frac{1}{b}}\,dp\;.

Using these two facts, we have,

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\left(E[X]-a\int_{0}^{\alpha}\left(\frac{p}{1-p}\right)^{\frac{1}{b}}dp\right)
=\frac{1}{1-\alpha}\left(a\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)-aB_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)\right)
=\frac{a}{1-\alpha}\left(\frac{\pi}{b}\csc\left(\frac{\pi}{b}\right)-B_{\alpha}\left(\frac{1}{b}+1,1-\frac{1}{b}\right)\right)\;.
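Proposition 14 can likewise be evaluated with scipy, whose betainc is the regularized incomplete beta function, so B_{y}(A_{1},A_{2})=\text{betainc}(A_{1},A_{2},y)\cdot B(A_{1},A_{2}). A sketch with a numerical cross-check (our own code, not from the paper; b>1 is required so the mean exists):

```python
import math
from scipy.special import betainc, beta as beta_fn
from scipy.integrate import quad

def cvar_loglogistic(alpha, a, b):
    """Proposition 14: LogLogistic(a, b) superquantile, b > 1."""
    mean = a * (math.pi / b) / math.sin(math.pi / b)
    A1, A2 = 1.0 + 1.0 / b, 1.0 - 1.0 / b
    # non-regularized incomplete beta from scipy's regularized betainc
    B_alpha = betainc(A1, A2, alpha) * beta_fn(A1, A2)
    return (mean - a * B_alpha) / (1.0 - alpha)

def cvar_loglogistic_numeric(alpha, a, b):
    """Average of the quantile function over [alpha, 1], truncated near 1."""
    q = lambda p: a * (p / (1.0 - p)) ** (1.0 / b)
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```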

3.7 Generalized Extreme Value Distribution

Assume X follows a Generalized Extreme Value (GEV) distribution, which we denote as X\sim GEV(\mu,s,\xi). Recall that the GEV parameters have range \mu\in\mathbb{R}, s>0, \xi\in\mathbb{R}, with E[X]=\begin{cases}\mu+s(g_{1}-1)/\xi&\text{if}\ \xi\neq 0,\ \xi<1,\\ \mu+s\,y&\text{if}\ \xi=0,\\ \infty&\text{if}\ \xi\geq 1,\end{cases} and
\sigma^{2}(X)=\begin{cases}s^{2}(g_{2}-g_{1}^{2})/\xi^{2}&\text{if}\ \xi\neq 0,\ \xi<\frac{1}{2},\\ s^{2}\frac{\pi^{2}}{6}&\text{if}\ \xi=0,\\ \infty&\text{if}\ \xi\geq\frac{1}{2},\end{cases} where g_{k}=\Gamma(1-k\xi) and y is the Euler-Mascheroni constant.

Additionally, recall that the GEV has CDF, PDF, and quantile function given by,

F(x)=\begin{cases}e^{-\left(1+\frac{\xi(x-\mu)}{s}\right)^{\frac{-1}{\xi}}}&\xi\neq 0,\\ e^{-e^{-\left(\frac{x-\mu}{s}\right)}}&\xi=0,\end{cases}\;

\quad f(x)=\begin{cases}\frac{1}{s}\left(1+\frac{\xi(x-\mu)}{s}\right)^{\frac{-1}{\xi}-1}e^{-\left(1+\frac{\xi(x-\mu)}{s}\right)^{\frac{-1}{\xi}}}&\xi\neq 0,\\ \frac{1}{s}e^{-\left(\frac{x-\mu}{s}\right)}e^{-e^{-\left(\frac{x-\mu}{s}\right)}}&\xi=0,\end{cases}\;
\quad q_{\alpha}(X)=\begin{cases}\mu+\frac{s}{\xi}\left(\left(\ln\frac{1}{\alpha}\right)^{-\xi}-1\right)&\xi\neq 0,\\ \mu-s\ln(-\ln(\alpha))&\xi=0.\end{cases}\;
Proposition 15

If X\sim GEV(\mu,s,\xi), then

\bar{q}_{\alpha}(X)=\begin{cases}\mu+\frac{s}{\xi(1-\alpha)}\left[\Gamma_{L}\left(1-\xi,\ln\frac{1}{\alpha}\right)-(1-\alpha)\right]&\xi\neq 0,\\ \mu+\frac{s}{1-\alpha}\left(y+\alpha\ln(-\ln(\alpha))-li(\alpha)\right)&\xi=0,\end{cases}

where \Gamma_{L}(a,b)=\int_{0}^{b}p^{a-1}e^{-p}\,dp is the lower incomplete gamma function, li(x)=\int_{0}^{x}\frac{1}{\ln p}\,dp is the logarithmic integral function, and y is the Euler-Mascheroni constant.

Proof

Assume we have \xi=0. Then, using \int_{0}^{1}\ln(-\ln(p))\,dp=-y and \int_{0}^{\alpha}\ln(-\ln(p))\,dp=\alpha\ln(-\ln(\alpha))-li(\alpha), we have

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu-s\ln(-\ln(p))\,dp
=\mu-\frac{s}{1-\alpha}\left(\int_{0}^{1}\ln(-\ln(p))\,dp-\int_{0}^{\alpha}\ln(-\ln(p))\,dp\right)
=\mu-\frac{s}{1-\alpha}\left(-y-\alpha\ln(-\ln(\alpha))+li(\alpha)\right)
=\mu+\frac{s}{1-\alpha}\left(y+\alpha\ln(-\ln(\alpha))-li(\alpha)\right)

Assume now that \xi\neq 0. Then, we have that,

\displaystyle\bar{q}_{\alpha}(X) =\frac{1}{1-\alpha}\int_{\alpha}^{1}\mu+\frac{s}{\xi}\left(\left(\ln\frac{1}{p}\right)^{-\xi}-1\right)dp
=\mu+\frac{s}{\xi(1-\alpha)}\int_{\alpha}^{1}\left(\left(\ln\frac{1}{p}\right)^{-\xi}-1\right)dp
=\mu+\frac{s}{\xi(1-\alpha)}\left[\Gamma_{L}\left(1-\xi,\ln\frac{1}{\alpha}\right)-(1-\alpha)\right]
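For the \xi\neq 0 branch of Proposition 15, scipy's regularized lower incomplete gamma function gammainc gives \Gamma_{L}(a,b)=\Gamma(a)\cdot\text{gammainc}(a,b). A sketch with a numerical cross-check (our own code, not from the paper; \xi<1, \xi\neq 0 so the mean is finite):

```python
import math
from scipy.special import gamma, gammainc
from scipy.integrate import quad

def cvar_gev(alpha, mu, s, xi):
    """Proposition 15, xi != 0 branch: GEV(mu, s, xi) superquantile."""
    # Gamma_L(a, b) = gamma(a) * gammainc(a, b)  (gammainc is regularized)
    gl = gamma(1.0 - xi) * gammainc(1.0 - xi, math.log(1.0 / alpha))
    return mu + s * (gl - (1.0 - alpha)) / (xi * (1.0 - alpha))

def cvar_gev_numeric(alpha, mu, s, xi):
    """Average of the quantile function over [alpha, 1], truncated near 1."""
    q = lambda p: mu + (s / xi) * (math.log(1.0 / p) ** (-xi) - 1.0)
    val, _ = quad(q, alpha, 1.0 - 1e-12, limit=200)
    return val / (1.0 - alpha)
```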

4 Portfolio Optimization

A common parametric approach to portfolio optimization is to assume that portfolio returns follow some specified distribution. In this context, particularly when taking a risk averse approach, closed-form representations of the superquantile and bPOE for the specified distribution allow one to formulate a tractable portfolio optimization problem. In this section, we show that our derived formulas for the superquantile and bPOE reveal important properties about portfolio optimization problems formulated with particular distributional assumptions placed upon portfolio returns.

Portfolio optimization with the superquantile is common, so we begin by simply pointing out which of the closed-form superquantile formulas yield tractable portfolio optimization problems. Portfolio optimization with bPOE, however, is not common, and we show that it can be advantageous compared to the superquantile approach. In particular, superquantile optimization requires that one set the probability level \alpha. One can then observe that for fixed \alpha, the optimal superquantile portfolio may change based upon the distribution used to model returns. We show that if portfolio returns are assumed to follow a Laplace, Logistic, Normal, or Student-t distribution, the minimal bPOE portfolios for fixed threshold x are the same regardless of the distribution chosen, meaning that there exists a single portfolio that is x-bPOE optimal for multiple choices of distribution.

Note that in this section we will be dealing with asset returns R, as is typical for financial problems, and the loss is the negative of the return: X=-R, with q_{\alpha}(X)=-q_{1-\alpha}(R).

The portfolio optimization problem consists of finding a vector of asset weights w\in\mathbb{R}^{n} for a set of n assets with unknown random returns R=[R_{1},R_{2},...,R_{n}] that solves the following optimization problem,

\underset{w\in\mathbb{R}^{n}}{\text{max }} L(w,R) \qquad (1)
s.t.\quad g_{i}(w,R)\leq 0,\ i=1,\ldots,I
\quad h_{j}(w,R)=0,\ j=1,\ldots,J
\quad w^{T}1=1
\quad l\leq w\leq u

where L(w,R) is some function to be maximized (or minimized, if we consider its negative), the functions g_{i}(w,R) and h_{j}(w,R) enforce inequality and equality constraints respectively, and the vectors l,u enforce lower and upper bounds on the individual asset weights. A simple example is the standard Markowitz optimization problem, where we maximize the expected utility, a weighted combination of the expected return and the variance via a positive trade-off parameter \lambda\geq 0:

\underset{w\in\mathbb{R}^{n}}{\text{max }} w^{T}\eta-\lambda w^{T}\Sigma w \qquad (2)
s.t.\quad w^{T}1=1
\quad l\leq w\leq u

An important aspect of the random portfolio return w^{T}R, which can be seen within the Markowitz problem and will be used later in this section, is the fact that the expectation E[w^{T}R] and variance \sigma^{2}(w^{T}R) are given by w^{T}\eta and w^{T}\Sigma w respectively, where \eta\in\mathbb{R}^{n} is the vector of expected returns for the n assets and \Sigma\in\mathbb{R}^{n\times n} is the covariance matrix for the n assets. This allows us to represent the expected value and variance of the portfolio return in terms of w, and consequently to formulate an optimization problem with decision vector w.

4.1 Superquantile and bPOE Optimization with Qualified Distributions

As we are dealing with asset returns, and not losses, we need to define the superquantile using that notation. The superquantile is the expected loss above the quantile (conditional expected value of losses in the right tail), so in terms of returns it would be the conditional expected value of returns in the left tail, which can be described by the left superquantile:

\tilde{q}_{1-\alpha}(R)=\frac{1}{1-\alpha}\int_{0}^{1-\alpha}q_{p}(R)\,dp.

We can use the closed-form superquantile formulas derived in the previous sections for the right superquantile \bar{q}_{\alpha}(R) to calculate the left superquantile \tilde{q}_{\alpha}(R), as

\alpha\tilde{q}_{\alpha}(R)+(1-\alpha)\bar{q}_{\alpha}(R)=\int_{0}^{1}q_{p}(R)\,dp=E[R],

so

-\tilde{q}_{1-\alpha}(R)=-\frac{1}{1-\alpha}(E[R]-\alpha\bar{q}_{1-\alpha}(R)).

Since -\tilde{q}_{1-\alpha}(R)=\bar{q}_{\alpha}(X), bPOE is defined as

\bar{p}_{x}(X)=\{1-\alpha\,|\,\bar{q}_{\alpha}(X)=x\}=\{1-\alpha\,|\,\tilde{q}_{1-\alpha}(R)=-x\}.

4.1.1 Qualified Distributions for Portfolio Optimization

The superquantile or bPOE portfolio optimization problem has its objective function or one of its constraints defined in terms of \tilde{q}_{1-\alpha}(w^{T}R) or \bar{p}_{x}(w^{T}R). To formulate such a problem using a given distribution, we begin by defining a set of qualified distributions which we will consider. These qualified distributions satisfy the following set of conditions, which allow us to verify that they make sense in terms of portfolio theory and admit a superquantile/bPOE expression that can be represented in terms of the decision variable w:

Definition 1 (Qualified Distribution)

A qualified distribution \mathcal{D} satisfies the following conditions:
(C1) w^{T}R\sim\mathcal{D}\implies\tilde{q}_{1-\alpha}(w^{T}R)=w^{T}\eta-\sqrt{w^{T}\Sigma w}\,\zeta(\alpha,\Theta), where \zeta(\alpha,\Theta) is a function depending only upon \alpha and possibly a set of fixed parameters \Theta that do not depend on w, \eta is the vector of expected asset returns, and \Sigma is the covariance matrix of asset returns.
(C2) The statistical parameters of the distribution \mathcal{D} must be consistent with the descriptive statistics of real-life asset returns.
(C3) The shape of the PDF of the distribution \mathcal{D} must conform to the shape of the empirical PDF of typical real-life asset returns.

Why should we enforce these preconditions? Condition (C1) guarantees that the superquantile can be expressed in terms of w. This is necessary to express the superquantile optimization problem. For example, if we assume that w^{T}R\sim Logistic(\mu,s), we need to be able to express \mu and s in terms of w. Since \mu=E[w^{T}R]=w^{T}\eta and w^{T}\Sigma w=\sigma^{2}(w^{T}R)=\frac{s^{2}\pi^{2}}{3}, we have

\displaystyle\tilde{q}_{1-\alpha}(R) =\frac{1}{1-\alpha}(E[R]-\alpha\bar{q}_{1-\alpha}(R))=\frac{1}{1-\alpha}\left(\mu-\alpha\left[\mu+\frac{s}{\alpha}(-(1-\alpha)\ln(1-\alpha)-\alpha\ln(\alpha))\right]\right)
=\mu-\frac{s}{1-\alpha}(-\alpha\ln(\alpha)-(1-\alpha)\ln(1-\alpha))
=w^{T}\eta-\sqrt{w^{T}\Sigma w}\,\frac{\sqrt{3}(-\alpha\ln(\alpha)-(1-\alpha)\ln(1-\alpha))}{\pi(1-\alpha)}\;,

which satisfies (C1). Other examples that satisfy this condition are the Laplace, Normal, Exponential, Student-t, Pareto, GPD, and GEV. Note that for the Student-t we assume that the parameter \nu is fixed and the same for all assets, i.e. \Theta=\{\nu\}, and for the GPD/GEV distributions \Theta=\{\xi\}.

Conditions (C2) and (C3) are simple sanity checks on our model assumptions. For example, for the Exponential distribution E[R]=\frac{1}{\lambda}=\sigma(R); however, for real-life asset returns the sample mean is not generally equal to the sample standard deviation. So, the Exponential and Pareto distributions make no sense in portfolio optimization problems even if they satisfy (C1). As for (C3), a distribution is not practical if there is an obvious discrepancy between the shape of its PDF and the shape of the empirical PDF observed for real-life asset returns. The latter is generally bell-shaped or, more likely, inverse-V shaped, and is never shaped like the PDF of an Exponential, Pareto/GPD, or Weibull with k<1.

This leaves us with a set of four elliptical distributions that satisfy all three conditions: Logistic, Laplace, Normal, and Student-t, as well as the non-elliptical GEV distribution. For the latter, with \xi \neq 0, the left superquantile can be expressed as

\tilde{q}_{1-\alpha}(R) = w^T\nu - \sqrt{w^T\Sigma w}\,\frac{\alpha\Gamma(1-\xi) - \Gamma_U\!\left(1-\xi,\, \ln\frac{1}{1-\alpha}\right)}{(1-\alpha)\sqrt{g_2 - g_1^2}},

where \Gamma_U(a,b) = \int_b^\infty p^{a-1} e^{-p}\, dp is the upper incomplete Gamma function and g_k = \Gamma(1 - k\xi).
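Evaluating this GEV expression requires \Gamma_U, which is not available in the Python standard library. As a hedged sketch, it can be approximated by simple midpoint quadrature, truncating the integral at an assumed upper limit (60 here, which suffices for moderate arguments); for a = 1 the result should match the exact value \Gamma_U(1, b) = e^{-b}:

```python
import math

def upper_incomplete_gamma(a, b, n=100_000, cutoff=60.0):
    """Approximate Gamma_U(a, b) = integral_b^inf p^(a-1) e^(-p) dp by the
    midpoint rule on [b, cutoff]; the tail beyond `cutoff` is negligible
    for the moderate arguments used here (an assumption of this sketch)."""
    h = (cutoff - b) / n
    total = 0.0
    for i in range(n):
        p = b + (i + 0.5) * h
        total += p ** (a - 1) * math.exp(-p)
    return total * h

# For a = 1 the integral is exactly e^(-b), giving an easy correctness check.
assert abs(upper_incomplete_gamma(1.0, 0.5) - math.exp(-0.5)) < 1e-3
# And Gamma_U(a, 0) recovers the complete Gamma function.
assert abs(upper_incomplete_gamma(1.5, 0.0) - math.gamma(1.5)) < 1e-3
```

In practice one would use a library routine (e.g. a dedicated special-functions package) rather than this quadrature.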

4.1.2 Superquantile and bPOE Optimization

An alternative to the Markowitz problem is to find the portfolio with the minimal superquantile, problem (3), or minimal bPOE, problem (4):

\min_{w\in\mathbb{R}^n}\; -\tilde{q}_{1-\alpha}(w^T R) \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (3)

\min_{w\in\mathbb{R}^n}\; \bar{p}_x(w^T R) \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (4)

For qualified distributions, however, these problems can be greatly simplified. First, we see that (3) reduces to (5):

\max_{w\in\mathbb{R}^n}\; w^T\eta - \sqrt{w^T\Sigma w}\,\zeta(\alpha,\Theta) \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (5)

Khokhlov (2016) shows that the optimal solution to (5) is the same as the optimal solution to the Markowitz optimization problem (2) with \lambda = \frac{\zeta(\alpha,\Theta)}{2\sigma(w^T R)}. Thus, the superquantile-optimal portfolio is also mean-variance optimal in the Markowitz sense.

Now, for bPOE we see that the picture is actually much simpler. Specifically, we have the following proposition.

Proposition 16

If we assume that w^T R \sim \mathcal{D} and \mathcal{D} is a qualified distribution, then (4) reduces to (6).

\max_{w\in\mathbb{R}^n}\; \frac{w^T\eta + x}{\sqrt{w^T\Sigma w}} \quad \text{s.t.}\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (6)
Proof

First, note that by the definition of the superquantile, \zeta(\alpha,\Theta) must be an increasing function of \alpha \in [0,1]. Second, since \bar{p}_x(w^T R) = \{1-\alpha \;|\; \tilde{q}_{1-\alpha}(w^T R) = -x\} and \tilde{q}_{1-\alpha}(w^T R) = w^T\eta - \sqrt{w^T\Sigma w}\,\zeta(\alpha,\Theta) for qualified distributions, problem (4) can be rewritten as:

\min_{w\in\mathbb{R}^n,\,\alpha}\; 1-\alpha \quad \text{s.t.}\;\; -w^T\eta + \sqrt{w^T\Sigma w}\,\zeta(\alpha,\Theta) = x,\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (7)

which can then be written as:

\max_{w\in\mathbb{R}^n,\,\alpha}\; \alpha \quad \text{s.t.}\;\; \zeta(\alpha,\Theta) = \frac{w^T\eta + x}{\sqrt{w^T\Sigma w}},\;\; w^T 1 = 1,\;\; l \le w \le u \qquad (8)

Finally, since \zeta(\alpha,\Theta) is an increasing function of \alpha and \Theta does not depend upon w, maximizing \alpha is equivalent to maximizing \frac{w^T\eta + x}{\sqrt{w^T\Sigma w}}, so we can formulate the maximization as (6) without changing the optimal w. ∎

This proposition has an important implication for portfolio theory: given a fixed threshold x, the bPOE-optimal portfolio is the same for every qualified distribution; the same portfolio simultaneously has the lowest bPOE under all of these distributional assumptions. The fact that bPOE optimization is, in this sense, independent of distributional assumptions makes it preferable to superquantile optimization.
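To illustrate Proposition 16, the following sketch solves (6) on a hypothetical two-asset example (the numbers are illustrative, not from Table 1) by a simple grid scan over the long-only simplex; a nonlinear solver would be used in practice. Note that the objective contains no \zeta(\alpha,\Theta) term, so the resulting portfolio is optimal for every qualified distribution simultaneously:

```python
import math

# Hypothetical two-asset example (illustrative numbers only).
eta = [0.10, 0.14]                      # expected returns
cov = [[0.019, 0.012], [0.012, 0.030]]  # covariance matrix
x = 0.16                                # bPOE loss threshold

def objective(w1, x):
    """Ratio (w^T eta + x) / sqrt(w^T Sigma w) from problem (6)."""
    w = [w1, 1.0 - w1]
    mean = w[0] * eta[0] + w[1] * eta[1]
    var = sum(w[i] * cov[i][j] * w[j] for i in range(2) for j in range(2))
    return (mean + x) / math.sqrt(var)

# Grid scan over w1 in [0, 1] with w2 = 1 - w1 (long-only, fully invested).
# The argmax does not involve any distribution-dependent quantity, so it is
# the same portfolio under Normal, Student-t, Laplace, or Logistic assumptions.
best_w1 = max((i / 1000 for i in range(1001)), key=lambda w1: objective(w1, x))
```

Changing the assumed distribution would only change the optimal bPOE value (via \zeta), not the optimal weights.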

4.1.3 Numerical Demonstration

In this example, we consider a global equity portfolio that consists of 6 market portfolios - U.S., Japan, U.K., Germany, France, and Switzerland, represented by the corresponding MSCI indices - MXUS, MXJP, MXGB, MXDE, MXFR, MXCH. Parameters of returns for portfolio components are provided in Table 1 (source: Capital IQ sample of monthly returns from April 1987 to April 1996, annualized).

Table 1: Portfolio Return Data
Asset ticker | Expected return | Standard deviation | Correlations (MXUS, MXJP, MXGB, MXDE, MXFR, MXCH)
MXUS 10.25% 13.79% 1 0.190041 0.639133 0.481857 0.499406 0.605384
MXJP 6.90% 26.05% 0.190041 1 0.450337 0.251601 0.378753 0.373964
MXGB 8.81% 19.16% 0.639133 0.450337 1 0.579918 0.584215 0.654687
MXDE 9.15% 20.31% 0.481857 0.251601 0.579918 1 0.753072 0.628426
MXFR 8.83% 20.40% 0.499406 0.378753 0.584215 0.753072 1 0.580626
MXCH 13.85% 17.45% 0.605384 0.373964 0.654687 0.628426 0.580626 1

This problem was solved using a non-linear programming algorithm; the results are provided in Table 2. The corresponding values of \lambda are also provided, which allows deriving the same portfolios with a standard MVO solver based on quadratic programming.

Table 2: Optimal Superquantile (CVaR) Portfolios
Asset ticker | Min-risk portfolio | CVaR 99% optimal portfolios: normal, t (df=3), Laplace, logistic | CVaR 95% optimal portfolios: normal, t (df=3), Laplace, logistic
MXUS 70.99% 65.80% 67.59% 67.03% 66.53% 64.23% 64.78% 65.05% 64.64%
MXJP 13.98% 9.61% 11.11% 10.64% 10.21% 8.28% 8.74% 8.97% 8.62%
MXGB 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXDE 9.24% 2.87% 5.07% 4.37% 3.76% 0.95% 1.61% 1.94% 1.44%
MXFR 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXCH 5.79% 21.72% 16.22% 17.96% 19.50% 26.54% 24.87% 24.04% 25.30%
Return 9.89% 10.68% 10.40% 10.49% 10.57% 10.91% 10.83% 10.79% 10.85%
St.dev. 12.86% 13.01% 12.93% 12.95% 12.97% 13.11% 13.08% 13.06% 13.09%
λ 20.48 31.28 26.82 23.80 15.73 17.11 17.88 16.73

Table 2 shows that the superquantile-optimal portfolios are not the same as the global minimum variance portfolio (the min-risk portfolio), but are quite close to it. Distributional assumptions play a role in the composition of the optimal portfolios, with the Student-t distribution producing the most conservative allocation for CVaR 99%. The differences between the optimal portfolios for CVaR 95%, however, are insignificant.

We can also note from (5) that if the portfolio return is constrained from below, then unless this constraint is very close to the return of the global minimum variance portfolio, superquantile optimization becomes essentially equivalent to variance minimization. Similarly, if risk is constrained from above, superquantile optimization becomes equivalent to return maximization.

Using the same set of assets, we also solved the bPOE optimization problem (4) with thresholds x = 0.16 and x = 0.25 (i.e., losses exceeding 16% and 25% of the initial portfolio value, respectively), l = 0, and u = 1. Table 3 shows the results: the minimal bPOE achieved, the optimal portfolio composition and parameters, and the CVaR of the optimal portfolios under each distribution.

Table 3: Optimal bPOE Portfolios
Assumed distribution normal t (df=3) Laplace logistic normal t (df=3) Laplace logistic
bPOE threshold, x 16% 25%
bPOE value, \bar{p}_x(X) 5.13% 6.21% 7.46% 6.36% 0.80% 2.93% 2.81% 1.86%
Asset ticker bPOE-optimal portfolio composition
MXUS 64.20% 64.19% 64.20% 64.20% 65.95% 65.95% 65.95% 65.95%
MXJP 8.26% 8.27% 8.25% 8.25% 9.73% 9.73% 9.73% 9.73%
MXGB 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXDE 0.90% 0.91% 0.90% 0.90% 3.05% 3.05% 3.06% 3.05%
MXFR 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
MXCH 26.64% 26.63% 26.64% 26.65% 21.27% 21.27% 21.27% 21.27%
Return 10.92% 10.92% 10.92% 10.92% 10.65% 10.65% 10.65% 10.65%
St.dev. 13.12% 13.12% 13.12% 13.12% 13.00% 13.00% 13.00% 13.00%
Test distribution CVaR for the test distributions
normal 16.00% 14.93% 13.87% 14.79% 25.00% 18.95% 19.16% 21.16%
t (df=3) 18.14% 16.00% 14.05% 15.74% 46.31% 25.00% 25.56% 31.46%
Laplace 19.48% 17.70% 16.00% 17.48% 36.62% 24.61% 25.00% 28.79%
logistic 17.61% 16.18% 14.81% 16.00% 31.14% 21.71% 22.01% 25.00%

All optimal solutions were generated by the non-linear programming algorithm under different distributional assumptions, and the results support the conclusion that the optimal portfolio composition does not depend on the distribution (the small discrepancies are due to the accuracy of the optimization algorithm).

5 Parametric Density Estimation with Superquantiles

One of the motivations for providing closed-form superquantile formulas is that they can be used within common parametric estimation frameworks. The Exponential, Pareto/GPD, Laplace, Normal, LogNormal, Logistic, Student-t, Weibull, LogLogistic, and GEV represent a wide range of distributions that can now be utilized within these parametric procedures, but with superquantiles incorporated into the fitting criteria. We illustrate this idea by proposing a simple variation of the Method of Moments (MM), which we call the Method of Superquantiles (MOS), where superquantiles at varying levels of \alpha take the place of moments. Our numerical example utilizes a heavy-tailed Weibull to illustrate MOS, since it is particularly well-suited for asymmetric heavy-tailed data. However, any of the listed distributions could be used as well.

5.1 Method of Superquantiles

The MM is a well-known tool for estimating the parameters of a distribution when moments are available in parametric form and the target moments are either assumed to be known or measured from empirical observations. It looks for the distribution f_\Theta(x), parameterized by \Theta, whose moments equal the known or, if unknown, the empirical moments. With n moments used, the problem reduces to solving a system of n equations w.r.t. the parameter set \Theta of the distribution family.

This method can be generalized by replacing moments with other distributional characteristics, such as the superquantile and quantile. We utilize superquantiles in this context. The method provides flexibility through the choice of different \alpha values, allowing the user to focus the fitting procedure on particular portions of the distribution. This flexibility is advantageous compared to methods such as MM or maximum likelihood (ML), which treat each portion of the distribution equally. When fitting the tail is important, for example, and there are many samples around the mean but few in the tail, it can be desirable to focus the fitting procedure on carefully fitting the tail samples. As will be shown, one can focus MOS by the choice of \alpha. One will see that this procedure is similar to fitting with Probability Weighted Moments (PWM), also sometimes called L-moments, but MOS is much more straightforward, with superquantiles being far easier to interpret than PWMs.

We formulate the following problem, where \hat{\bar{q}}_\alpha(X) denotes either a known superquantile or an empirical estimate from a sample of X, and \bar{q}_\alpha(X_{f_\Theta}) denotes the parametric superquantile formula when X has density function f_\Theta with parameter set \Theta:

Method of Superquantiles: Fix \alpha_1, \dots, \alpha_k \in [0,1] and choose a parametric distribution family f_\Theta with parameters \Theta. Solve for \Theta such that

\bar{q}_{\alpha_i}(X_{f_\Theta}) = \hat{\bar{q}}_{\alpha_i}(X) \text{ for all } i = 1, \dots, k,

which is a system of k equations in |\Theta| unknowns.
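As a minimal illustration of MOS with k = 1, consider the Exponential distribution, whose right-tail superquantile is \bar{q}_\alpha(X) = (1 - \ln(1-\alpha))/\lambda; matching a single empirical superquantile then yields \lambda in closed form. The sketch below (function names are ours) recovers the rate from simulated data:

```python
import math
import random

def empirical_superquantile(sample, alpha):
    """Average of the largest (1 - alpha)-fraction of the sample."""
    tail = sorted(sample)[int(alpha * len(sample)):]
    return sum(tail) / len(tail)

# MOS with k = 1 for Exponential(lam): the parametric superquantile is
# qbar_alpha = (1 - ln(1 - alpha)) / lam, so matching one empirical
# superquantile gives lam directly.
random.seed(0)
true_lam, alpha = 2.0, 0.9
sample = [random.expovariate(true_lam) for _ in range(20_000)]
lam_hat = (1 - math.log(1 - alpha)) / empirical_superquantile(sample, alpha)
assert abs(lam_hat - true_lam) < 0.1
```

With more parameters (|\Theta| > 1), one would match superquantiles at several levels \alpha_1, \dots, \alpha_k, as in the definition above.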

This system, however, may not have a solution; for example, when k = 2 and the parametric family has only a single parameter (i.e., |\Theta| = 1). In this case, one can solve the following surrogate least-squares minimization problem:

LS Method of Superquantiles (LS-MOS): Fix \alpha_1, \dots, \alpha_k \in [0,1] and choose a parametric distribution family f_\Theta with parameters \Theta. Choose weights c_1, \dots, c_k > 0 and solve

\Theta \in \underset{\Theta}{\operatorname{argmin}}\ \sum_i c_i \left( \bar{q}_{\alpha_i}(X_{f_\Theta}) - \hat{\bar{q}}_{\alpha_i}(X) \right)^2.

This procedure finds the distribution whose superquantiles are close to the empirical superquantiles. The freedom to select the \alpha_i as well as the weights c_i gives the user considerable flexibility over which portions of the distribution should match the empirical superquantiles most closely.

5.1.1 Example Customization: Conservative Tail Fitting

When the sample size is small and the tail of the distribution at hand is long, the tail will likely be difficult to characterize from empirical data, since (with high probability) few observations will fall in the tail. The proposed method of superquantiles, however, can easily be made more conservative relative to the empirical data in an intuitive way. For example, one could impose the following condition, where \epsilon_i is a pre-specified constant such that 0 < \epsilon_i \le \alpha_i:

\bar{q}_{\alpha_i - \epsilon_i}(X_{f_\Theta}) = \hat{\bar{q}}_{\alpha_i}(X).

Or, for the least-squares variant, one can solve the problem

\min_\Theta \sum_i c_i \left( \bar{q}_{\alpha_i - \epsilon_i}(X_{f_\Theta}) - \hat{\bar{q}}_{\alpha_i}(X) \right)^2.

Notice that these conditions effectively assume that the empirical superquantile has underestimated the true tail expectation, which is often the case with heavy-tailed distributions.

5.1.2 Example: Weibull Distribution Fitting

We illustrate the basic method by fitting a Weibull distribution, with \Theta = (\lambda, k), from a small sample of 50 observations. We took two independent samples, denoted S_1 and S_2, of size 50 from a Weibull with \lambda = .5, k = 1.4. We then estimated the Weibull parameters using MM, ML, and LS-MOS. The MM was solved using the first two moments. The LS-MOS was solved twice. It was first solved with \alpha_1 = .15, \alpha_2 = .75, c_1 = c_2 = 1, a choice made to mimic the behavior of MM and ML, where the fit emphasizes most of the observed data. To put more emphasis on the tail observations, it was also solved with \alpha_1 = .5, \alpha_2 = .75, \alpha_3 = .95, c_1 = c_2 = c_3 = 1. We denote these solutions as LS1 and LS2, respectively. The ML solution is available in closed form, and we solved MM, LS1, and LS2 using Scipy's optimization library (specifically, the leastsq function, which implements MINPACK's lmdif routine; this routine requires function values and approximates the Jacobian by forward differences).
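The following is a hedged, stdlib-only sketch of the LS2 variant of LS-MOS (our illustration; the actual experiments used Scipy's leastsq). Here the parametric Weibull superquantile is computed by numerically averaging the quantile q_p = \lambda(-\ln(1-p))^{1/k} over the tail, and the least-squares problem is solved by a coarse grid search:

```python
import math
import random

def weibull_superquantile(lam, k, alpha, n=500):
    """Right-tail superquantile (CVaR) of Weibull(lam, k): numerically average
    the quantile q_p = lam * (-ln(1 - p))^(1/k) over p in (alpha, 1)."""
    total = 0.0
    for i in range(n):
        p = alpha + (i + 0.5) / n * (1 - alpha)
        total += lam * (-math.log(1 - p)) ** (1 / k)
    return total / n

def empirical_superquantile(sample, alpha):
    tail = sorted(sample)[int(alpha * len(sample)):]
    return sum(tail) / len(tail)

# Sample mimicking the experiment: 50 draws from Weibull(lam=.5, k=1.4).
random.seed(1)
sample = [random.weibullvariate(0.5, 1.4) for _ in range(50)]

# LS2 setup: alpha_1=.5, alpha_2=.75, alpha_3=.95 with unit weights.
alphas, c = [0.5, 0.75, 0.95], [1.0, 1.0, 1.0]
targets = [empirical_superquantile(sample, a) for a in alphas]

def loss(lam, k):
    return sum(ci * (weibull_superquantile(lam, k, a) - t) ** 2
               for ci, a, t in zip(c, alphas, targets))

# Coarse grid search in place of a least-squares solver.
lam_hat, k_hat = min(((l / 100, kk / 100)
                      for l in range(20, 101, 2)      # lam in [0.20, 1.00]
                      for kk in range(60, 301, 5)),   # k in [0.60, 3.00]
                     key=lambda th: loss(*th))
```

With only 50 observations the estimates vary from sample to sample, which is exactly the behavior the figures below examine.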

Looking at Figure 2 for S_1 and Figure 3 for S_2, we see that the LS1 fit is, indeed, much like the MM and ML fits for both data sets. However, the LS2 fit is the best in both cases. The ML, MM, and LS1 methods put too much emphasis on the observations around the mode, while the LS2 fit puts appropriate emphasis on the less frequent observations in the tail.

It is also important to notice how the differences between S_1 and S_2 affect the fit from each method. Comparing the two samples, we see that they differ in the observed density over the lower portion of the range. This is directly reflected in the fits given by MM, ML, and LS1: compared to their fits on S_1, they more heavily favor the left side of the distribution. The LS2 fit, however, is robust to these differences between data sets and, by focusing on the tail, remains mostly unchanged from the fit on S_1. This is the intended effect of selecting larger values of \alpha in LS2.

We duplicated this procedure on a heavier-tailed Weibull. We took 50 samples from a Weibull with true parameters k = 1, \lambda = .5 and fit MM, ML, LS1, and LS2 using the empirical data. Figures 4 and 5 highlight different aspects of the resulting fits. LS2 clearly provides the best fit, with Figure 5 in particular showing that MM, ML, and LS1 underestimate the tail densities; these methods put more emphasis on fitting the observations around the mode. As intended, LS2 focuses more on fitting the right-tail observations and arrives at a better fit.

Figure 2: Fits using sample S_1. PDFs displayed with the normalized histogram of the S_1 sample in the background.
Figure 3: Fits using sample S_2. PDFs displayed with the normalized histogram of the S_2 sample in the background.
Figure 4: Left side of the distribution for fits on the sample from the Weibull with true k = 1, \lambda = .5.
Figure 5: Right side of the distribution for fits on the sample from the Weibull with true k = 1, \lambda = .5.

5.1.3 Constrained Likelihood and Entropy Maximization

While we focused primarily on a variant of the method of moments, the formulas provided for superquantiles and bPOE can be used in other parametric procedures. For example, one could consider a constrained variant of the maximum likelihood or maximum entropy method, where superquantile constraints are introduced. Letting H(f_\Theta) denote the entropy of the random variable with density function f_\Theta, and y_i denote an observation, constrained maximum likelihood and entropy maximization can be set up as follows:

ML: \max_{f_\Theta \in \mathcal{F}} \sum_i \log(f_\Theta(y_i)), \qquad ME: \max_{f_\Theta \in \mathcal{F}} H(f_\Theta)

where \mathcal{F} = \{ f_\Theta \;|\; \bar{q}_{\alpha_i}(X_{f_\Theta}) \le \hat{\bar{q}}_{\alpha_i}(X) \;\; \forall i = 1, \dots, k \}.

While we leave full exploration of this framework for future work, this simple formulation illustrates another potential use for the provided superquantile and bPOE formulas within traditional parametric frameworks.

6 Conclusion

In this paper, we first derived closed-form formulas for the superquantile and bPOE, then utilized them within parametric portfolio optimization and density estimation problems. We derived superquantile formulas for a variety of distributions, including ones with exponential tails (Exponential, Pareto/GPD, Laplace), symmetric distributions (Normal, Laplace, Logistic, Student-t), and asymmetric distributions with heavy tails (LogNormal, Weibull, LogLogistic, GEV). For bPOE, while we had less success deriving truly closed-form expressions, we saw that it can still be calculated by solving a one-dimensional convex optimization problem or a one-dimensional root-finding problem.

We then utilized these formulas to develop two parametric procedures, one in portfolio optimization and one in density estimation. We first found that the Normal, Laplace, Student-t, Logistic, and GEV distributions all yield tractable superquantile and bPOE portfolio optimization problems. Furthermore, we found that bPOE-optimal portfolios are more robust to changing distributional assumptions than superquantile-optimal portfolios; specifically, bPOE-optimal portfolios are optimal, simultaneously, for an entire class of distributions. Finally, we presented a variation on the method of moments where moments are replaced by superquantiles. This parametric procedure is made possible by our closed-form formulas, and we illustrated its use on heavy-tailed asymmetric data, where additional emphasis on fitting the tail via superquantile conditions can be highly desirable. We find that this method makes it easy to direct the focus of the fitting procedure toward tail samples.

References

  • Andreev et al. (2005) Andreev A, Kanto A, Malo P (2005) On closed-form calculation of CVaR. Helsinki School of Economics Working Paper W-389
  • Artzner et al. (1999) Artzner P, Delbaen F, Eber JM, Heath D (1999) Coherent measures of risk. Mathematical Finance 9:203–228
  • Davis and Uryasev (2016) Davis JR, Uryasev S (2016) Analysis of tropical storm damage using buffered probability of exceedance. Natural Hazards 83(1):465–483
  • Karian and Dudewicz (1999) Karian ZA, Dudewicz EJ (1999) Fitting the generalized lambda distribution to data: a method based on percentiles. Communications in Statistics-Simulation and Computation 28(3):793–819
  • Khokhlov (2016) Khokhlov V (2016) Portfolio value-at-risk optimization. Wschodnioeuropejskie Czasopismo Naukowe 13(2):107–113
  • Landsman and Valdez (2003) Landsman ZM, Valdez EA (2003) Tail conditional expectation for elliptical distributions. North American Actuarial Journal 7(4):55–71
  • Mafusalov and Uryasev (2018) Mafusalov A, Uryasev S (2018) Buffered probability of exceedance: Mathematical properties and optimization. SIAM Journal on Optimization 28(2):1077–1103
  • Mafusalov et al. (2018) Mafusalov A, Shapiro A, Uryasev S (2018) Estimation and asymptotics for buffered probability of exceedance. European Journal of Operational Research 270(3):826–836
  • Norton and Uryasev (2016) Norton M, Uryasev S (2016) Maximization of AUC and buffered AUC in binary classification. Mathematical Programming pp 1–38
  • Norton et al. (2017) Norton M, Mafusalov A, Uryasev S (2017) Soft margin support vector classification as buffered probability minimization. The Journal of Machine Learning Research 18(1):2285–2327
  • Rockafellar and Royset (2010) Rockafellar R, Royset J (2010) On buffered failure probability in design and optimization of structures. Reliability Engineering & System Safety 95:499–510
  • Rockafellar and Uryasev (2000) Rockafellar R, Uryasev S (2000) Optimization of conditional value-at-risk. The Journal of Risk 2(3):21–41
  • Rockafellar and Royset (2014) Rockafellar RT, Royset JO (2014) Random variables, monotone relations, and convex analysis. Mathematical Programming 148(1-2):297–331
  • Rockafellar and Uryasev (2002) Rockafellar RT, Uryasev S (2002) Conditional value-at-risk for general loss distributions. Journal of banking & finance 26(7):1443–1471
  • Sgouropoulos et al. (2015) Sgouropoulos N, Yao Q, Yastremiz C (2015) Matching a distribution by matching quantiles estimation. Journal of the American Statistical Association 110(510):742–759
  • Shang et al. (2018) Shang D, Kuzmenko V, Uryasev S (2018) Cash flow matching with risks controlled by buffered probability of exceedance and conditional value-at-risk. Annals of Operations Research 260(1-2):501–514
  • Uryasev (2014) Uryasev S (2014) Buffered probability of exceedance and buffered service level: Definitions and properties. Department of Industrial and Systems Engineering, University of Florida, Research Report 3