Quantitative Statistical Robustness for Tail-Dependent Law Invariant Risk Measures
Abstract
When estimating the risk of a financial position with empirical data or Monte Carlo simulations via a tail-dependent law invariant risk measure such as the Conditional Value-at-Risk (CVaR), it is important to ensure the robustness of the statistical estimator, particularly when the data contain noise. Krätschmer et al. [1] propose a new framework to examine the qualitative robustness of estimators for tail-dependent law invariant risk measures on Orlicz spaces, which is a step further from the earlier work on the robustness of risk measurement procedures by Cont et al. [2]. In this paper, we follow this stream of research and propose a quantitative approach for verifying the statistical robustness of tail-dependent law invariant risk measures. A distinct feature of our approach is that we use the Fortet-Mourier metric to quantify the variation of the true underlying probability measure when analysing the discrepancy between the laws of the plug-in estimators of a law invariant risk measure based on the true data and on perturbed data. This enables us to derive an explicit error bound for the discrepancy when the risk functional is Lipschitz continuous with respect to a class of admissible laws. Moreover, the newly introduced notion of Lipschitz continuity allows us to examine the degree of robustness for tail-dependent risk measures. Finally, we apply our quantitative approach to some well-known risk measures to illustrate our theory.
Keywords. Quantitative robustness, tail-dependent law invariant risk measures, Fortet-Mourier metric, admissible laws, index of quantitative robustness.
1 Introduction
One of the main purposes of quantitative modeling in finance is to quantify the loss of a financial portfolio. Over the past two decades, various risk measures have been proposed for measuring the risk of financial portfolios. A risk measure is represented as a map assigning an extended real number (a measure of risk) to each random loss under an implicit assumption that the true loss probability distribution is known. However, in practice, the true probability distribution is often unknown or it is prohibitively expensive to calculate the risk using the true distribution. Thus, in applications, evaluating the risk of a random variable representing the loss of a financial position often involves two steps: estimating the probability distribution from available observations or from the sampling data of the random financial loss via, e.g., Monte Carlo method and then plugging the estimated distribution into a risk measure to quantify the financial loss. This is because the risk measures are mostly law invariant, that is, they are determined only by the probability distributions of random variables. For the loss of a financial portfolio, a measure of risk computed based on the estimated distribution is known as a plug-in estimate for the risk measure [3].
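Let $X$ denote the random loss of a financial portfolio on a probability space $(\Omega,\mathcal{F},\mathbb{P})$ and $\rho$ be a law invariant risk measure. The plug-in estimate for $\rho(X)$ is given by $\varrho(P_N)$, where $P_N$ is the empirical distribution based on available observations and $\varrho$ is a risk functional defined by
(1.1) $\varrho(P) := \rho(X)$, with $P$ the law of $X$,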
see e.g. [4, 5]. In the literature, Cont et al. [2] first study the quality of statistical estimators of the law invariant risk measures using Hampel’s classical concept of qualitative robustness [6], that is, a risk functional estimator is said to be qualitatively robust if it is insensitive to the variation of the sampling data. The research is important because perceived data (particularly empirical data) may contain some noise. Without such insensitivity, financial activities based on the risk measures may cause damage. For instance, when is applied to allocate the risk capital for an insurance company, altering the capital allocation may be costly. According to Hampel’s theorem, Cont et al. [2] demonstrate that the qualitative robustness of a statistical estimator is equivalent to the weak continuity of the risk functional, and that value at risk (VaR) is qualitatively robust whereas conditional value at risk (CVaR) is not.
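The contrast between VaR and CVaR can be illustrated on a toy data set. The following Python sketch is an illustration only, not a proof of (non-)robustness. It assumes that losses are positive when bad, takes the lower empirical quantile as the plug-in VaR, and computes the plug-in CVaR through the Rockafellar-Uryasev minimization formula; the helper names `empirical_var` and `empirical_cvar` and the data are hypothetical.

```python
import math

def empirical_var(sample, alpha):
    """Lower empirical alpha-quantile of the losses (plug-in VaR)."""
    xs = sorted(sample)
    # Smallest k with k/n >= alpha; the tiny offset guards against float round-off.
    k = max(1, math.ceil(alpha * len(xs) - 1e-9))
    return xs[k - 1]

def empirical_cvar(sample, alpha):
    """Plug-in CVaR via min_t { t + E[(X - t)^+] / (1 - alpha) }.

    For an empirical distribution the objective is piecewise linear and convex,
    so the minimum is attained at one of the sample points.
    """
    n = len(sample)
    return min(
        t + sum(max(x - t, 0.0) for x in sample) / ((1.0 - alpha) * n)
        for t in sample
    )

clean = [float(i) for i in range(1, 101)]   # losses 1, 2, ..., 100
noisy = clean[:-1] + [10_000.0]             # largest observation replaced by an outlier

for name, data in (("clean", clean), ("noisy", noisy)):
    print(name, empirical_var(data, 0.95), round(empirical_cvar(data, 0.95), 6))
```

In this toy data set the VaR estimate at level 0.95 equals 95 for both samples, whereas the CVaR estimate moves from about 98 to about 2078 after a single observation is corrupted.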
Krätschmer et al. [7] argue that the use of Hampel’s classical concept of qualitative robustness may be problematic because it essentially requires the risk measure to be insensitive to the tail behaviour of the random variable, whereas the recent financial crisis shows that a faulty estimate of tail behaviour can lead to a drastic underestimation of the risk. Consequently, they propose a refined notion of qualitative robustness that also applies to tail-dependent statistical functionals and that allows us to compare statistical functionals with regard to their degree of robustness. The new concept captures the trade-off between robustness and sensitivity and can be quantified by an index of qualitative robustness. Furthermore, under the new concept, Krätschmer et al. [1] analyze the qualitative robustness of law-invariant convex risk measures on Orlicz spaces and show that CVaR and spectral risk measures are all qualitatively robust when the perturbation of the probability distribution is restricted to a finer topological space. Alternative generalizations of Hampel’s theorem can be found for strong mixing data (Zähle [8, 9]) and for stochastic processes in various ways (Boente et al. [10] and Strohriegl and Hable [11]). For a comprehensive study of statistical robustness, we refer readers to [12, 13, 14, 15] and references therein.
In this paper, we take a step further by deriving an error bound for the plug-in estimators of law invariant risk measures in terms of the variation of data and we call the analysis quantitative because no such error bound is established in the existing qualitative robust analysis. This is achieved by adopting different metrics to measure the discrepancy of the estimators and the variation of data. Specifically, we use the Fortet-Mourier metrics as opposed to the Lévy distance in Cont et al. [2] or the weighted Kolmogorov metric in Krätschmer et al. [7] to quantify the data variation (the perturbation of the true probability distributions). Moreover, we introduce a new notion of the so-called admissible laws, which effectively restrict the scope of data variation. The new metrics enable us to establish an explicit relationship between the discrepancy of the laws of the plug-in estimators (of law invariant risk measure based on the true data and perturbed data) and the discrepancy of the associated probability distributions of the data. The research is inspired by the recent work of Guo and Xu [16] where the authors derive quantitative statistical robustness for preference robust optimization models under Kantorovich metric. The main contributions of the paper can be summarized as follows.
First, we introduce the notion of admissible laws induced by a probability metric, which is a class of probability distributions whose discrepancy with the law of the Dirac measure at $0$ is finite. The admissibility effectively restricts the scope of data perturbation. Using the notion, we compare the admissibility under the $\phi$-topology and the Fortet-Mourier metric.
Second, we propose to use the Fortet-Mourier metric to quantify the variation of the probability measure. The metric enables us to establish an explicit relationship between the discrepancy of the laws of the plug-in estimators of law invariant risk measure based on the true data and perturbed data by noise and the change of the true underlying probability measures when the risk functional is Lipschitz continuous on a class of admissible laws. We find that the risk functionals associated with the general moment-type convex risk measures are Lipschitz continuous.
Third, we introduce the concept of Lipschitz continuity for a general statistical functional on a class of admissible laws induced by the Fortet-Mourier metric and find that, for a Lipschitz continuous risk measure, the parameter of the Fortet-Mourier metric allows us to compare tail-dependent risk measures with regard to their degree of robustness, i.e., the index of statistical robustness.
Fourth, we apply the new approach to examine the quantitative statistical robustness of a range of well-known risk measures, including CVaR, the optimized certainty equivalent and the shortfall risk measure. We conclude that, under mild conditions, they are all quantitatively robust, and we also calculate their indexes of quantitative robustness.
The rest of the paper is organized as follows. In Section 2, we set up the background of the research problem. In Section 3, we introduce the Fortet-Mourier metric and admissible laws. In Section 4, we establish the quantitative statistical robustness theory and compare it with the qualitative statistical robustness theory. In Section 5, we apply our theory to risk measures and give some examples. Some technical details are given in the appendix.
2 Problem statement
In this section, we discuss the background of statistical robustness in the context of law invariant risk measures. We begin with a brief review of law invariant risk measures and their estimation, and then explain the issues that arise when the data may contain noise.
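Let $(\Omega,\mathcal{F},\mathbb{P})$ be an atomless probability space, where $\Omega$ is a sample space with sigma algebra $\mathcal{F}$ and $\mathbb{P}$ is a probability measure. Let $X:\Omega\to\mathbb{R}$ be a financial loss and $P:=\mathbb{P}\circ X^{-1}$ be the law or the probability distribution of $X$. For $p\ge 1$, let $L^p(\Omega,\mathcal{F},\mathbb{P})$ ($L^p$ for short) denote the space of random variables mapping from $\Omega$ to $\mathbb{R}$ with finite $p$-th order moments. We say that a map $\rho:L^p\to\overline{\mathbb{R}}$ is a convex risk measure [18] if it satisfies the following properties (footnote 1: we note that the canonical model space for law invariant convex risk measures is $L^1$ [17]):
(i) Monotonicity: $\rho(X)\le\rho(Y)$ for $X,Y\in L^p$ with $X\le Y$ $\mathbb{P}$-almost surely;
(ii) Translation invariance: $\rho(X+c)=\rho(X)+c$ for $X\in L^p$ and $c\in\mathbb{R}$;
(iii) Convexity: $\rho(\lambda X+(1-\lambda)Y)\le\lambda\rho(X)+(1-\lambda)\rho(Y)$ for $X,Y\in L^p$ and $\lambda\in[0,1]$.
Moreover, if $\rho$ satisfies positive homogeneity, i.e., $\rho(\lambda X)=\lambda\rho(X)$ for any $X\in L^p$, $\lambda\ge 0$, then $\rho$ is a coherent risk measure, see [19, 18] for the original definitions of these concepts. A risk measure $\rho$ is said to be law invariant if $\rho(X)=\rho(Y)$ for $X$ and $Y$ having the same law. We refer readers to Föllmer and Weber [20] for a recent overview of risk measures.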
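The properties above can be checked numerically on a finite scenario space. The following Python sketch is a sanity check under stated assumptions rather than a proof: it takes the loss convention $\rho(X+c)=\rho(X)+c$, uses the average of the $k$ largest losses over $n$ equally likely scenarios (CVaR at level $1-k/n$) as the risk measure, and works with made-up loss profiles; the helper name `cvar_top_k` is hypothetical.

```python
def cvar_top_k(losses, k):
    """Average of the k largest losses: CVaR at level 1 - k/n when the
    n scenarios are equally likely."""
    return sum(sorted(losses)[-k:]) / k

# Two loss profiles on the same five equally likely scenarios (made-up numbers).
X = [1.0, -2.0, 4.0, 0.5, 3.0]
Y = [2.0, -1.0, 4.5, 0.5, 6.0]   # Y >= X in every scenario
k, c, lam = 2, 1.5, 0.3
tol = 1e-12                      # tolerance for floating-point comparisons

mix = [lam * x + (1 - lam) * y for x, y in zip(X, Y)]

monotone = cvar_top_k(X, k) <= cvar_top_k(Y, k) + tol
translation = abs(cvar_top_k([x + c for x in X], k) - (cvar_top_k(X, k) + c)) < tol
convex = cvar_top_k(mix, k) <= lam * cvar_top_k(X, k) + (1 - lam) * cvar_top_k(Y, k) + tol
homogeneous = abs(cvar_top_k([2.0 * x for x in X], k) - 2.0 * cvar_top_k(X, k)) < tol

print(monotone, translation, convex, homogeneous)
```

All four checks hold for this example, which is consistent with CVaR being a coherent risk measure; a single numerical instance of course does not establish the properties in general.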
As discussed in [2, 7], it is a widely-accepted procedure to estimate the risk of a financial loss by means of a Monte Carlo method or from a set of available observations. Such a procedure is particularly sensible when is law invariant. The following proposition states that the law invariance of a risk measure is equivalent to the existence of a risk functional in (1.1).
Proposition 2.1
Let denote the set of all probability measures on . If is a law invariant risk measure, then there exists a unique risk functional associated with such that for any ,
(2.1) |
The result is well-known, see for instance Delage et al. [21] for random variables defined in . The usefulness of the representation is that it naturally captures the law invariance and allows one to define any law invariant risk measure directly over the space of probability measures induced by random variables in (also known as probability distributions), see Frittelli et al. [22]. Dentcheva and Ruszczyński [23] take it further to define a class of law invariant risk measures in the space of quantile functions directly. In a more recent development, Haskell et al. [24] extend the research to a broad class of multi-attribute choice functions defined over the space of survival functions. Let be the push-forward probability measure on induced by . Since coincides with ( for short), we also call the distribution or the law of interchangeably throughout the paper. Consequently, we can write (2.1) as (1.1).
In this paper, we are not concerned with the definition of risk measures over the space of probability distributions or the space of quantile functions; rather, we concentrate on the stability of statistical estimators of law invariant risk measures. The risk functional $\varrho$ with the law $P$ can be used in a natural way to construct an estimator for the risk of $X$. All one needs to do is to take an estimate $P_N$ of $P$ based on the available observations of $X$ and then to plug this estimator into the risk functional to obtain the desired estimator of $\varrho(P)$, i.e.,
(2.2) $\hat{\varrho}_N := \varrho(P_N)$,
where in this paper, $P_N$ can be seen as the empirical distribution of an independent and identically distributed (i.i.d., for short) sequence $X_1,\dots,X_N$ of historical observations or Monte Carlo simulations, i.e.,
(2.3) $P_N(x) := \frac{1}{N}\sum_{i=1}^{N}\mathbb{1}_{\{X_i\le x\}}$, $x\in\mathbb{R}$.
Here and later on $\mathbb{1}_A$ denotes the indicator function of event $A$. Indeed, the estimate can be fairly general: for instance, it can be a smoothed empirical distribution based on uncensored data or an empirical distribution based on censored data, see, e.g., [3], or an empirical distribution based on identically distributed dependent data, see, e.g., [9].
We can see that is a mapping from to . Figure 1 illustrates the relationship between the risk functionals, their estimators and the spaces associated.
In practice, the samples obtained from empirical data may contain noise. In that case, we might regard the samples as generated by a perturbed random variable $\tilde{X}$ with law $Q$, that is, $Q=\mathbb{P}\circ\tilde{X}^{-1}$. Let $\tilde{X}_1,\dots,\tilde{X}_N$ be i.i.d. samples from $Q$. Then the practical empirical distribution function for estimating the law of $\tilde{X}$ is
(2.4) $Q_N(x) := \frac{1}{N}\sum_{i=1}^{N}\mathbb{1}_{\{\tilde{X}_i\le x\}}$, $x\in\mathbb{R}$,
and the practical estimator is $\varrho(Q_N)$ with perceived empirical data whereas $\varrho(P_N)$ is a statistical estimator with noise being detached. Since we are unable to obtain the latter, we tend to use the former as a statistical estimator of $\varrho(P)$ and this works only if the two estimators are sufficiently close.
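To quantify the closeness, we may look into the discrepancy between the laws of the two estimators under some metric $\mathsf{dl}$, i.e.,
(2.5) $\mathsf{dl}\big(P^{\otimes N}\circ\hat{\varrho}_N^{-1},\; Q^{\otimes N}\circ\hat{\varrho}_N^{-1}\big)$,
where $\hat{\varrho}_N$ denotes the plug-in estimator viewed as a mapping from $\mathbb{R}^N$ to $\mathbb{R}$, $P^{\otimes N}$ and $Q^{\otimes N}$ denote the probability measures on the measurable space $(\mathbb{R}^N,\mathcal{B}(\mathbb{R})^{\otimes N})$ with marginals $P$ and $Q$ on each $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ respectively, and $\mathcal{B}(\mathbb{R})$ denotes the Borel sigma algebra of $\mathbb{R}$. Since neither $P$ nor $Q$ is known, we want the discrepancy to be uniformly small for all $P$ and $Q$ over a subset of admissible laws on $\mathbb{R}$ so long as $Q$ is sufficiently close to $P$ under some metric $\mathsf{dl}'$. The uniformity may be interpreted as robustness. Qualitative robustness refers to the case that the relationship between (2.5) and $\mathsf{dl}'(P,Q)$ is implicit, whereas quantitative robustness refers to the case that the relationship is explicit, i.e., a function of the latter can be used to bound the former. This is what we aim to achieve in this paper because qualitative robustness has been well investigated, for instance, in [2, 1, 7].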
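The discrepancy between the laws of the two estimators can be approximated by simulation. The following Python sketch is a minimal illustration under several assumptions that are not taken from the paper: the metric on the estimator laws is the Kantorovich metric, the risk functional is CVaR (average of the `k` largest losses), the true law is standard normal, and the perturbed law mixes in 2% gross errors. All function names are hypothetical.

```python
import random

def cvar_top_k(losses, k):
    """Plug-in CVaR: average of the k largest observed losses."""
    return sum(sorted(losses)[-k:]) / k

def kantorovich_1d(a, b):
    """Kantorovich (Wasserstein-1) distance between two empirical laws on the
    real line with the same number of atoms: mean gap between order statistics."""
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def estimator_draws(sampler, n_obs, n_rep, k, rng):
    """Simulate n_rep independent values of the plug-in estimator,
    each computed from n_obs i.i.d. observations."""
    return [cvar_top_k([sampler(rng) for _ in range(n_obs)], k) for _ in range(n_rep)]

def clean(rng):
    # Hypothetical true law P: standard normal losses.
    return rng.gauss(0.0, 1.0)

def contaminated(rng):
    # Hypothetical perturbed law Q: about 2% of the data are gross errors.
    return rng.gauss(0.0, 1.0) if rng.random() > 0.02 else rng.gauss(0.0, 10.0)

rng = random.Random(0)
law_p = estimator_draws(clean, n_obs=200, n_rep=500, k=10, rng=rng)
law_q = estimator_draws(contaminated, n_obs=200, n_rep=500, k=10, rng=rng)
print("estimated discrepancy between estimator laws:", kantorovich_1d(law_p, law_q))
```

The printed number is only a Monte Carlo estimate of the discrepancy for one sample size; robustness in the sense discussed above concerns a bound that holds uniformly over the admissible laws.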
3 $\zeta$-metrics and admissible laws
There are two essential elements in investigating both the qualitative and quantitative statistical robustness of a risk functional: One is the specific choice of probability metrics but not just the topologies generated by them, see, e.g., [12, 2, 7], to quantify the change of the law and to estimate the discrepancy between the laws of two estimators, i.e., (2.5); the other is the determination of the subset of admissible laws in (see, e.g., [7, 9]), containing all empirical distributions: , to restrict the perturbation of the law . For instance, the subset may be specified via some generalized moment conditions, which are interesting in econometric or financial applications.
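To introduce these two essential elements thoroughly, some preliminary notions and results in probability theory and statistics such as the $\phi$-weak topology are required. We first give a sketch of them to prepare our discussions in the follow-up sections. Let $\phi:\mathbb{R}\to[0,\infty)$ be a continuous function and $\mathcal{M}_1^{\phi} := \{P'\in\mathscr{P}(\mathbb{R}): \int_{\mathbb{R}}\phi(t)P'(dt)<\infty\}$. In the particular case when $\phi(t):=|t|^p$ and $p$ is a positive number, write $\mathcal{M}_1^{p}$ for $\mathcal{M}_1^{|\cdot|^p}$. Note that $\mathcal{M}_1^{\phi}$ defines a subset of probability measures in $\mathscr{P}(\mathbb{R})$ which satisfies the generalized moment condition of $\phi$. From the definition, we can see that $\mathcal{M}_1^{p}\subset\mathcal{M}_1^{q}$ for any positive numbers $p,q$ with $q<p$ due to Hölder inequality.
Definition 3.1 ($\phi$-weak topology)
Let $\phi:\mathbb{R}\to[0,\infty)$ be a gauge function, that is, $\phi$ is continuous and $\phi\ge 1$ holds outside a compact set. Define $\mathcal{C}_1^{\phi}$ as the linear space of all continuous functions $h:\mathbb{R}\to\mathbb{R}$ for which there exists a positive constant $c$ such that $|h(t)|\le c(\phi(t)+1)$ for all $t\in\mathbb{R}$.
The $\phi$-weak topology, denoted by $\tau_\phi$, is the coarsest topology on $\mathcal{M}_1^{\phi}$ for which the mapping $g_h:\mathcal{M}_1^{\phi}\to\mathbb{R}$ defined by $g_h(P') := \int_{\mathbb{R}}h(t)P'(dt)$, $h\in\mathcal{C}_1^{\phi}$, is continuous. A sequence $\{P_l\}\subset\mathcal{M}_1^{\phi}$ is said to converge $\phi$-weakly to $P\in\mathcal{M}_1^{\phi}$, written $P_l\xrightarrow{\phi}P$, if it converges w.r.t. $\tau_\phi$.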
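Clearly, the $\phi$-weak topology is finer than the weak topology, and the two topologies coincide if and only if $\phi$ is bounded. It is well known (see [7, Lemma 3.4]) that $\phi$-weak convergence is equivalent to weak convergence, denoted by $P_l\xrightarrow{w}P$, together with $\int_{\mathbb{R}}\phi\,dP_l\to\int_{\mathbb{R}}\phi\,dP$. Moreover, it follows by [7, 1] that the $\phi$-weak topology on $\mathcal{M}_1^{\phi}$ is generated by the metric $\mathsf{dl}_{\phi}$ defined by
(3.1) $\mathsf{dl}_{\phi}(P,Q) := \mathsf{dl}_{\mathrm{Prok}}(P,Q) + \left|\int_{\mathbb{R}}\phi\,dP-\int_{\mathbb{R}}\phi\,dQ\right|$
for $P,Q\in\mathcal{M}_1^{\phi}$, where $\mathsf{dl}_{\mathrm{Prok}}$ is the Prokhorov metric defined by
(3.2) $\mathsf{dl}_{\mathrm{Prok}}(P,Q) := \inf\{\epsilon>0: P(A)\le Q(A^{\epsilon})+\epsilon,\ \forall A\in\mathcal{B}(\mathbb{R})\}$,
where $A^{\epsilon}:=A+B_{\epsilon}(0)$ denotes the Minkowski sum of $A$ and the open ball centred at $0$ on $\mathbb{R}$ and $\mathcal{B}(\mathbb{R})$ is the corresponding Borel sigma algebra on $\mathbb{R}$. We note that the Prokhorov metric metrizes the weak topology on $\mathscr{P}(\mathbb{R})$, see, e.g., [25].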
3.1 $\zeta$-metrics
Instead of exploiting the widely-used probability metrics such as the Prokhorov metric and the weighted Kolmogorov metric in the literature of qualitative robustness [2, 7], we will switch to the so-called metrics with $\zeta$-structure to establish the quantitative statistical robustness framework for a risk functional. In particular, we will use the well-known Kantorovich metric and Fortet-Mourier metrics. The new metrics enable us to establish an explicit relationship between the discrepancy of the laws of the plug-in estimators of law invariant risk measures based on the true data and perturbed data with noise and the discrepancy of the associated true probability measure. We begin with a formal definition of $\zeta$-metrics and then clarify the relationships between metrics of $\zeta$-structure and those used in [26, 2, 7].
Definition 3.2
Let $P,Q\in\mathscr{P}(\mathbb{R})$ and $\mathcal{G}$ be a class of measurable functions from $\mathbb{R}$ to $\mathbb{R}$. The metric with $\zeta$-structure is defined by
(3.3) $\mathsf{dl}_{\mathcal{G}}(P,Q) := \sup_{g\in\mathcal{G}}\left|\int_{\mathbb{R}}g(t)P(dt)-\int_{\mathbb{R}}g(t)Q(dt)\right|$.
From the definition, we can see that $\mathsf{dl}_{\mathcal{G}}(P,Q)$ is the maximum difference of the expected values of the class of measurable functions $\mathcal{G}$ with respect to $P$ and $Q$. $\zeta$-metrics are widely used in the stability analysis of stochastic programming, see Römisch [27] for an excellent overview. The specific metrics with $\zeta$-structure that we consider in this paper are the Kantorovich metric and the Fortet-Mourier metric. The next definition gives a precise description of the two notions.
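Definition 3.3 (Fortet-Mourier metric)
Let
(3.4) $\mathcal{F}_p(\mathbb{R}) := \{f:\mathbb{R}\to\mathbb{R} : |f(x)-f(y)|\le c_p(x,y)|x-y|,\ \forall x,y\in\mathbb{R}\}$,
where $c_p(x,y) := \max\{1,|x|,|y|\}^{p-1}$ for all $x,y\in\mathbb{R}$ and $p\ge 1$ describes the growth of the local Lipschitz constants. The $p$-th order Fortet-Mourier metric for $P,Q\in\mathscr{P}(\mathbb{R})$ is defined by
(3.5) $\mathsf{dl}_{FM,p}(P,Q) := \sup_{f\in\mathcal{F}_p(\mathbb{R})}\left|\int_{\mathbb{R}}f(x)P(dx)-\int_{\mathbb{R}}f(x)Q(dx)\right|$.
In the case when $p=1$, it is known as the Kantorovich metric: for $P,Q\in\mathscr{P}(\mathbb{R})$,
(3.6) $\mathsf{dl}_{K}(P,Q) := \sup_{f\in\mathcal{F}_1(\mathbb{R})}\left|\int_{\mathbb{R}}f(x)P(dx)-\int_{\mathbb{R}}f(x)Q(dx)\right|$.
From the definition, we can see that for any positive numbers $p\ge p'\ge 1$,
(3.7) $\mathsf{dl}_{K}(P,Q)\le\mathsf{dl}_{FM,p'}(P,Q)\le\mathsf{dl}_{FM,p}(P,Q)$,
which means that $\mathsf{dl}_{FM,p}$ becomes tighter as $p$ increases and they are all tighter than $\mathsf{dl}_{K}$. Moreover, the Fortet-Mourier metric metrizes weak convergence on sets of probability measures possessing uniformly a $p$-th moment [28, p. 350]. Notice that the function $x\mapsto\frac{1}{p}|x|^{p}$ for $p\ge 1$ belongs to $\mathcal{F}_p(\mathbb{R})$. On $\mathbb{R}$, the Fortet-Mourier metric may be equivalently written as
(3.8) $\mathsf{dl}_{FM,p}(P,Q) = \int_{-\infty}^{+\infty}\max\{1,|x|^{p-1}\}\,|F_P(x)-F_Q(x)|\,dx$, where $F_P$ and $F_Q$ denote the cumulative distribution functions of $P$ and $Q$,
see, e.g., [29, p. 93].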
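For two empirical distributions on the real line, the one-dimensional representation can be evaluated exactly. The Python sketch below assumes that the representation takes the usual cumulative-distribution form $\int\max\{1,|x|^{p-1}\}|F_P(x)-F_Q(x)|\,dx$; since $|F_P-F_Q|$ is piecewise constant between sample points, the integral reduces to a finite sum using an antiderivative of the weight. The function names are hypothetical.

```python
def weight_primitive(x, p):
    """Antiderivative of c(x) = max(1, |x|**(p-1)) that vanishes at 0."""
    a = abs(x)
    g = a if a <= 1.0 else 1.0 + (a ** p - 1.0) / p
    return g if x >= 0 else -g

def fortet_mourier_1d(sample_p, sample_q, p=1.0):
    """Integral of max(1, |x|^(p-1)) * |F_P(x) - F_Q(x)| dx for two empirical laws."""
    xs, ys = sorted(sample_p), sorted(sample_q)
    grid = sorted(set(xs) | set(ys))
    total, i, j = 0.0, 0, 0
    for left, right in zip(grid, grid[1:]):
        # Advance the counters so that i/len(xs) = F_P(left) and j/len(ys) = F_Q(left).
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        gap = abs(i / len(xs) - j / len(ys))   # |F_P - F_Q| on [left, right)
        total += gap * (weight_primitive(right, p) - weight_primitive(left, p))
    return total

print(fortet_mourier_1d([0.0], [2.0], p=1.0))   # p = 1 is the Kantorovich case
print(fortet_mourier_1d([0.0], [2.0], p=2.0))
```

For the two Dirac masses at 0 and 2 the sketch returns 2 for $p=1$ and 2.5 for $p=2$, which illustrates how a larger $p$ puts more weight on discrepancies away from the origin.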
In the next example, we illustrate the relationship between the existing probability metrics used in statistical robustness and the metrics with -structure.
Example 3.1
A number of well known probability metrics are used in the literature of statistical robustness.
(i) The Kantorovich (or Wasserstein) metric. Let be the set of all Lipschitz continuous functions with modulus being bounded by . Then
(3.9) |
Moreover, , see [25, Theorem 2].
(ii) The Lévy distance [29]. Let be the set of functions bounded by 1. Then
Moreover, and for any , see, e.g., [25].
(iii) The weighted Kolmogorov metric [7]. Let the weight function be u-shaped, i.e., a continuous function that is non-increasing on $(-\infty,0]$ and non-decreasing on $[0,\infty)$. Then the weighted Kolmogorov metric is defined as
where is the set of all functions bounded by . Precisely, if is the set of all indicator functions , where , then , which is known as the Kolmogorov metric. Similarly, by letting be the set of all weighted indicator functions with weighting , one can obtain .
3.2 Admissible laws
We now turn to discuss another important component in statistical robust analysis, that is, the subset of admissible laws in $\mathscr{P}(\mathbb{R})$ which describes the scope of the perturbation of the law $P$ by a metric. This can be motivated by ensuring the finiteness of $\mathsf{dl}(P,Q)$. To this effect, we formally introduce the concept of admissible laws induced by probability metrics.
Definition 3.4 (Admissible laws induced by probability metrics)
Let $\mathsf{dl}$ be a probability metric on $\mathscr{P}(\mathbb{R})$. The admissible laws induced by $\mathsf{dl}$ are defined as
(3.10) $\mathcal{M}^{\mathsf{dl}} := \{P\in\mathscr{P}(\mathbb{R}): \mathsf{dl}(P,\delta_0)<\infty\}$,
where $\delta_0$ denotes the Dirac measure at $0$.
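Let $\mathcal{M}^{\mathsf{dl}_{FM,p}}$ denote the admissible laws induced by the Fortet-Mourier metrics with parameter $p$ on $\mathbb{R}$. By Definition 3.4, we have
(3.11) $\mathcal{M}^{\mathsf{dl}_{FM,p}} = \{P\in\mathscr{P}(\mathbb{R}): \mathsf{dl}_{FM,p}(P,\delta_0)<\infty\}$.
By the triangle inequality, this ensures $\mathsf{dl}_{FM,p}(P,Q)<\infty$ for any $P,Q\in\mathcal{M}^{\mathsf{dl}_{FM,p}}$.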
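Admissibility under the Fortet-Mourier metric can be explored numerically. The Python sketch below assumes the usual growth function $c_p(x,y)=\max\{1,|x|,|y|\}^{p-1}$, under which every function $f$ in the Fortet-Mourier class satisfies $|f(x)-f(0)|\le\int_0^{|x|}\max\{1,t^{p-1}\}\,dt$ with equality for a suitable $f$, so that the distance to the Dirac measure at 0 is the expectation of this bound. The heavy-tailed law is replaced by a deterministic grid of Pareto quantiles with tail index 1.5 (finite mean, infinite second moment); all names are hypothetical.

```python
def radial_bound(a, p):
    """Integral of max(1, t**(p-1)) over [0, a], for a >= 0."""
    return a if a <= 1.0 else 1.0 + (a ** p - 1.0) / p

def fm_distance_to_dirac0(sample, p):
    """Order-p Fortet-Mourier distance between an empirical law and the Dirac measure at 0."""
    return sum(radial_bound(abs(x), p) for x in sample) / len(sample)

def pareto_quantile_sample(n, tail_index=1.5):
    """Deterministic stand-in for a heavy-tailed law: Pareto quantiles at levels (i - 0.5)/n."""
    return [(1.0 - (i - 0.5) / n) ** (-1.0 / tail_index) for i in range(1, n + 1)]

for n in (1_000, 100_000):
    xs = pareto_quantile_sample(n)
    print(n, fm_distance_to_dirac0(xs, 1.0), fm_distance_to_dirac0(xs, 2.0))
```

As the grid is refined, the distance for $p=1$ stays below 3 (the mean of the limiting law), whereas the distance for $p=2$ keeps growing, in line with the limiting law being admissible for $p=1$ but not for $p=2$.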
In the following example, we compare the admissible laws induced by different probability metrics.
Example 3.2 (Admissible laws induced by probability metrics)
We reconsider the admissible laws induced by probability metrics defined in Example 3.1.
(i) The admissible laws induced by the Kantorovich (or Wasserstein) metric are defined as
where the second equality follows from the definition of the Kantorovich metric (see, (3.9)). To see how the third equality holds, we note that for any , we have
Since , then let , then we have . Similarly, we have . By using integration-by-parts formula (more precisely [30, Theorem 1.15]), we obtain the right hand side of the third equality. The last equality follows from the definition of -topology in which case .
(ii) The admissible laws induced by the Lévy distance are defined as
Since , then the admissible laws coincide with .
(iii) The admissible laws induced by the weighted Kolmogorov metric are defined as
which coincides with the set defined in Krätschmer et al. [7, subsection 3.2].
If is bounded on , then it is straightforward that . In the case when is unbounded on , then
(3.12) |
In what follows, we give a proof for (3.12). Let , since is a -shaped function, then for any and , we have
and consequently . Thus, .
On the other hand, for any , if we let for , then is a gauge function. Moreover, for any , there exists a such that . To ease the exposition, we can assume that the law for any . Then
Thus
which implies . Summarizing the discussions above, we obtain (3.12).
We note that if is unbounded, then the inclusions in (3.12) are strict because we can find a counterexample showing equality may fail, see Example B.1 in the appendix.
(iv) The admissible laws induced by the Prokhorov metric are defined as
Since , then the admissible laws coincide with .
(v) The admissible laws induced by the Dudley’s (or Bounded) metric are defined as
Since , then the admissible laws coincide with .
3.3 Relationship with $\phi$-weak topology
Since -weak topology has been widely used for qualitative robust analysis in the literature whereas we use the topology induced by the Fortet-Mourier metrics for quantitative robust analysis, it would therefore be helpful to look into potential connections of the two apparently completely different metrics. In the next proposition, we look into such connection from admissible set perspective (which defines the space of probability measures that is perturbed in both qualitative and quantitative robust analysis), we find that coincides with for some specific choice of and subsequently show that the Fortet-Mourier metric is tighter than .
Proposition 3.1
Let be fixed and
The following assertions hold.
-
(i)
.
-
(ii)
.
-
(iii)
metrizes the -weak topology on .
Part (i) of the proposition says that the admissible set coincides with the set of laws on satisfying the generalized moment condition of . Part (ii) indicates that is tighter than . Part (iii) means that the -weak topology on is generated by the metric .
Proof. Part (i). Since for any , , then by the definition of , we have that implies and subsequently, .
On the other hand, let , then . For any , we have
and consequently,
Therefore, we have
and consequently, .
Part (ii). Since , then for any ,
From Example 3.1(i) and (3.7), we have . Finally, by the definition of , i.e., (3.1), we obtain the conclusion.
Part (iii) follows straightforwardly from Part (ii).
Proposition 3.1 indicates that although the Fortet-Mourier metric and the metric defined in (3.1) are different metrics, they generate the same topology, which confirms the statement at the beginning of this section, i.e., for the qualitative robustness and the quantitative robustness, the specific choice of probability metrics matters but not the topologies generated by them. To conclude this section, we remark that the subset to be used in the definition of qualitative robust analysis will be confined to the set of admissible laws when we adopt the Fortet-Mourier metric for quantitative robust analysis in the next section.
4 Statistical robustness
We are now ready to return our discussions to the robustness of statistical estimators of law invariant risk measures that are outlined in Section 2.
4.1 Qualitative statistical robustness
To position our research properly, we begin with a brief overview of the existing results about the qualitative statistical robustness.
Definition 4.1 (Qualitative -Robustness [2, 1])
Let be a subset of and . The sequence of estimators is said to be qualitatively -robust at w.r.t. if for every there exist and such that for all and
(4.1) |
If, in addition, arises as in (2.2) from a risk functional , then is called qualitatively -robust at w.r.t. .
The definition above captures two versions of qualitative statistical robustness proposed by Cont et al. [2] for i.i.d. observations on with and being Lévy distance and Krätschmer et al. [7] for i.i.d. observations on with and respectively. Since is tighter than , it means that Krätschmer et al. [7] examine the discrepancy of the laws with a tighter metric. On the other hand, from the definition of , we can see that it is also tighter than and allows one to capture the difference of distributions at the tail. This means that the robust analysis in Krätschmer et al. [7] is restricted to a smaller class of probability distributions when is perturbed from . This explains why CVaR is robust under the criterion of the latter but not the former.
A key result that Krätschmer et al. [7] establish is Hampel’s theorem, which states the equivalence between qualitative statistical robustness and stability/continuity of a risk functional (with respect to perturbation of the probability distribution) under the uniform Glivenko-Cantelli (UGC) property of empirical distributions over a specified set.
Definition 4.2 (-Continuity [7])
Let and be a subset of . Then is called -continuous at w.r.t. if for every , there exists such that for all
Definition 4.3 (UGC Property [7])
Let be a subset of . Then we say that the metric space has the UGC property if for every and , there exists such that for all and
The UGC property means that the empirical probability measure converges in probability to the true marginal distribution uniformly in on . Examples of metric spaces having the UGC property can be found in [7, Section 3]. In particular, it is shown that there exists a subset of the admissible laws induced by the weighted Kolmogorov metric that enjoys the UGC property, see [7, Theorem 3.1].
Theorem 4.1 (Hampel’s Theorem [7])
Let be a subset of and . Assume that has the UGC property and . Then if the mapping is -continuous at w.r.t. , the sequence is qualitatively -robust at w.r.t. .
4.2 Quantitative statistical robustness
We now move on to discuss our central topic, quantitative statistical robustness for the plug-in estimators of law invariant risk measures. Intuitively speaking, quantitative statistical robustness of a risk functional means that for any two admissible laws and on , the distance between the laws of their plug-in estimators and is bounded by the distance between and when the sample size is sufficiently large.
Definition 4.4 (Quantitative statistical robustness)
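Let $\mathsf{dl}$, $\mathsf{dl}'$ be probability metrics on $\mathscr{P}(\mathbb{R})$ and $\mathcal{M}$ denote a subset of admissible laws on $\mathbb{R}$. A sequence of statistical estimators $\{\hat{\varrho}_N\}$ is said to be quantitatively statistically robust on $\mathcal{M}$ w.r.t. $(\mathsf{dl},\mathsf{dl}')$ if there exists a non-decreasing real-valued continuous function $h$ with $h(0)=0$ such that for all $P,Q\in\mathcal{M}$ and $N\in\mathbb{N}$
(4.2) $\mathsf{dl}\big(P^{\otimes N}\circ\hat{\varrho}_N^{-1},\; Q^{\otimes N}\circ\hat{\varrho}_N^{-1}\big)\le h\big(\mathsf{dl}'(P,Q)\big)$.
If, in addition, $\hat{\varrho}_N$ arises as in (2.2) from a risk functional $\varrho$, then $\varrho$ is called quantitatively statistically robust on $\mathcal{M}$ w.r.t. $(\mathsf{dl},\mathsf{dl}')$. In a particular case when $\mathsf{dl}=\mathsf{dl}_{K}$, $\mathsf{dl}'=\mathsf{dl}_{FM,p}$ and $h(t)=Lt$ for some positive constant $L$, inequality (4.2) reduces to
(4.3) $\mathsf{dl}_{K}\big(P^{\otimes N}\circ\hat{\varrho}_N^{-1},\; Q^{\otimes N}\circ\hat{\varrho}_N^{-1}\big)\le L\,\mathsf{dl}_{FM,p}(P,Q)$.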
In comparison with the qualitative statistical robustness introduced by Krätschmer et al. [1] or Cont et al. [2], the definition (4.3) here has several advantages. First, we use the Kantorovich metric instead of the Prokhorov metric to quantify the discrepancy between the laws of the two plug-in estimators. This enables us to capture the tail behaviour of the two laws and helps us derive an explicit bound for the difference. Second, we use the Fortet-Mourier metric to quantify the perturbation of $P$, which is more sensitive to the variation of the tails than the Lévy metric used in [2] and the weighted Kolmogorov metric in [1, 8, 9, 13]. Third, inequality (4.3) gives an error bound for the discrepancy of the two laws and the bound is valid for all $Q$ in $\mathcal{M}$ instead of only those in a neighborhood of $P$.
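A bound of this type can be checked by simulation in a simple case. The Python sketch below assumes that (4.3) reads: the Kantorovich distance between the laws of the two plug-in estimators is at most $L$ times the Fortet-Mourier distance between $P$ and $Q$. It takes $p=1$, so that the Fortet-Mourier metric is the Kantorovich metric, and the mean functional, which is Lipschitz with constant $L=1$ in that metric. The choice $P=N(0,1)$, $Q=N(0.2,1)$ is hypothetical; for this pure shift the Kantorovich distance between $P$ and $Q$ equals 0.2.

```python
import random

def kantorovich_1d(a, b):
    """Kantorovich distance between two empirical laws with equally many atoms."""
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def sample_means(mu, n_obs, n_rep, rng):
    """n_rep independent draws of the plug-in mean estimator based on n_obs N(mu, 1) observations."""
    return [sum(rng.gauss(mu, 1.0) for _ in range(n_obs)) / n_obs for _ in range(n_rep)]

rng = random.Random(1)
shift = 0.2   # P = N(0, 1), Q = N(0.2, 1), hence dl_K(P, Q) = 0.2
lhs = kantorovich_1d(sample_means(0.0, 50, 2000, rng), sample_means(shift, 50, 2000, rng))
print("estimated left-hand side:", lhs, "| right-hand side with L = 1:", shift)
```

In this shift example the bound is attained exactly, so the Monte Carlo estimate of the left-hand side fluctuates around 0.2 and may exceed it slightly because of simulation error.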
Next, we introduce a definition on the Lipschitz continuity of a general statistical mapping from to , which strengthens the earlier definition of -continuity for a general statistical functional.
Definition 4.5 (Lipschitz continuity)
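Let $\varrho:\mathcal{M}\to\mathbb{R}$ be a general statistical functional and $\mathcal{M}$ be a subset of $\mathscr{P}(\mathbb{R})$. $\varrho$ is said to be Lipschitz continuous on $\mathcal{M}$ w.r.t. $\mathsf{dl}$ if there exists a positive constant $L$ such that
(4.4) $|\varrho(P)-\varrho(Q)|\le L\,\mathsf{dl}(P,Q),\quad\forall P,Q\in\mathcal{M}$.
A few points about the above definition of Lipschitz continuity are worth noting:
- 1. The Lipschitz continuity is global instead of local over $\mathcal{M}$. The condition is strong, but we will find that many risk functionals are indeed globally Lipschitz continuous on some $\mathcal{M}$.
- 2. The magnitude of the continuity depends on the metric $\mathsf{dl}$ which measures the distance between $P$ and $Q$. In a specific case when $\mathsf{dl}=\mathsf{dl}_{FM,p}$, (4.4) reduces to
(4.5) $|\varrho(P)-\varrho(Q)|\le L\,\mathsf{dl}_{FM,p}(P,Q),\quad\forall P,Q\in\mathcal{M}$,
where $p\ge 1$. The exponent $p$ plays an important role in (4.5) because it interacts with the tails of $P$ and $Q$. Moreover, if $P$ and $Q$ are admissible laws induced by $\mathsf{dl}_{FM,p}$, then the right-hand side of (4.5) is finite. We will come back to this later.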
- 3.
- 4.
Example 4.1 ($p$-th moment functional)
Lemma 4.1
Let and be a set of functions from to , i.e.,
(4.9) |
where for all and . Then
(4.10) |
where is defined by (3.3).
Before presenting a proof, it might be helpful for us to explain why we consider a specific set of functions . For fixed , let denote the set of all empirical laws over , then . Then may be regarded as a set of functions derived from a class of Lipschitz continuous functional on with and (by writing as a function of samples). Lemma 4.1 says that for any , the discrepancy between and under the metric can be bounded by .
Proof. Let , and . For any and any , denote
and
Then
Let denote the set of functions generated by . By the definition of and the $p$-th order Fortet-Mourier metric,
(4.11) | |||||
where the inequality is due to and the definition of . Finally, by the triangle inequality of the pseudo-metric, we have
The proof is complete.
With the intermediate technical result, we are now ready to present our main result of quantitative statistical robustness for the plug-in estimator of a general risk functional.
Theorem 4.2
Let be a general statistical functional and be a subset of with . Assume, for fixed , there exists a positive constant such that
(4.12) |
where and are given by (2.3) and (2.4) respectively. Then is quantitatively robust on w.r.t. , i.e.,
(4.13) |
If (4.12) holds for all , then the whole sequence of the plug-in estimators is quantitatively robust on , i.e., (4.13) holds for all . Moreover, in the case when , (4.13) reduces to
(4.14) |
Proof. Since the underlying probability space is atomless, then for any , by definition
(4.15) | |||||
where we write for and for to indicate its dependence on .
For any , (4.12) ensures that
which means that is locally Lipschitz continuous in , i.e., from (3.4). Since (see Example 3.2(i) and Proposition 3.1(i)), then (4.15) is finite. The rest follows from Lemma 4.1 by setting .
From Example 3.1, we have for all , then we have the following corollary.
Corollary 4.1
Let be a general statistical functional. Assume that is Lipschitz continuous w.r.t. () on for the constant . Then the plug-in estimator sequence is quantitatively robust on w.r.t. , i.e.,
for all .
Next, we take a step further to consider the index of quantitative robustness for a general statistical functional.
Definition 4.6 (Index of quantitative robustness)
Let be a general statistical functional. If is Lipschitz continuous w.r.t. on for the constant for some , then we can define an index of quantitative robustness of a statistical functional as
(4.16)
This index is a quantitative measurement of the degree of robustness of a statistical functional. A larger index reflects a higher degree of robustness. For a general statistical functional , (4.5) may hold for uncountably many ; see, e.g., the -th moment functional satisfying (4.5) for any on . From Definition 4.6, we conclude that the -th moment functional has the index . Definition 4.6 coincides with the index of qualitative robustness proposed by Krätschmer et al. [7] when is Lipschitz continuous w.r.t. on . The main advantage of Definition 4.6 is that the index is easy to calculate, as we illustrate in the next section.
5 Application to risk measures
As discussed in Proposition 2.1, a law invariant risk measure of a random variable can be represented as the composition of a risk functional and the law of the random variable. In practice, the risk of a random variable is often calculated with empirical data, because either the true probability distribution is unknown or it is prohibitively expensive to calculate the risk with the true probability distribution. This raises the question of whether the estimated risk measure based on empirical data is reliable. In this section, we apply the quantitative robustness results established in Theorem 4.2 to some well-known risk measures. The next proposition synthesizes Proposition 2.1 and Theorem 4.2.
Proposition 5.1
In what follows, we verify condition (5.1) for some well-known risk measures and hence show that they satisfy the proposed quantitative statistical robustness (5.2). To ease the notation, we introduce the law invariant risk measure on the space of probability distributions.
Example 5.1
The expectation of given by satisfies
Let , where is the empirical distribution of . Then for any and any ,
(5.3)
and the index of quantitative robustness .
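As a rough numerical companion to Example 5.1, the following Python sketch computes the plug-in estimate of the expectation on a sample and on a perturbed copy, and compares the gap with the Kantorovich distance between the two empirical laws (the Fortet-Mourier metric of order 1). The Gaussian data, the outlier perturbation and the helper names are illustrative assumptions and are not taken from the paper.

```python
import random


def plug_in_mean(sample):
    """Plug-in estimate of the expectation: the mean of the empirical law."""
    return sum(sample) / len(sample)


def kantorovich_equal_size(xs, ys):
    """Kantorovich (Wasserstein-1) distance between two empirical laws on the
    real line with the same number of atoms: pair the sorted samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(500)]
# Hypothetical perturbation: small additive noise plus a few gross outliers.
noisy = [x + random.gauss(0.0, 0.05) for x in clean]
for i in range(5):
    noisy[i] += 20.0

gap = abs(plug_in_mean(clean) - plug_in_mean(noisy))
dist = kantorovich_equal_size(clean, noisy)
print(f"|mean gap| = {gap:.4f}, Kantorovich distance = {dist:.4f}")
```

The gap between the two sample means never exceeds the Kantorovich distance, which is the deterministic counterpart of the Lipschitz property used for the expectation; the statement of the paper concerns the laws of the estimators rather than a single pair of samples.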
Example 5.2
Consider the conditional value-at-risk of a probability distribution at level , which is defined by
Then
where the last inequality is due to the fact that holds for all .
Let , where is the empirical distribution of . Then for any and any ,
(5.4)
and the index of quantitative robustness for .
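The plug-in CVaR of Example 5.2 can be sketched in a few lines of Python. The sketch assumes the Rockafellar-Uryasev form, that is, the minimum over t of t + E[(X - t)_+]/(1 - alpha) with alpha in (0, 1); the level convention of the paper may differ, and the sample data and function names are hypothetical.

```python
import random


def empirical_cvar(sample, alpha):
    """CVaR of the empirical law, assuming the form
    min_t { t + E[(X - t)_+] / (1 - alpha) } with 0 < alpha < 1.
    The objective is convex and piecewise linear in t with kinks at the
    sample points, so it suffices to evaluate it at the sample points."""
    n = len(sample)

    def objective(t):
        return t + sum(max(x - t, 0.0) for x in sample) / ((1.0 - alpha) * n)

    return min(objective(t) for t in sample)


def kantorovich_equal_size(xs, ys):
    """Kantorovich distance between equal-size empirical laws on the real line."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


random.seed(1)
alpha = 0.9
clean = [random.gauss(0.0, 1.0) for _ in range(300)]
# Hypothetical perturbation of the data by small additive noise.
noisy = [x + random.uniform(-0.1, 0.1) for x in clean]

gap = abs(empirical_cvar(clean, alpha) - empirical_cvar(noisy, alpha))
dist = kantorovich_equal_size(clean, noisy)
print(f"|CVaR gap| = {gap:.4f}, bound dist/(1-alpha) = {dist / (1.0 - alpha):.4f}")
```

Under this convention the gap is bounded by the Kantorovich distance divided by 1 - alpha, because the map x -> (x - t)_+ is 1-Lipschitz for every t.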
Example 5.3
The upper semi-deviation of a measure , which is defined by
satisfies
Let , where is the empirical distribution of . Then for any and any ,
(5.5)
and the index of quantitative robustness .
Example 5.4
The Optimized Certainty Equivalent (OCE) [31] of is given by
where is a proper concave and non-decreasing utility function satisfying the normalization property: and , where denotes the subdifferential map of . Essentially by [31, Proposition 2.1], we have
where and . Let . Then is a convex risk measure [31] and
where denotes the left derivative of at and the last inequality is due to the fact that is non-decreasing and concave, so that is non-increasing.
Let , where is the empirical distribution of . We consider two interesting cases.
One is that , in which case
(5.6)
for any and any and the index of quantitative robustness for this case is .
The other is that there exists some positive number and positive constant such that , where . In that case, we have
(5.7)
and the index of quantitative robustness for this case is .
To see how (5.6) and (5.7) could possibly be satisfied, we consider two specific utility functions: piecewise linear utility function and quadratic utility function, both of which are extracted from [31].
(a) Piecewise linear utility function with , where and . A simple calculation yields
Thus for any and any ,
(5.8)
and the index of quantitative robustness .
(b) Quadratic utility with . It is easy to observe that the function is locally Lipschitz continuous over with modulus being bounded by . Thus
Moreover, if , then . Subsequently,
where . Thus for any and any ,
(5.9)
provided that and the index of quantitative robustness .
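For the piecewise linear case, a plug-in computation can be sketched as follows. The sketch assumes the Ben-Tal and Teboulle form of the optimized certainty equivalent, sup over eta of eta + E[u(X - eta)], with u(t) = gamma1 * max(t, 0) - gamma2 * max(-t, 0) and 0 <= gamma1 < 1 < gamma2; the sign convention that turns this quantity into a risk measure in the paper is not reproduced here, and the data and names are hypothetical.

```python
import random


def piecewise_linear_utility(t, gamma1, gamma2):
    """u(t) = gamma1 * max(t, 0) - gamma2 * max(-t, 0), 0 <= gamma1 < 1 < gamma2."""
    return gamma1 * max(t, 0.0) - gamma2 * max(-t, 0.0)


def empirical_oce(sample, gamma1, gamma2):
    """sup_eta { eta + E[u(X - eta)] } under the empirical law.
    The objective is concave and piecewise linear in eta with kinks at the
    sample points, so the supremum is attained at a sample point."""
    n = len(sample)

    def objective(eta):
        total = sum(piecewise_linear_utility(x - eta, gamma1, gamma2) for x in sample)
        return eta + total / n

    return max(objective(eta) for eta in sample)


def kantorovich_equal_size(xs, ys):
    """Kantorovich distance between equal-size empirical laws on the real line."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


random.seed(2)
gamma1, gamma2 = 0.5, 2.0  # illustrative slopes, not values from the paper
clean = [random.gauss(0.0, 1.0) for _ in range(300)]
noisy = [x + random.uniform(-0.1, 0.1) for x in clean]

gap = abs(empirical_oce(clean, gamma1, gamma2) - empirical_oce(noisy, gamma1, gamma2))
dist = kantorovich_equal_size(clean, noisy)
print(f"|OCE gap| = {gap:.4f}, bound gamma2 * dist = {gamma2 * dist:.4f}")
```

Since this utility is Lipschitz with modulus gamma2, the gap is at most gamma2 times the Kantorovich distance, which matches the index 1 obtained when the Lipschitz modulus of the utility is bounded.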
Example 5.5
Suppose that is an increasing convex loss function which is not identically constant. Let be an interior point in the range of . The Shortfall Risk Measure [18] of is defined by
(5.10)
Following a similar analysis to Guo and Xu [32], we can recast the formulation above as
(5.11)
Swapping the inf and sup operations, we obtain the Lagrange dual of the problem. Moreover, if we assume that the inequality constraint in (5.10) satisfies the well-known Slater condition, i.e., there exists such that , then the Lagrange multipliers of (5.10) are bounded and strong duality holds. Consequently, we can rewrite (5.11) as
(5.12)
where are some positive numbers. Essentially by [31, Proposition 2.1], we have
where and . Subsequently,
where denotes the right derivative of at and the last three inequalities are due to the fact that is non-decreasing and convex, so that is non-decreasing.
Let , where is the empirical distribution of . If , then for any and any ,
(5.13)
If there exists some positive number and positive constant such that , where , then
(5.14)
In what follows, we illustrate the above two inequalities with two specific loss functions: deposit insurance loss function [33] and -th power loss function [18].
(a) Deposit insurance loss function, , where . Then . Thus, for any and any ,
(5.15)
and the index of quantitative robustness is 1.
(b) For , we consider the -th power loss function,
where . We have for and for . Hence, if , then and subsequently . Thus for any and any ,
(5.16)
provided that and the index of quantitative robustness is .
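A plug-in shortfall risk can be computed by bisection, since the expected loss is non-increasing in the capital level. The sketch below assumes the Föllmer-Schied form inf { t : E[l(X - t)] <= lambda } and the power loss l(z) = max(z, 0)^p / p; the normalization of the loss, the threshold and all names are illustrative assumptions rather than the paper's exact setting.

```python
def power_loss(z, p):
    """Assumed p-th power loss: max(z, 0)**p / p, with p >= 1."""
    return max(z, 0.0) ** p / p


def empirical_shortfall_risk(sample, loss, lam, tol=1e-10):
    """Smallest t (up to tol) with mean(loss(x - t)) <= lam, found by bisection.
    Assumes loss is non-decreasing, vanishes on (-inf, 0], is unbounded above,
    and that lam > 0."""
    n = len(sample)

    def expected_loss(t):
        return sum(loss(x - t) for x in sample) / n

    hi = max(sample)  # the loss vanishes here, so the constraint holds
    step = 1.0
    lo = hi - step
    while expected_loss(lo) <= lam:  # widen the bracket until it is violated
        step *= 2.0
        lo = hi - step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_loss(mid) <= lam:
            hi = mid
        else:
            lo = mid
    return hi


sample = [0.0, 1.0, 2.0, 3.0]
sr_linear = empirical_shortfall_risk(sample, lambda z: power_loss(z, 1), lam=0.5)
sr_shifted = empirical_shortfall_risk(
    [x + 2.0 for x in sample], lambda z: power_loss(z, 1), lam=0.5
)
sr_quadratic = empirical_shortfall_risk(sample, lambda z: power_loss(z, 2), lam=0.5)
print(sr_linear, sr_shifted, sr_quadratic)
```

Shifting every observation by a constant shifts the estimate by the same constant, which gives a quick sanity check on the bisection.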
In all of the above examples, the risk measures can either be represented explicitly in the form of (such as the expectation) or be obtained by solving an optimization problem in which the underlying functions are represented in expected utility form (CVaR, the optimized certainty equivalent and the shortfall risk measure). This is possible because the utility (disutility) functions are assumed to be concave (convex) and hence locally Lipschitz continuous. When the growth of the Lipschitz modulus is controlled by , these risk measures satisfy inequality (4.5), as we have shown. This may not work for spectral risk measures [34] with an unbounded risk spectrum because the latter distort the probability distribution . However, when the risk spectrum is bounded (as for CVaR, which is a special case of a spectral risk measure), inequality (4.5) can still be established. This explains why we have not included spectral risk measures in the examples.
References
- [1] V. Krätschmer, A. Schied, and H. Zähle, “Comparative and qualitative robustness for law-invariant risk measures,” Finance and Stochastics, vol. 18, no. 2, pp. 271–295, 2014.
- [2] R. Cont, R. Deguest, and G. Scandolo, “Robustness and sensitivity analysis of risk measurement procedures,” Quantitative Finance, vol. 10, no. 6, pp. 593–606, 2010.
- [3] H. Zähle, “Rates of almost sure convergence of plug-in estimates for distortion risk measures,” Metrika, vol. 74, no. 2, pp. 267–285, 2011.
- [4] D. Belomestny and V. Krätschmer, “Central limit theorems for law-invariant coherent risk measures,” Journal of Applied Probability, vol. 49, no. 1, pp. 1–21, 2012.
- [5] E. Beutner and H. Zähle, “A modified functional delta method and its application to the estimation of risk functionals,” Journal of Multivariate Analysis, vol. 101, no. 10, pp. 2452–2463, 2010.
- [6] F. R. Hampel, “A general qualitative definition of robustness,” The Annals of Mathematical Statistics, pp. 1887–1896, 1971.
- [7] V. Krätschmer, A. Schied, and H. Zähle, “Qualitative and infinitesimal robustness of tail-dependent statistical functionals,” Journal of Multivariate Analysis, vol. 103, no. 1, pp. 35–47, 2012.
- [8] H. Zähle, “Qualitative robustness of von Mises statistics based on strongly mixing data,” Statistical Papers, vol. 55, no. 1, pp. 157–167, 2014.
- [9] H. Zähle et al., “Qualitative robustness of statistical functionals under strong mixing,” Bernoulli, vol. 21, no. 3, pp. 1412–1434, 2015.
- [10] G. Boente, R. Fraiman, V. J. Yohai, et al., “Qualitative robustness for stochastic processes,” The Annals of Statistics, vol. 15, no. 3, pp. 1293–1312, 1987.
- [11] K. Strohriegl and R. Hable, “Qualitative robustness of estimators on stochastic processes,” Metrika, vol. 79, no. 8, pp. 895–917, 2016.
- [12] P. J. Huber and E. M. Ronchetti, Robust statistics. Springer, 2011.
- [13] H. Zähle, “A definition of qualitative robustness for general point estimators, and examples,” Journal of Multivariate Analysis, vol. 143, pp. 12–31, 2016.
- [14] F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, Robust statistics: the approach based on influence functions, vol. 196. John Wiley & Sons, 2011.
- [15] R. A. Maronna, R. D. Martin, V. J. Yohai, and M. Salibián-Barrera, Robust statistics: theory and methods (with R). John Wiley & Sons, 2019.
- [16] S. Guo and H. Xu, “Statistical robustness in utility preference robust optimization models,” Submitted to Mathematical Programming, 2020.
- [17] D. Filipović and G. Svindland, “The canonical model space for law-invariant convex risk measures is $L^1$,” Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, vol. 22, no. 3, pp. 585–589, 2012.
- [18] H. Föllmer and A. Schied, “Convex measures of risk and trading constraints,” Finance and Stochastics, vol. 6, no. 4, pp. 429–447, 2002.
- [19] P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath, “Coherent measures of risk,” Mathematical Finance, vol. 9, no. 3, pp. 203–228, 1999.
- [20] H. Föllmer and S. Weber, “The axiomatic approach to risk measures for capital determination,” Annual Review of Financial Economics, vol. 7, pp. 301–337, 2015.
- [21] E. Delage, D. Kuhn, and W. Wiesemann, ““dice”-sion–making under uncertainty: When can a random decision reduce risk?,” Management Science, vol. 65, no. 7, pp. 3282–3301, 2019.
- [22] M. Frittelli, M. Maggis, and I. Peri, “Risk measures on $\mathcal{P}(\mathbb{R})$ and value at risk with probability/loss function,” Mathematical Finance, vol. 24, no. 3, pp. 442–463, 2014.
- [23] D. Dentcheva and A. Ruszczyński, “Risk preferences on the space of quantile functions,” Mathematical Programming, vol. 148, no. 1-2, pp. 181–200, 2014.
- [24] W. B. Haskell, W. Huang, and H. Xu, “Preference elicitation and robust optimization with multi-attribute quasi-concave choice functions,” arXiv preprint arXiv:1805.06632, 2018.
- [25] A. L. Gibbs and F. E. Su, “On choosing and bounding probability metrics,” International Statistical Review, vol. 70, no. 3, pp. 419–435, 2002.
- [26] I. Mizera et al., “Qualitative robustness and weak continuity: the extreme unction,” Nonparametrics and robustness in modern statistical inference and time series analysis: a Festschrift in honor of Professor Jana Jurečková, vol. 1, p. 169, 2010.
- [27] W. Römisch, “Stability of stochastic programming problems,” Handbooks in Operations Research and Management Science, vol. 10, pp. 483–554, 2003.
- [28] G. C. Pflug and A. Pichler, “Approximations for probability distributions and stochastic optimization problems,” in Stochastic optimization methods in finance and energy, pp. 343–387, Springer, 2011.
- [29] S. T. Rachev, Probability metrics and the stability of stochastic models, vol. 269. John Wiley & Son Ltd, 1991.
- [30] P. Mattila, Geometry of sets and measures in Euclidean spaces: fractals and rectifiability. No. 44, Cambridge university press, 1999.
- [31] A. Ben-Tal and M. Teboulle, “An old-new concept of convex risk measures: The optimized certainty equivalent,” Mathematical Finance, vol. 17, no. 3, pp. 449–476, 2007.
- [32] S. Guo, H. Xu, and L. Zhang, “Convergence analysis for mathematical programs with distributionally robust chance constraint,” SIAM Journal on Optimization, vol. 27, no. 2, pp. 784–816, 2017.
- [33] C. Chen, G. Iyengar, and C. C. Moallemi, “An axiomatic approach to systemic risk,” Management Science, vol. 59, no. 6, pp. 1373–1388, 2013.
- [34] C. Acerbi, “Spectral measures of risk: A coherent representation of subjective risk aversion,” Journal of Banking & Finance, vol. 26, no. 7, pp. 1505–1518, 2002.
- [35] S. Kusuoka, “On law invariant coherent risk measures,” in Advances in mathematical economics, pp. 83–95, Springer, 2001.
Appendix A
Lemma A.1
Let and be two non-decreasing sequences. Then for any permutation of , we have
Proof. The result is perhaps well known. We include a proof as we cannot find a reference. We do so by induction.
For , the statement is trivial and for , for any and . Assume that the conclusion holds for . Then for , for any non-decreasing sequences and and any permutation of , there exists a such that . If , then from the induction hypothesis for , we have
If , then we have
where the first inequality follows from the induction hypothesis for applied to the non-decreasing sequences and and the second inequality is due to the induction hypothesis for applied to the non-decreasing sequences and .
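The displayed inequality of Lemma A.1 is not reproduced here, so the following Python sketch checks only one plausible reading by brute force: that for non-decreasing sequences a and b, the sum of |a_i - b_i| does not exceed the sum of |a_i - b_sigma(i)| for any permutation sigma. This reading, the random test data and the function name are assumptions made for illustration.

```python
import itertools
import random


def sorted_pairing_is_minimal(a, b, tol=1e-12):
    """Check, over all permutations of b, that pairing the sorted sequences
    term by term gives the smallest sum of absolute differences."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    base = sum(abs(x - y) for x, y in zip(a_sorted, b_sorted))
    return all(
        base <= sum(abs(x - y) for x, y in zip(a_sorted, perm)) + tol
        for perm in itertools.permutations(b_sorted)
    )


random.seed(3)
all_ok = all(
    sorted_pairing_is_minimal(
        [random.uniform(-1.0, 1.0) for _ in range(6)],
        [random.uniform(-1.0, 1.0) for _ in range(6)],
    )
    for _ in range(20)
)
print("sorted pairing minimal on all trials:", all_ok)
```

A numerical check of this kind is no substitute for the induction argument above; it only illustrates the statement on small random instances.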
Proposition A.1
Let be a sequence of numbers and be a sequence of non-negative numbers. If , , then
See e.g. [35, Proposition 12].
Appendix B
Example B.1
In this example, we show that both inclusions in (3.12) are strict. We first show that , i.e., there exists a such that . Let be an unbounded -shaped function. Then by the continuity of , there exist and with . Let
Since for all outside , then is well-defined on . By the monotonicity and unboundedness of , we have . Moreover, since
then . However, by change of variables in integration, we have
which means .
Now we show that , i.e., there exists a such that . Let be an unbounded -shaped function. Then there exists an unbounded -shaped function such that . More precisely, for any , there exists an unbounded -shaped function such that
(B.1)
We construct such as follows: since is an unbounded -shaped function, there exist and with . Let
Then is an unbounded -shaped function and satisfies (B.1). Let
Since for all , then is well-defined on . By the monotonicity and unboundedness of , we have .
For fixed , by change of variables in integration, we have
Since for , is bounded on , then . Thus, . However,
which means .