
On Risk Evaluation and Control of Distributed Multi-Agent Systems

Aray Almen
Department of Mathematical Sciences
Stevens Institute of Technology
Hoboken, NJ 07030, USA
aalmen@stevens.edu
&Darinka Dentcheva
Department of Mathematical Sciences
Stevens Institute of Technology
Hoboken, NJ 07030, USA
darinka.dentcheva@stevens.edu
Abstract

In this paper, we deal with risk evaluation and risk-averse optimization of complex distributed systems with general risk functionals. We postulate a novel set of axioms for the functionals evaluating the total risk of the system. We derive a dual representation for the systemic risk measures and propose a way to construct non-trivial families of measures by using either a collection of linear scalarizations or non-linear risk aggregation. The new framework facilitates risk-averse sequential decision-making by distributed methods. The proposed approach is compared theoretically and numerically to some of the systemic risk measures in the existing literature.

We formulate a two-stage decision problem with monotropic structure and a systemic measure of risk. The structure is typical for distributed systems arising in energy networks, robotics, and other practical situations. A distributed decomposition method for solving the two-stage problem is proposed and applied to a problem arising in communication networks. We use this problem to compare the methods of systemic risk evaluation. We show that the proposed risk aggregation leads to a less conservative risk evaluation and results in a substantially better solution of the problem at hand, as compared to an aggregation of the risks of the individual agents and other methods.

Keywords stochastic programming, risk of complex systems, risk measures for multivariate risk, distributed risk-averse optimization, optimal wireless information exchange

1 Introduction

Evaluation of the risk of a system consisting of multiple agents is one of the fundamental problems relevant to many fields. A crucial question is the assessment of the total risk of the system, taking into account the risk of each agent and its contribution to the total risk. Another issue arises when the risk evaluation is based on confidential or proprietary information. There is extensive literature addressing the properties of risk measures and their use in finance. Our goal is to address situations arising in robotics, energy systems, business systems, logistics problems, and similar settings. The analysis in the financial literature may not be applicable in such situations due to the heterogeneity of the sources of risk and the nature and complexity of the relations in those systems. In many systems, the source of risk is associated with a highly non-trivial aggregation of the features of its agents, which may not be available in an analytical form. For example, in automated robotic systems, the exchange of information may be limited or distorted due to the speed of operation, the distance in space between the agents, or other reasons. Another difficulty associated with the evaluation of risk arises when the risk of one agent stems from various sources of uncertainty of different nature. The question of how to aggregate those risk factors into one loss function does not have a straightforward answer.

The risk of one loss function can be evaluated using a coherent measure of risk such as the Average Value-at-Risk, the mean-semideviation, or others. More traditional (non-coherent) measures of risk such as the Value-at-Risk (VaR) are also very popular and frequently used. We refer to [14] for an extensive treatment of risk measures for scalar-valued random variables, as well as to [31], where risk-averse optimization problems are also analyzed.

The main objective of this paper is to suggest a new approach to the risk of a distributed system and to show its viability and potential in application to risk-averse decision problems for distributed multi-agent systems. While building on the developments thus far, our goal is to identify a framework that is theoretically sound but also amenable to efficient numerical computations for risk-averse optimization of large multi-agent systems. We propose a set of axioms for functionals defined on the space of random vectors. The random vector comprises risk factors from various sources, or represents the loss of each individual agent in a multi-agent system. While axioms for random vectors have been proposed earlier, our set of axioms differs from those in the literature most notably with respect to the translation equivariance condition, which we explain in due course. The resulting systemic risk measures reduce to coherent measures of risk for scalar-valued random variables when the dimension of the random vectors becomes one. We derive the dual representation of the systemic measures of risk under weaker assumptions than those known for multivariate risks. In our derivation, we establish one-to-one correspondences between the axioms and properties of the dual variables. We also propose several ways to construct systemic risk measures and analyze their properties. The important features of the proposed measures are the following: they conform to the axioms, they can be calculated efficiently, and they are amenable to distributed optimization methods.

We have formulated a risk-averse two-stage optimization problem with a structure that is typical for a system of loosely coupled subsystems. The proposed numerical method is applied to manage the risk of a distributed operation of agents. The distributed method lets each subsystem optimize its operation with minimal information exchange with the other subsystems (agents). This aspect is important for multi-agent systems where proprietary information is involved or when privacy concerns exist. The method demonstrates that distributed calculation of the systemic risk is possible without a large computational burden. We then consider a two-stage model in wireless communication networks, which extends the static model discussed in [21]. It addresses a situation in which a team of robots explores an area and each robot reports relevant information. The goal is to determine a few reporting points so that the communication is conducted most efficiently while managing the risk of losing information. We conduct several numerical experiments to compare various systemic risk measures.

Our paper is organized as follows. In section 2, we provide preliminary information on coherent measures of risk for scalar-valued random variables and survey existing methods for risk evaluation of complex systems. Section 3 contains the set of axioms, the dual representation associated with the resulting systemic risk measures, and two ways to construct such measures in practice. Section 4 provides a theoretical comparison of the new measures of risk to other notions. In particular, we discuss other sets of axioms, explore relations to two notions of multivariate Average Value-at-Risk, and pay attention to the effect of aggregating risk before versus after risk evaluation. In section 5, we formulate a risk-averse two-stage stochastic programming problem modeling wireless information exchange and seeking to locate a constrained number of information exchange points. We devise a distributed method for solving the problem and report a numerical comparison of several measures of risk, including other systemic measures. We pay attention to the comparison between the principles of aggregation for the purpose of total risk evaluation.

2 Preliminaries

2.1 Coherent risk measures

The widely accepted axiomatic framework for coherent measures of risk was proposed in [2] and further analyzed in [8], [14], [20], [29, 30], [25], and many other works. It is worth noting that another axiomatic approach was initiated in [18], and this line of thinking was developed into an entire framework in [27]. For a detailed exposition, we refer to [31] and the references therein. Let $\mathcal{L}_{p}(\Omega,\mathcal{F},P)$ be the space of real-valued random variables, defined on the probability space $(\Omega,\mathcal{F},P)$, that have finite $p$-th moments, $p\in[1,\infty)$, and are indistinguishable on events with zero probability. We shall assume that the random variables represent random costs or losses. A lower semi-continuous functional $\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\}$ is a coherent risk measure if it is convex, positively homogeneous, monotonic with respect to the a.s. comparison of random variables, and satisfies the following translation property:

\varrho[X+a]=\varrho[X]+a\quad\text{for all }X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P),\;a\in\mathbb{R}.

If $\varrho[\cdot]$ is monotonic, convex, and satisfies the translation property, then it is called a convex risk measure. Some examples of coherent measures of risk include the Average Value-at-Risk (also called Conditional Value-at-Risk) and the mean-semideviation measure, which are defined as follows. The Average Value-at-Risk at level $\alpha\in(0,1)$ for a random variable $Z$ is defined as

\operatorname{AVaR}_{\alpha}[Z]=\inf_{\eta\in\mathbb{R}}\Big\{\eta+\frac{1}{\alpha}\mathbb{E}\big[(Z-\eta)_{+}\big]\Big\}.

It is a special case of the higher-order measures of risk:

\varrho[Z]=\min_{t\in\mathbb{R}}\bigg\{t+\frac{1}{\alpha}\big\|(Z-t)_{+}\big\|_{p}\bigg\},\quad\alpha\in(0,1),

where $\|\cdot\|_{p}$ refers to the norm in $\mathcal{L}_{p}(\Omega,\mathcal{F},P)$. The mean semideviation of order $p$ is given by

\varrho[Z]=\mathbb{E}[Z]+\varkappa\big\|(Z-\mathbb{E}[Z])_{+}\big\|_{p},\quad\varkappa\in[0,1].
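For intuition, both measures admit simple sample-based estimators. The sketch below is our own minimal illustration (the function names `avar` and `mean_semideviation` are ours, not from the paper) for an empirical distribution of equally likely loss scenarios; it exploits the fact that the infimum in the extremal formula for the Average Value-at-Risk of a finite sample is attained at one of the sample points.

```python
import numpy as np

def avar(z, alpha):
    # Empirical Average Value-at-Risk at level alpha via the extremal
    # formula inf_eta { eta + E[(z - eta)_+] / alpha }.  The objective is
    # piecewise linear and convex in eta, with breakpoints at the sample
    # values, so minimizing over the sample points is exact.
    z = np.asarray(z, dtype=float)
    cands = np.unique(z)
    vals = cands + np.array(
        [np.mean(np.maximum(z - e, 0.0)) for e in cands]) / alpha
    return vals.min()

def mean_semideviation(z, kappa, p=1):
    # E[Z] + kappa * || (Z - E[Z])_+ ||_p  with kappa in [0, 1].
    z = np.asarray(z, dtype=float)
    mu = z.mean()
    return mu + kappa * np.mean(np.maximum(z - mu, 0.0) ** p) ** (1.0 / p)
```

For example, on the sample {1, 2, 3, 4} of equally likely losses, `avar` at level 0.5 returns 3.5, the average of the worst half of the outcomes, and approaches the plain mean as alpha tends to 1.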

The space $\mathcal{L}_{p}(\Omega,\mathcal{F},P)$, equipped with its norm topology, is paired with the space $\mathcal{L}_{q}(\Omega,\mathcal{F},P)$, equipped with the weak topology, where $\frac{1}{p}+\frac{1}{q}=1$. We denote $\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P)$ and $\mathcal{Z}^{*}=\mathcal{L}_{q}(\Omega,\mathcal{F},P)$. For any $Z\in\mathcal{Z}$ and $\xi\in\mathcal{Z}^{*}$, we use the bilinear form:

\langle\xi,Z\rangle=\int_{\Omega}\xi(\omega)Z(\omega)\,dP(\omega).

The following result is known as the dual representation of coherent measures of risk. A proper lower semicontinuous coherent risk measure $\varrho$ has the representation

\varrho[Z]=\sup_{\xi\in\mathcal{A}_{\varrho}}\langle\xi,Z\rangle,\quad Z\in\mathcal{Z}, (1)

where $\mathcal{A}_{\varrho}\subset\{\xi\in\mathcal{Z}^{*}~|~\xi\geq 0,~\int_{\Omega}\xi(\omega)\,P(d\omega)=1\}$ is the convex-analysis subdifferential $\partial\varrho[0]$.

Risk measures have also been defined by specifying a set of acceptable random outcomes; this set is called an acceptance set. Denoting the acceptance set by $\mathcal{K}\subset\mathcal{Z}$, the risk of a random outcome $X$ is defined as:

\varrho_{\mathcal{K}}[X]=\inf\{m\in\mathbb{R}~|~X-m\in\mathcal{K}\}. (2)

In finance, this notion of risk is interpreted as the minimum amount of capital that needs to be invested to make the final position acceptable. It is easy to verify that $\varrho_{\mathcal{K}}[\cdot]$ in (2) is a coherent measure if and only if $\mathcal{K}$ is a convex cone (cf. [13]).

2.2 Risk measures for complex systems

As risk is not additive, when we deal with distributed complex systems we need to address the question of evaluating the risk of the entire system. This risk is usually called systemic in the financial literature, and the measures proposed for its evaluation are termed systemic risk measures.

Assume that the system consists of $m$ agents. One approach to evaluating the risk of a system is to use an aggregation function $\Lambda:\mathbb{R}^{m}\to\mathbb{R}$ together with univariate risk measures. Let $X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$ be an $m$-dimensional random vector comprising the costs incurred by the system, where each component $X_{i}$ corresponds to the costs of one agent. The first approach to systemic risk is to choose a univariate risk measure $\varrho_{0}$ and apply it to the aggregated cost $\Lambda(X)$. If we prefer to use an acceptance set $\mathcal{K}$ as in (2), the systemic risk can be defined as:

\varrho[X]=\varrho_{0}[\Lambda(X)]=\inf\big\{z\in\mathbb{R}~|~\Lambda(X)-z\in\mathcal{K}\big\}. (3)

In [7], this point of view is analyzed in finite probability spaces, and it is shown that any monotonic, convex, positively homogeneous function provides a risk evaluation as in (3) as long as it is consistent with the preferences represented in the definition of $\mathcal{K}$. The point of view presented in definition (3) is further extended in [11], where the authors analyze convex risk measures defined on a general measurable space and propose examples of aggregation functions suitable for a financial system. In both studies, the structural decomposition of the systemic risk measure (3) is established when the aggregation function $\Lambda$ satisfies properties similar to the axioms postulated for risk measures. In [4], the authors consider a particular case of an aggregation function, proposing an evaluation method for the risk associated with the cumulative externalities or costs endured by financial institutions. Note that these evaluation methods rely on the choice of one aggregation function suitable for a specific problem.
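The aggregate-then-evaluate construction (3) can be sketched on a finite scenario set. In the illustration below (our own, not from the cited works), we take $\Lambda$ to be the total loss across agents and $\varrho_0$ to be the empirical Average Value-at-Risk; the function names are hypothetical.

```python
import numpy as np

def avar(z, alpha):
    # Empirical AVaR via the extremal formula; the minimum over eta
    # is attained at one of the sample points.
    z = np.asarray(z, dtype=float)
    cands = np.unique(z)
    vals = cands + np.array(
        [np.mean(np.maximum(z - e, 0.0)) for e in cands]) / alpha
    return vals.min()

def systemic_risk_aggregate_first(X, alpha,
                                  aggregate=lambda x: x.sum(axis=1)):
    # Approach (3): aggregate the agents' losses scenario by scenario
    # with Lambda (here: the total loss), then apply the univariate
    # measure rho_0 = AVaR_alpha.  X has shape (n_scenarios, m_agents)
    # with equally likely scenarios.
    X = np.asarray(X, dtype=float)
    return avar(aggregate(X), alpha)
```

For two equally likely scenarios with agent losses (1, 1) and (3, 1), the aggregated losses are 2 and 4, so the systemic risk at level 0.5 is 4, the worst aggregated outcome.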
The translation property for constant vectors is introduced in [5] for convex risk measures defined for bounded random vectors. This property differs from the one we propose here. The authors analyzed the maximal risk over a class of aggregation functions rather than using one specific function. We refer to [28] for an overview of the risk measures constructed this way. A similar approach is taken in [10], where law-invariant risk measures for bounded random vectors are investigated for the purpose of obtaining a Kusuoka representation. The axioms proposed in [5, 10] are closest to ours, and we provide a more detailed discussion in section 3.

Another approach to risk evaluation of complex systems consists of evaluating the risk of the individual agents first and aggregating the obtained values next. This method is used, for example, in [3] and [12]. Using the notion of acceptance sets, the systemic risk measure is defined in [3] in the following way:

\varrho[X]=\inf\Big\{\sum_{i=1}^{m}z_{i}~\Big|~z\in\mathbb{R}^{m},~\Lambda(X-z)\in\mathcal{K}\Big\}.

The measures of risk proposed in section 3 also accommodate this point of view. A further extension in [3] replaces the constant vector $z\in\mathbb{R}^{m}$ by a random vector $Y\in\mathcal{C}$, where $\mathcal{C}$ is a given set of admissible allocations. This formulation of the risk measure allows for scenario-dependent allocations, where the total amount $\sum_{i=1}^{m}z_{i}$ can be determined ahead of time while the individual allocations $z_{i}$ may be decided in the future, when the uncertainty is revealed. In [12], a set-valued counterpart of this approach is proposed by defining the systemic risk measure as the set of all vectors that make the outcome acceptable. Once the set of all acceptable allocations is constructed, one can derive a scalar-valued efficient allocation rule by minimizing a weighted sum of the components of the vectors in the set. Set-valued risk measures were proposed in [17]; see also [1, 16] for duality theory, including the dual representation for certain set-valued risk measures. In the vast majority of the literature, the systemic risk depends on the choice of the aggregation function $\Lambda$ and on how well it captures the interdependence between the components. To capture the dependence, an approach based on copula theory was put forward in [24]. There, it is assumed that independent operation does not carry systemic risk and, hence, the local risk can be optimized by each agent independently. The systemic risk measures are then constructed based on the copulas of the distributions.

Another line of work includes methods that use some multivariate counterpart of the univariate risk measures. The main notion here is the Multivariate Value-at-Risk ($\operatorname{MVaR}$) for random vectors, which is identified with the set of $p$-efficient points. Let $F_{X}(\cdot)$ be the right-continuous distribution function of a random vector $X$ with realizations in $\mathbb{R}^{m}$. A $p$-efficient point for $X$ is a point $v\in\mathbb{R}^{m}$ such that $F_{X}(v)\geq p$ and there is no point $z\leq v$ componentwise, $z\neq v$, that satisfies $F_{X}(z)\geq p$. This notion plays a key role in optimization problems with chance constraints (see, e.g., [31]). The Multivariate Value-at-Risk satisfies the properties of translation equivariance, positive homogeneity, and monotonicity. This notion is used to define the Average Value-at-Risk for multivariate distributions ($\operatorname{MAVaR}$) in [19, 23, 26]. Let $Z_{p}$ be the set of all points, each of which is component-wise larger than some $p$-efficient point:

Z_{p}=\bigcup_{s\in\operatorname{MVaR}_{p}(X)}\big(s+\mathbb{R}^{m}_{+}\big).

In [19], Lee and Prekopa define the $\operatorname{MAVaR}$ of a random vector $X$ at level $p$ as

\operatorname{MAVaR}_{p}(X)=\mathbb{E}\big(\Lambda(X)~|~X\in Z_{p}\big), (4)

where $\Lambda$ is assumed integrable with respect to $F_{X}$, i.e., $\mathbb{E}(\Lambda(X))$ is finite. It is shown in [19] that $\operatorname{MAVaR}$ is translation equivariant, positively homogeneous, and subadditive only when all of the components of the random vector are independent.
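For a finite distribution, the set of $p$-efficient points can be enumerated directly from the definition. The sketch below is our own illustration (names hypothetical); it assumes equally likely realizations and uses the fact that, for an empirical distribution, the distribution function changes value only at observed coordinates, so minimal points can be searched on the grid of observed coordinate values.

```python
import numpy as np
from itertools import product

def p_efficient_points(sample, p):
    # Enumerate MVaR_p(X) for an empirical distribution: candidate
    # points v on the grid of observed coordinates with F_X(v) >= p
    # that are minimal with respect to the componentwise order.
    sample = np.asarray(sample, dtype=float)
    axes = [np.unique(sample[:, i]) for i in range(sample.shape[1])]
    cdf = lambda v: np.mean(np.all(sample <= v, axis=1))  # empirical F_X
    feasible = [np.array(v) for v in product(*axes)
                if cdf(np.array(v)) >= p]
    minimal = [tuple(v) for v in feasible
               if not any(np.all(w <= v) and np.any(w < v)
                          for w in feasible)]
    return sorted(minimal)
```

For two equally likely realizations (0, 1) and (1, 0) and p = 0.5, both realizations are p-efficient, illustrating that $\operatorname{MVaR}_{p}(X)$ is in general a set rather than a single point.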

While the definition of $\operatorname{MAVaR}$ above is scalar-valued, in [22] the authors define a vector-valued Multivariate Average Value-at-Risk using the set of $p$-efficient points $\operatorname{MVaR}_{p}(X)$ and the extremal representation of the Average Value-at-Risk. First, for a given probability $p\in(0,1)$, we consider the vectors

\operatorname{MAVaR}_{p}(X;v)=v+\frac{1}{p}\mathbb{E}\big[(X-v)_{+}\big],

where $[(X-v)_{+}]_{i}=\max(0,X_{i}-v_{i})$, $i=1,\dots,m$. Then, the following vector-optimization problem is solved:

\operatorname{VMAVaR}_{p}(X)=\min\big\{\operatorname{MAVaR}_{p}(X;v)~:~v\in\operatorname{MVaR}_{p}(X)\big\}.

The vector-valued Multivariate Average Value-at-Risk is monotonic, positively homogeneous, and translation equivariant, but it is not subadditive. Note that in both $\operatorname{MVaR}$ and $\operatorname{MAVaR}$, one needs to use a scalarization function to obtain a scalar value for the risk.
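The vectors $\operatorname{MAVaR}_{p}(X;v)$ themselves are straightforward to evaluate on a finite sample. A small sketch in our own notation (the function name is hypothetical), for equally likely scenarios:

```python
import numpy as np

def mavar_vector(X, v, p):
    # Componentwise MAVaR_p(X; v) = v + E[(X - v)_+] / p for an
    # empirical distribution; X has shape (n_scenarios, m), v in R^m.
    X = np.asarray(X, dtype=float)
    v = np.asarray(v, dtype=float)
    return v + np.mean(np.maximum(X - v, 0.0), axis=0) / p
```

For instance, with two equally likely scenarios (0, 2) and (2, 0), the point v = (1, 1), and p = 0.5, each component evaluates to 1 + (1/2)/0.5 = 2.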

We shall compare our proposal to the aforementioned risk measures in section 4.

3 Axiomatic Approach to Risk Measures for Random Vectors

In this section, we propose a set of axioms for measures of risk for random vectors with realizations in $\mathbb{R}^{m}$. This framework is analogous to the axiomatic framework of coherent risk measures for scalar-valued random variables. In fact, if $m=1$, the proposed set of axioms coincides exactly with those in [31]. We denote by $\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$ the space of random vectors with realizations in $\mathbb{R}^{m}$, defined on $(\Omega,\mathcal{F},P)$. Throughout the paper, we shall consider a risk measure $\varrho$ for random vectors in $\mathcal{Z}$ to be a lower semi-continuous functional $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ with non-empty domain. We denote the $m$-dimensional vector whose components are all equal to one by $\mathbf{1}$, and the random vector with all realizations equal to $\mathbf{1}$ by $\mathbb{I}$.

Definition 1.

A lower semi-continuous functional $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is a coherent risk measure with preference to small outcomes, iff it satisfies the following axioms:

  • A1.

    Convexity: For all $X,Y\in\mathcal{Z}$ and $\alpha\in(0,1)$, we have:

    \varrho[\alpha X+(1-\alpha)Y]\leq\alpha\varrho[X]+(1-\alpha)\varrho[Y].

  • A2.

    Monotonicity: For all $X,Y\in\mathcal{Z}$, if $X_{i}\geq Y_{i}$ for all components $i=1,\dots,m$ $P$-a.s., then $\varrho[X]\geq\varrho[Y]$.

  • A3.

    Positive homogeneity: For all $X\in\mathcal{Z}$ and $t>0$, we have $\varrho[tX]=t\varrho[X]$.

  • A4.

    Translation equivariance: For all $X\in\mathcal{Z}$ and $a\in\mathbb{R}$, we have $\varrho[X+a\mathbb{I}]=\varrho[X]+a\varrho[\mathbb{I}]$.

A lower semi-continuous functional $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is a convex risk measure with preference to small outcomes, iff it satisfies axioms A1, A2, and A4.

The axioms of convexity and positive homogeneity are defined analogously to the corresponding properties of coherent risk measures for scalar random variables, while the random vectors are now compared component-wise in the monotonicity property. The main difference is the translation equivariance axiom. It postulates that if the random loss of every component increases by the same constant amount $a$, then the risk should increase by $a$ times the risk $\varrho[\mathbb{I}]$ of the deterministic unit loss. These axioms differ from the axioms previously proposed in the literature.
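To see the axioms at work, consider the candidate systemic measure $\varrho[X]=\operatorname{AVaR}_{\alpha}[\max_{i}X_{i}]$, an example of ours rather than one proposed in the paper: it is convex and monotone as a composition of AVaR with the componentwise maximum, positively homogeneous, and satisfies A4 with $\varrho[\mathbb{I}]=1$. The sketch below checks A3 and A4 numerically on an empirical sample (function names are ours).

```python
import numpy as np

def avar(z, alpha):
    # Empirical AVaR via the extremal formula, minimized over
    # the sample points (exact for piecewise-linear objectives).
    z = np.asarray(z, dtype=float)
    cands = np.unique(z)
    vals = cands + np.array(
        [np.mean(np.maximum(z - e, 0.0)) for e in cands]) / alpha
    return vals.min()

def rho_max(X, alpha=0.5):
    # Candidate systemic measure: AVaR of the worst agent's loss
    # in each scenario; X has shape (n_scenarios, m_agents).
    return avar(np.max(np.asarray(X, dtype=float), axis=1), alpha)
```

One checks, for instance, that rho_max(X + a) = rho_max(X) + a (axiom A4 with rho_max applied to the deterministic unit loss equal to 1) and rho_max(t X) = t rho_max(X) for t > 0 (axiom A3).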

3.1 Dual representation

In order to derive a dual representation of the multivariate risk measure, we pair the space of random vectors $\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$, $p\in[1,\infty)$, with the space $\mathcal{Z}^{*}=\mathcal{L}_{q}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$, where $q\in(1,\infty]$ is such that $\frac{1}{p}+\frac{1}{q}=1$, with $q=\infty$ for $p=1$. For $X\in\mathcal{Z}$ and $\zeta\in\mathcal{Z}^{*}$, the bilinear form $\langle\cdot,\cdot\rangle$ on the product space $\mathcal{Z}\times\mathcal{Z}^{*}$ is defined as follows:

\langle\zeta,X\rangle=\int_{\Omega}\langle\zeta(\omega),X(\omega)\rangle\,dP(\omega).

The Fenchel conjugate function $\varrho^{*}:\mathcal{Z}^{*}\to\mathbb{R}\cup\{+\infty\}$ of the risk measure $\varrho$ is given by

\varrho^{*}[\zeta]=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,X\rangle-\varrho[X]\big\},

and the conjugate of $\varrho^{*}$ (the bi-conjugate function) is

\varrho^{**}[X]=\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]\big\}.

The Fenchel–Moreau theorem implies that if $\varrho[\cdot]$ is convex and lower semicontinuous, then $\varrho^{**}=\varrho$ and

\varrho[X]=\sup_{\zeta\in\tilde{\mathcal{A}}}\big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]\big\}, (5)

where $\tilde{\mathcal{A}}=\operatorname{dom}(\varrho^{*})$ is the domain of the conjugate function $\varrho^{*}$. Based on the Fenchel–Moreau theorem and the axioms proposed in this paper, we prove the following theorem.

Theorem 1.

Suppose $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is a convex and lower semicontinuous risk functional. Then the following holds:

  • (i)

    Property A2 is satisfied if and only if $\zeta\geq 0$ a.s. for all $\zeta$ in the domain of $\varrho^{*}$.

  • (ii)

    Property A3 is satisfied if and only if $\varrho^{*}$ is the indicator function of the set $\mathcal{A}=\operatorname{dom}(\varrho^{*})$, i.e.,

    \varrho[X]=\sup_{\zeta\in\mathcal{A}}\langle\zeta,X\rangle. (6)

  • (iii)

    Property A4 is satisfied if and only if $\varrho[\mathbb{I}]=\langle\mathbf{1},\mu_{\zeta}\rangle$ for all $\zeta\in\mathcal{A}$, where $\mu_{\zeta}=\int_{\Omega}\zeta(\omega)\,P(d\omega)$.

Proof.

Since $\varrho[\cdot]$ is convex and lower semicontinuous, and we have assumed that it has non-empty domain, the representation (5) holds by virtue of the Fenchel–Moreau theorem.

(i) Suppose $\varrho$ satisfies the monotonicity condition. Assume that $\zeta_{i}(\omega)<0$ for $\omega\in\Delta\in\mathcal{F}$ with $P(\Delta)>0$ for some component $i\in\{1,\dots,m\}$. Define $\bar{X}_{i}$ equal to the indicator function of the event $\Delta$ and $\bar{X}_{j}=0$ for $j\neq i$, $j=1,\dots,m$. Take any $X$ such that $\varrho[X]$ is finite and define $X_{t}:=X-t\bar{X}$. Then for $t\geq 0$, we have $X\succeq X_{t}$, and $\varrho[X]\geq\varrho[X_{t}]$ by monotonicity. Consequently,

\varrho^{*}[\zeta]\geq\sup_{t\in\mathbb{R}_{+}}\big\{\langle\zeta,X_{t}\rangle-\varrho[X_{t}]\big\}\geq\sup_{t\in\mathbb{R}_{+}}\big\{\langle\zeta,X\rangle-t\langle\zeta,\bar{X}\rangle-\varrho[X]\big\}
=\sup_{t\in\mathbb{R}_{+}}\Big\{\langle\zeta,X\rangle-t\int_{\Delta}\zeta_{i}(\omega)\,P(d\omega)-\varrho[X]\Big\}=+\infty.

It follows that $\varrho^{*}[\zeta]=+\infty$ for every $\zeta\in\mathcal{Z}^{*}$ with at least one negative component; thus $\zeta\notin\operatorname{dom}\varrho^{*}$. Conversely, suppose that $\zeta\in\mathcal{Z}^{*}$ has realizations in $\mathbb{R}^{m}$ with nonnegative components $P$-a.s. Then, whenever $X\succeq X^{\prime}$, we have:

\langle\zeta,X\rangle=\int_{\Omega}\langle\zeta(\omega),X(\omega)\rangle\,dP(\omega)\geq\int_{\Omega}\langle\zeta(\omega),X^{\prime}(\omega)\rangle\,dP(\omega)=\langle\zeta,X^{\prime}\rangle.

Consequently,

\varrho[X]=\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]\big\}\geq\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X^{\prime}\rangle-\varrho^{*}[\zeta]\big\}=\varrho[X^{\prime}].

Hence, the monotonicity condition holds.

(ii) Suppose the positive homogeneity property holds, i.e., $\varrho[tX]=t\varrho[X]$ for all $X\in\mathcal{Z}$ and $t>0$. Then for any fixed $t>0$, we get

\varrho^{*}[\zeta]=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,X\rangle-\varrho[X]\big\}=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,tX\rangle-\varrho[tX]\big\}=\sup_{X\in\mathcal{Z}}t\big\{\langle\zeta,X\rangle-\varrho[X]\big\}=t\varrho^{*}[\zeta].

Hence, if $\varrho^{*}[\zeta]$ is finite, then $\varrho^{*}[\zeta]=0$, as claimed. Conversely, if $\varrho[X]=\sup_{\zeta\in\operatorname{dom}\varrho^{*}}\langle\zeta,X\rangle$, then $\varrho$ is positively homogeneous as the support function of a convex set.

(iii) Suppose the translation property is satisfied, i.e., $\varrho[X+k\mathbb{I}]=\varrho[X]+k\varrho[\mathbb{I}]$ for any $X\in\mathcal{Z}$ and constant $k\in\mathbb{R}$. Then for any $k\in\mathbb{R}$ and $\zeta\in\mathcal{Z}^{*}$, we get:

\varrho^{*}[\zeta]=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,X+k\mathbb{I}\rangle-\varrho[X+k\mathbb{I}]\big\}=\sup_{X\in\mathcal{Z}}\Big\{\int_{\Omega}\langle\zeta(\omega),X(\omega)+k\mathbf{1}\rangle\,P(d\omega)-\varrho[X]-k\varrho[\mathbb{I}]\Big\}
=\sup_{X\in\mathcal{Z}}\Big\{\langle\zeta,X\rangle+k\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)-\varrho[X]-k\varrho[\mathbb{I}]\Big\}=\varrho^{*}[\zeta]+k\Big(\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)-\varrho[\mathbb{I}]\Big).

If $\varrho^{*}[\zeta]$ is finite, then $\varrho[\mathbb{I}]=\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)$.

Denoting $\mu_{\zeta}=\int_{\Omega}\zeta(\omega)\,P(d\omega)$, we obtain

\varrho[\mathbb{I}]=\Big\langle\mathbf{1},\int_{\Omega}\zeta(\omega)\,P(d\omega)\Big\rangle=\langle\mathbf{1},\mu_{\zeta}\rangle\quad\text{for all }\zeta\in\mathcal{A}. (7)

Conversely, suppose $\varrho[\mathbb{I}]=\langle\mathbf{1},\mu_{\zeta}\rangle$ for all $\zeta\in\operatorname{dom}\varrho^{*}$. Then for any $X\in\mathcal{Z}$ and $k\in\mathbb{R}$:

\varrho[X+k\mathbb{I}]=\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X+k\mathbb{I}\rangle-\varrho^{*}[\zeta]\big\}=\sup_{\zeta\in\mathcal{Z}^{*}}\Big\{\int_{\Omega}\langle\zeta(\omega),X(\omega)+k\mathbf{1}\rangle\,P(d\omega)-\varrho^{*}[\zeta]\Big\}
=\sup_{\zeta\in\mathcal{Z}^{*}}\Big\{\langle\zeta,X\rangle+k\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)-\varrho^{*}[\zeta]\Big\}=\sup_{\zeta\in\mathcal{Z}^{*}}\Big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]+k\varrho[\mathbb{I}]\Big\}
=\varrho[X]+k\varrho[\mathbb{I}].

Hence, the translation property is satisfied. ∎

It follows from Theorem 1 that if a risk measure $\varrho$ is lower semicontinuous and satisfies the axioms of convexity, monotonicity, positive homogeneity, and translation equivariance, then representation (6) holds with the set $\mathcal{A}$ defined as:

\mathcal{A}=\Big\{\zeta\in\mathcal{Z}^{*}~:~\langle\mathbf{1},\mu_{\zeta}\rangle=\varrho[\mathbb{I}],~\zeta\succeq 0~\text{a.s.}\Big\},\quad\text{where }\mu_{\zeta}=\int_{\Omega}\zeta(\omega)\,dP(\omega).
Corollary 1.

If a risk measure $\varrho[\cdot]$ is coherent, then

\varrho[0]=0\quad\text{and}\quad\mathcal{A}=\partial\varrho[0].

Proof.

If $\varrho$ is positively homogeneous, then $\varrho$ is the support function of $\mathcal{A}=\operatorname{dom}(\varrho^{*})$ and $\varrho^{*}$ vanishes on $\mathcal{A}$. Then

\varrho[0]=\sup_{\zeta\in\mathcal{A}}\big\{\langle\zeta,0\rangle-\varrho^{*}[\zeta]\big\}=0.

To show the form of the set $\mathcal{A}$, recall that

\partial\varrho[0]=\{\zeta\in\mathcal{Z}^{*}:\langle\zeta,X-0\rangle\leq\varrho[X]-\varrho[0]\text{ for all }X\in\mathcal{Z}\}=\{\zeta\in\mathcal{Z}^{*}:\langle\zeta,X\rangle\leq\varrho[X]\text{ for all }X\in\mathcal{Z}\}.

Hence, for all $\zeta\in\mathcal{A}$, (6) implies that $\zeta\in\partial\varrho[0]$. On the other hand, if $\zeta\in\partial\varrho[0]$, then $\zeta\in\mathcal{A}$ by the definition of a support function. ∎

We shall consider further the following property.

  • Normalization: a coherent measure of risk $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is normalized iff $\varrho[\mathbb{I}]=1$.

Corollary 2.

For a normalized coherent measure of risk $\varrho[\cdot]$, we have $\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)=1$ for all $\zeta\in\mathcal{A}$.

Proof.

It follows from equation (7) that $\langle\mathbf{1},\mu_{\zeta}\rangle=1$, as stated. This entails that, for every $\zeta\in\mathcal{A}$, $\zeta P$ can be interpreted as a probability measure on the space $\Omega\times\{1,2,\dots,m\}$. ∎

In the paper [5], the authors have adopted the following translation axiom:

T. For any constant $\alpha\in\mathbb{R}$ and any vector $e^{i}$, $i=1,\dots,m$, whose $i$-th component is 1 and all other components are zero, we have $\varrho[X+\alpha e^{i}]=\varrho[X]+\alpha$.

Theorem 2.

Assume that $\varrho$ is a proper lower-semicontinuous convex risk functional. Property T holds if and only if $\int_{\Omega}\zeta_{i}(\omega)\,dP(\omega)=1$ for all $i=1,\dots,m$ and all $\zeta\in\operatorname{dom}\varrho^{*}$. Furthermore, if property T holds, then

\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,dP(\omega)=\varrho[\mathbb{I}]=m,\quad\int_{\Omega}\zeta(\omega)\,dP(\omega)=\mathbf{1},\quad\text{for all }\zeta\in\operatorname{dom}\varrho^{*}; (8)

\varrho[X+a]=\varrho[X]+\varrho[a]\quad\text{for all }X\in\mathcal{Z},\;a\in\mathbb{R}^{m}. (10)
Proof.

Suppose T holds. Then for a random vector XX in the domain of ϱ\varrho and every ζ𝒵\zeta\in\mathcal{Z}^{*}, we have

ϱ[ζ]\displaystyle\varrho^{*}[\zeta]\geq supα{ζ,X+αeiϱ[X+αei]}=supα{Ωαζi(ω)P(dω)+ζ,Xϱ[X]α}\displaystyle\sup_{\alpha\in\mathbb{R}}\big{\{}\langle\zeta,X+\alpha e^{i}\rangle-\varrho[X+\alpha e^{i}]\big{\}}=\sup_{\alpha\in\mathbb{R}}\Big{\{}\int_{\Omega}\alpha\zeta_{i}(\omega)P(d\omega)+\langle\zeta,X\rangle-\varrho[X]-\alpha\Big{\}}
=supαα{Ωζi(ω)P(dω)1}+ζ,Xϱ[X].\displaystyle=\sup_{\alpha\in\mathbb{R}}\alpha\Big{\{}\int_{\Omega}\zeta_{i}(\omega)P(d\omega)-1\Big{\}}+\langle\zeta,X\rangle-\varrho[X].

It follows that ϱ[ζ]=+\varrho^{*}[\zeta]=+\infty for any ζ𝒵\zeta\in\mathcal{Z}^{*} such that Ωζi(ω)P(dω)1\int_{\Omega}\zeta_{i}(\omega)P(d\omega)\not=1. This entails that for every constant vector ama\in\mathbb{R}^{m}, the risk value is

ϱ[a]=ϱ[i=1maiei]=i=1mai.\varrho[a]=\varrho\Big{[}\sum_{i=1}^{m}a_{i}e^{i}\Big{]}=\sum_{i=1}^{m}a_{i}. (12)

The other direction is straightforward. Indeed,

ϱ[X+αei]\displaystyle\varrho[X+\alpha e^{i}] =supζ𝒵{ζ,X+αeiϱ[ζ]}=supζ𝒵{Ωαζi(ω)P(dω)+ζ,Xϱ[ζ]}\displaystyle=\sup_{\zeta\in\mathcal{Z}^{*}}\big{\{}\langle\zeta,X+\alpha e^{i}\rangle-\varrho^{*}[\zeta]\big{\}}=\sup_{\zeta\in\mathcal{Z}^{*}}\Big{\{}\int_{\Omega}\alpha\zeta_{i}(\omega)P(d\omega)+\langle\zeta,X\rangle-\varrho^{*}[\zeta]\Big{\}}
=supζ𝒵{α+ζ,Xϱ[ζ]}=ϱ[X]+α.\displaystyle=\sup_{\zeta\in\mathcal{Z}^{*}}\Big{\{}\alpha+\langle\zeta,X\rangle-\varrho^{*}[\zeta]\Big{\}}=\varrho[X]+\alpha.

Property T also implies that

Ωζ(ω)𝑑P(ω)=𝟏,Ω𝟏,ζ(ω)𝑑P(ω)=m=ϱ[𝕀].\int_{\Omega}\zeta(\omega)dP(\omega)=\mathbf{1},\quad\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle dP(\omega)=m=\varrho[\mathbb{I}].

Due to equation (12), for all X𝒵X\in\mathcal{Z} and aa\in\mathbb{R}, we obtain

ϱ[X+a]=ϱ[X+i=1maiei]=ϱ[X]+i=1mai=ϱ[X]+ϱ[a],\displaystyle\varrho[X+a]=\varrho[X+\sum_{i=1}^{m}a_{i}e^{i}]=\varrho[X]+\sum_{i=1}^{m}a_{i}=\varrho[X]+\varrho[a],

which completes the proof. ∎

We also observe that a particular implication of Theorem 2 is that risk measures satisfying property T are linear on constant vectors.

Corollary 3.

If a coherent measure of risk ϱ[]\varrho[\cdot] satisfies property T, then it is linear on constant vectors.

Proof.

Indeed, a special case of (12) shows that

\varrho[a+b]=\sum_{i=1}^{m}(a_{i}+b_{i})=\varrho[a]+\varrho[b]\quad\text{for all }a,b\in\mathbb{R}^{m}.

This combined with the fact that ϱ[0]=0\varrho[0]=0 and the positive homogeneity of the risk measure proves the statement. ∎

In [10], the authors have analyzed law-invariant risk measures for bounded random vectors. They have introduced a set of axioms that are closest to ours: their axioms include ours together with the two normalization properties \varrho[\mathbb{I}]=1 and \varrho[0]=0. We do not need these normalization properties to establish the dual representation for general random vectors with finite p-th moments, p\geq 1; we derive that the risk of the deterministic zero vector is zero from the dual representation. The property of strong coherence of risk measures, introduced in that paper, implies in particular that \varrho[a+b]=\varrho[a]+\varrho[b], which appears to be a strong assumption.

3.2 Risk measures obtained via sets of linear scalarizations

Suppose we have a random vector X𝒵=p(Ω,,P;m)X\in\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}) with a right-continuous distribution function F(X;)F(X;\cdot) and marginal distribution function Fi(Xi;)F_{i}(X_{i};\cdot) of each component i=1,,mi=1,\dots,m. We consider linear scalarization using vectors taken from the simplex

Sm+={cm|i=1mci=1,ci0i=1,,m}.S_{m}^{+}=\{c\in\mathbb{R}^{m}~{}|~{}\sum_{i=1}^{m}c_{i}=1,~{}c_{i}\geq 0~{}\forall i=1,\dots,m\}.

Let ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} be a lower semi-continuous risk measure. For any fixed set SSm+S\subset S_{m}^{+}, we define the risk measure

ϱS[X]=ϱ[XS],where XS(ω)=maxcScX(ω),ωΩ.\varrho_{S}[X]=\varrho[X_{S}],\quad\text{where }X_{S}(\omega)=\max_{c\in S}c^{\top}X(\omega),\;\;\omega\in\Omega. (13)

It is straightforward to see that XSp(Ω,,P)X_{S}\in\mathcal{L}_{p}(\Omega,\mathcal{F},P) and hence, the risk measure ϱS[]\varrho_{S}[\cdot] is well-defined on p(Ω,,P;m).\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}).
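As a concrete illustration of (13), the following sketch estimates the systemic risk by Monte Carlo when ϱ is taken to be the Average Value-at-Risk computed from an equally weighted sample; the finite scalarization set S, the sample generator, and all function names are illustrative assumptions, not part of the formal development.

```python
import random

def avar(sample, alpha):
    """Empirical Average Value-at-Risk at level alpha: the mean of the
    worst alpha-fraction of outcomes (large realizations are losses).
    Exact when alpha * len(sample) is an integer."""
    s = sorted(sample, reverse=True)
    k = max(1, round(alpha * len(s)))
    return sum(s[:k]) / k

def rho_S(scenarios, S, alpha):
    """Systemic risk of the form (13): scalarize each scenario by the
    worst c in the finite set S, then apply the univariate measure."""
    X_S = [max(sum(ci * xi for ci, xi in zip(c, x)) for c in S)
           for x in scenarios]
    return avar(X_S, alpha)

random.seed(0)
# two agents, 1000 scenarios of random losses
scenarios = [(random.gauss(1.0, 0.5), random.gauss(2.0, 1.0))
             for _ in range(1000)]
S = [(0.5, 0.5), (0.2, 0.8), (0.8, 0.2)]  # finite subset of the simplex
risk = rho_S(scenarios, S, alpha=0.05)
```

Enlarging S can only increase X_S pointwise, so the resulting risk value is monotone with respect to the scalarization set.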

Theorem 3.

If ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} is a coherent (convex) risk measure, then for any set SSm+S\subset S_{m}^{+}, the risk measure ϱS[X]=ϱ[XS]\varrho_{S}[X]=\varrho[X_{S}] is coherent (convex) according to Definition 1.

Proof.

For two random vectors X,Yp(Ω,,P;m)X,Y\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}) with XYX\leq Y component-wise a.s., we have cXcYc^{\top}X\leq c^{\top}Y a.s. for all cSm+c\in S_{m}^{+}. This implies that maxcScXmaxcScY\max_{c\in S}c^{\top}X\leq\max_{c\in S}c^{\top}Y a.s. and, hence, ϱ[XS]ϱ[YS]\varrho[X_{S}]\leq\varrho[Y_{S}]. Thus, the monotonicity axiom is satisfied. Given two random vectors X,Y𝒵X,Y\in\mathcal{Z} and α[0,1]\alpha\in[0,1], consider their convex combination αX+(1α)Y\alpha X+(1-\alpha)Y. Due to the convexity and monotonicity of ϱ[]\varrho[\cdot], we have

ϱS[αX+(1α)Y]\displaystyle\varrho_{S}[\alpha X+(1-\alpha)Y] =ϱ[maxcSc(αX+(1α)Y)]ϱ[αmaxcScX+(1α)maxcScY]\displaystyle=\varrho[\max_{c\in S}c^{\top}(\alpha X+(1-\alpha)Y)]\leq\varrho[\alpha\max_{c\in S}c^{\top}X+(1-\alpha)\max_{c\in S}c^{\top}Y]
αϱ[maxcScX]+(1α)ϱ[maxcScY]=αϱS[X]+(1α)ϱS[Y].\displaystyle\leq\alpha\varrho[\max_{c\in S}c^{\top}X]+(1-\alpha)\varrho[\max_{c\in S}c^{\top}Y]=\alpha\varrho_{S}[X]+(1-\alpha)\varrho_{S}[Y].

Thus, the convexity axiom is satisfied. Given a random vector X𝒵X\in\mathcal{Z} and a constant tt\in\mathbb{R}, it follows:

\varrho_{S}[X+t\mathbb{I}]=\varrho[\max_{c\in S}c^{\top}(X+t\mathbb{I})]=\varrho[\max_{c\in S}(c^{\top}X+tc^{\top}\mathbb{I})]=\varrho[\max_{c\in S}c^{\top}X+t]=\varrho[X_{S}]+t=\varrho_{S}[X]+t.

Positive homogeneity follows in a straightforward manner. ∎

If the set SS is a singleton, we obtain the following.

Corollary 4.

Let ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} be a coherent (convex) risk measure. For any vector c𝒮m+c\in\mathcal{S}_{m}^{+}, the risk measure ϱc[X]=ϱ[cX]\varrho_{c}[X]=\varrho[c^{\top}X] is coherent (convex) according to Definition 1.

Using the dual representation of the coherent risk measure ϱ\varrho for scalar-valued random variables, we obtain the following:

ϱ[cX]\displaystyle\varrho[c^{\top}X] =supξ𝒜ϱΩξ(ω)cX(ω)P(dω)=supζ𝒜~Ωζ(ω)X(ω)P(dω)with 𝒜~={ξc:ξ𝒜ϱ}\displaystyle=\sup_{\xi\in\mathcal{A}_{\varrho}}\int_{\Omega}\xi(\omega)c^{\top}X(\omega)P(d\omega)=\sup_{\zeta\in\tilde{\mathcal{A}}}\int_{\Omega}\zeta(\omega)X(\omega)P(d\omega)\quad\text{with }\tilde{\mathcal{A}}=\{\xi c:\xi\in\mathcal{A}_{\varrho}\} (14)

Additionally, a measurable selection \nu_{X}(\omega)\in\arg\max_{c\in S}c^{\top}X(\omega) exists by the Kuratowski-Ryll-Nardzewski measurable selection theorem; we shall use the notation \nu_{X}\in S for any such selection.

ϱ[maxcScX]\displaystyle\varrho[\max_{c\in S}c^{\top}X] =supξ𝒜ϱΩξ(ω)maxcScX(ω)P(dω)=supξ𝒜ϱΩξ(ω)νXX(ω)P(dω)\displaystyle=\sup_{\xi\in\mathcal{A}_{\varrho}}\int_{\Omega}\xi(\omega)\max_{c\in S}c^{\top}X(\omega)P(d\omega)=\sup_{\xi\in\mathcal{A}_{\varrho}}\int_{\Omega}\xi(\omega)\nu_{X}^{\top}X(\omega)P(d\omega)
=supζ𝒜~Ωζ(ω)X(ω)P(dω)with 𝒜~={ξνX:ξ𝒜ϱ}\displaystyle=\sup_{\zeta\in\tilde{\mathcal{A}}^{\prime}}\int_{\Omega}\zeta(\omega)X(\omega)P(d\omega)\quad\text{with }\tilde{\mathcal{A}}^{\prime}=\{\xi\nu_{X}:\xi\in\mathcal{A}_{\varrho}\}

Notice that the representations just derived have the form of the dual representation in (6); however, we have not established that \tilde{\mathcal{A}} (respectively, \tilde{\mathcal{A}}^{\prime}) coincides with the domain of the conjugate of the corresponding risk measure.

We observe the following properties of the aggregation by a single linear scalarization.

Proposition 1.

Given a coherent risk measure \varrho:\mathcal{Z}\to\bar{\mathbb{R}} and a scalarization vector c\in\mathcal{S}^{+}_{m}, for any random vector X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}), the risk of the vector measured by \varrho[c^{\top}X] does not exceed the maximal risk of its components measured by \varrho[\cdot]. Furthermore, the following relation between the aggregation methods holds: \varrho[c^{\top}X]\leq c^{\top}\varrho[X], where \varrho[X] denotes the vector (\varrho[X_{1}],\dots,\varrho[X_{m}]).

Proof.

The dual representation implies the following:

ϱ[cX]\displaystyle\varrho[c^{\top}X] =supξ𝒜Ωi=1mciξ(ω)Xi(ω)P(dω)=supξ𝒜i=1mciΩξ(ω)Xi(ω)P(dω)\displaystyle=\sup_{\xi\in\mathcal{A}}\int_{\Omega}\sum_{i=1}^{m}c_{i}\xi(\omega)X_{i}(\omega)P(d\omega)=\sup_{\xi\in\mathcal{A}}\sum_{i=1}^{m}c_{i}\int_{\Omega}\xi(\omega)X_{i}(\omega)P(d\omega)
i=1msupξ𝒜ciΩξ(ω)Xi(ω)P(dω)=i=1mciϱ[Xi]max1imϱ[Xi].\displaystyle\leq\sum_{i=1}^{m}\sup_{\xi\in\mathcal{A}}c_{i}\int_{\Omega}\xi(\omega)X_{i}(\omega)P(d\omega)=\sum_{i=1}^{m}c_{i}\varrho[X_{i}]\leq\max_{1\leq i\leq m}\varrho[X_{i}].

The penultimate relation implies the second claim of the proposition. ∎
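The chain of inequalities in Proposition 1 can be checked numerically; a minimal sketch, again taking ϱ to be the empirical Average Value-at-Risk (a choice made for illustration only):

```python
import random

def avar(sample, alpha):
    """Empirical AVaR: mean of the worst alpha-fraction (losses);
    exact when alpha * len(sample) is an integer."""
    s = sorted(sample, reverse=True)
    k = max(1, round(alpha * len(s)))
    return sum(s[:k]) / k

random.seed(1)
alpha, c = 0.1, (0.3, 0.7)
X = [(random.gauss(0.0, 1.0), random.gauss(0.5, 2.0)) for _ in range(2000)]

scalarized = avar([c[0] * x1 + c[1] * x2 for x1, x2 in X], alpha)  # rho[c^T X]
component = [avar([x[i] for x in X], alpha) for i in range(2)]     # rho[X_i]
weighted = c[0] * component[0] + c[1] * component[1]               # c^T rho[X]
# Proposition 1: scalarized <= weighted <= max(component)
```

The first inequality is the subadditivity and positive homogeneity of the coherent measure; the second holds because c lies in the simplex.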

We also show the following useful result, which implies that we can use statistical methods to estimate the systemic risk measure ϱS[X].\varrho_{S}[X].

Proposition 2.

If ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} is a law-invariant risk measure, then for any set SSm+S\subset S_{m}^{+}, the systemic risk measure ϱS[X]=ϱ[XS]\varrho_{S}[X]=\varrho[X_{S}] is law-invariant.

Proof.

It is sufficient to show that for two random vectors X and Y which have the same distribution, the respective random variables X_{S} and Y_{S} have the same distribution. Since X and Y have the same distribution, for any real number r, the following relations hold:

P(X_{S}\leq r)=P(c^{\top}X\leq r\ \text{for all }c\in S)=P(c^{\top}Y\leq r\ \text{for all }c\in S)=P(Y_{S}\leq r),

which shows the equality of the distribution functions. ∎

3.3 Systemic Risk Measures Obtained via Nonlinear Scalarization

The second aggregation method that falls within the scope of our axiomatic framework is that of nonlinear scalarization. This class of risk measures cannot be obtained within the framework of aggregation by non-linear functions, and it does not fit the axiomatic approaches in [7] or in [5]. Furthermore, we shall see that this method of evaluating systemic risk makes it possible to maintain fairness between the system's participants.

We define Ωm={1,,m}\Omega_{m}=\{1,\dots,m\} and consider a probability space (Ωm,c,c)(\Omega_{m},\mathcal{F}_{c},c), where cSm+c\in S^{+}_{m} and c\mathcal{F}_{c} contains all subsets of Ωm\Omega_{m}. We view cc as a probability mass function of the space Ωm\Omega_{m}. Given an mm-dimensional random vector X𝒵=p(Ω,,P;m)X\in\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}) and a collection of mm univariate measures of risk ϱi:(Ω,,P)\varrho_{i}:(\Omega,\mathcal{F},P)\to\mathbb{R}, i=1,,m,i=1,\dots,m, we define the random variable XRX_{R} on the space Ωm\Omega_{m} as follows:

X_{R}(i)=\varrho_{i}[X_{i}],\quad i=1,\dots,m. (15)

Choosing a scalar measure of risk \varrho_{0}:\mathcal{L}_{p}(\Omega_{m},\mathcal{F}_{c},c)\to\mathbb{R}, we define the measure of systemic risk \varrho_{s}:\mathcal{Z}\to\mathbb{R} as follows:

ϱs[X]=ϱ0[XR].\varrho_{s}[X]=\varrho_{0}[X_{R}]. (16)

This is a nonlinear aggregation of the individual risks ϱ[Xi]\varrho[X_{i}], hence this approach falls within the category of methods that evaluate the risk of each component first and then aggregate their values. The measure ϱs[X]\varrho_{s}[X] satisfies the axioms postulated for systemic risk measures.
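A minimal sketch of the construction (15)-(16), with all ϱ_i taken to be the empirical Average Value-at-Risk and ϱ_0 an AVaR on the weighted discrete space (Ω_m, F_c, c); the levels, weights, and function names are illustrative assumptions:

```python
import random

def avar_discrete(values, probs, alpha):
    """AVaR via the Rockafellar-Uryasev formula; the minimum of the
    convex piecewise-linear function of eta is attained at a data point."""
    return min(eta + sum(p * max(v - eta, 0.0) for v, p in zip(values, probs)) / alpha
               for eta in values)

def avar_sample(sample, alpha):
    """Empirical AVaR of an equally weighted sample."""
    n = len(sample)
    return avar_discrete(sample, [1.0 / n] * n, alpha)

random.seed(2)
m, c = 3, (0.2, 0.3, 0.5)
X = [(random.gauss(0, 1), random.gauss(1, 1), random.gauss(2, 3))
     for _ in range(500)]

# (15): the discrete random variable X_R on Omega_m of individual risks
X_R = [avar_sample([x[i] for x in X], 0.1) for i in range(m)]
# (16): systemic risk as rho_0 applied to X_R with probability mass c
rho_s = avar_discrete(X_R, c, 0.25)
```

Since ϱ_0 is an AVaR, the resulting systemic risk lies between the weighted average of the individual risks and their maximum.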

Theorem 4.

Suppose the univariate measures of risk ϱi:(Ω,,P)\varrho_{i}:(\Omega,\mathcal{F},P)\to\mathbb{R}, i=1,,mi=1,\dots,m are coherent and let ϱs[]\varrho_{s}[\cdot] be defined as in (16). Then ϱs[]\varrho_{s}[\cdot] satisfies properties (A1)–(A4).

Proof.

(i) Given any X,Y𝒵X,Y\in\mathcal{Z} and α(0,1)\alpha\in(0,1), we consider the random vector Z=αX+(1α)YZ=\alpha X+(1-\alpha)Y. We have ϱi[Zi]αϱi[Xi]+(1α)ϱi[Yi]\varrho_{i}[Z_{i}]\leq\alpha\varrho_{i}[X_{i}]+(1-\alpha)\varrho_{i}[Y_{i}], i=1,,mi=1,\dots,m. Defining a random variable ZZ^{\prime} on Ωm\Omega_{m} by setting Z(i)=αϱi[Xi]+(1α)ϱi[Yi]Z^{\prime}(i)=\alpha\varrho_{i}[X_{i}]+(1-\alpha)\varrho_{i}[Y_{i}], we obtain that ZRZZ_{R}\leq Z^{\prime}. Using the monotonicity and convexity of ϱ0,\varrho_{0}, we obtain

ϱ0[ZR]ϱ0[Z]αϱ0[XR]+(1α)ϱ0[YR].\varrho_{0}[Z_{R}]\leq\varrho_{0}[Z^{\prime}]\leq\alpha\varrho_{0}[X_{R}]+(1-\alpha)\varrho_{0}[Y_{R}].

Hence ϱs[αX+(1α)Y]αϱs[X]+(1α)ϱs[Y]\varrho_{s}[\alpha X+(1-\alpha)Y]\leq\alpha\varrho_{s}[X]+(1-\alpha)\varrho_{s}[Y].

(ii) Suppose the vectors X,Y𝒵X,Y\in\mathcal{Z} satisfy XYX\leq Y a.s. This implies that XiYiX_{i}\leq Y_{i} a.s. and, hence, ϱi[Xi]ϱi[Yi]\varrho_{i}[X_{i}]\leq\varrho_{i}[Y_{i}] for all i=1,,mi=1,\dots,m by the monotonicity property of ϱi\varrho_{i}. This further implies that XRYRX_{R}\leq Y_{R}, entailing that ϱ0[XR]ϱ0[YR]\varrho_{0}[X_{R}]\leq\varrho_{0}[Y_{R}]. Thus (A2) is satisfied.

(iii) Given a random vector X𝒵X\in\mathcal{Z}, t>0t>0, we have

\varrho_{s}[tX]=\varrho_{0}[(tX)_{R}]=\varrho_{0}[t(X_{R})]=t\varrho_{0}[X_{R}]=t\varrho_{s}[X],

where we have used the positive homogeneity property of ϱi[]\varrho_{i}[\cdot] for all i=0,1,,mi=0,1,\dots,m.

(iv) Given a random vector X𝒵X\in\mathcal{Z} and a constant aa, we have (X+a𝕀)R(i)=ϱi[Xi+a]=ϱi[Xi]+a(X+a\mathbb{I})_{R}(i)=\varrho_{i}[X_{i}+a]=\varrho_{i}[X_{i}]+a. Hence ϱ0[(X+a𝕀)R]=ϱ0[XR]+a.\varrho_{0}[(X+a\mathbb{I})_{R}]=\varrho_{0}[X_{R}]+a. This shows property (A4). ∎

Examples

A. Systemic Mean-AVaR measure

Consider the case when \varrho_{0} is a convex combination of the expected value and the Average Value-at-Risk at some level \alpha, and suppose all components of X are evaluated by the same measure of risk \varrho[\cdot]. Then, for any \kappa\in[0,1] and c\in S^{+}_{m}, we have:

ϱs[X]\displaystyle\varrho_{s}[X] =ϱ0[XR]=(1κ)𝔼[XR]+κAVaRα[XR]\displaystyle=\varrho_{0}[X_{R}]=(1-\kappa)\mathbb{E}[X_{R}]+\kappa\operatorname{AVaR}_{\alpha}[X_{R}]
=(1κ)i=1mciϱ[Xi]+κinfη{η+1αi=1mci(ϱ[Xi]η)+}\displaystyle=(1-\kappa)\sum_{i=1}^{m}c_{i}\varrho[X_{i}]+\kappa\inf_{\eta\in\mathbb{R}}\Big{\{}\eta+\frac{1}{\alpha}\sum_{i=1}^{m}c_{i}(\varrho[X_{i}]-\eta)_{+}\Big{\}}

Here the infimum with respect to \eta\in\mathbb{R} is taken over a function of the individual risks of the components \varrho[X_{i}], i=1,\dots,m. Hence, this method of aggregation imposes additional penalties on those components whose risk exceeds the optimal threshold \eta.

B. Systemic Mean-Semideviation measure

Now let \varrho_{0} be a Mean-Upper-Semideviation risk measure of the first order with parameter \kappa\in[0,1], and suppose all components of X are evaluated by the same measure of risk \varrho[\cdot]. Then the measure of systemic risk can be written as:

\varrho_{s}[X]=\varrho_{0}[X_{R}]=\sum_{i=1}^{m}c_{i}\varrho[X_{i}]+\kappa\sum_{i=1}^{m}c_{i}\Big{(}\varrho[X_{i}]-\sum_{j=1}^{m}c_{j}\varrho[X_{j}]\Big{)}_{+}

This representation shows that the risk measure is an aggregation of the individual risks of the components: it compares the risk of every component with the weighted average risk of all components and penalizes the upward deviation of the individual risk from that average.

The presented method of non-linear aggregation maintains fairness within the system and encourages the components to operate at similar levels of risk.
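The examples above admit direct transcriptions. A sketch of Example B, assuming the individual risk values ϱ[X_i] have already been computed (the numbers below are hypothetical):

```python
def systemic_mean_semidev(ind_risks, c, kappa):
    """Example B: weighted average of the individual risks plus a penalty
    on their upper deviations from that average, with kappa in [0, 1]."""
    avg = sum(ci * ri for ci, ri in zip(c, ind_risks))
    penalty = sum(ci * max(ri - avg, 0.0) for ci, ri in zip(c, ind_risks))
    return avg + kappa * penalty

# hypothetical individual risk values rho[X_i] and weights c
r = [1.0, 2.5, 4.0]
c = (0.5, 0.3, 0.2)
value = systemic_mean_semidev(r, c, kappa=0.5)  # 2.05 + 0.5 * 0.525 = 2.3125
```

Only the component whose risk exceeds the weighted average (here the third, and slightly the second) contributes to the penalty, which is the fairness mechanism described above.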

4 Relations to multivariate measures of systemic risk

In this section, we compare the proposed risk measures with the multivariate notions mentioned in section 2.2.

Consider first the Multivariate Value-at-Risk (\operatorname{MVaR}), which is given as the set of p-efficient points of the respective probability distribution. The following facts are shown in [9]. For every p\in(0,1), the level set \mathcal{Z}_{p} of the distribution function of a random vector X is nonempty and closed. For a given scalarization vector c\geq 0, the p-efficient points can be generated by solving the following optimization problem:

min\displaystyle\min cz\displaystyle c^{\top}z (17)
s.t. P(Xz)p.\displaystyle P(X\leq z)\geq p.

For every c0c\geq 0 the solution set of the optimization problem (17) is nonempty and contains a pp-efficient point. Hence, given a random vector X𝒵X\in\mathcal{Z} and a scalarization vector cSm+c\in S_{m}^{+}, MVaR\operatorname{MVaR} at level p(0,1)p\in(0,1) can be calculated as:

\operatorname{MVaR}_{p}(X)=\inf\big\{c^{\top}z~|~P(X\leq z)\geq p\big\}=\inf\big\{c^{\top}z~:~z\in\mathcal{Z}_{p}\big\}.

Therefore, using linear scalarizations, one can find the pp-efficient point corresponding to any given vector cSm+c\in S_{m}^{+}.
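For a discrete (empirical) distribution, problem (17) can be solved by brute force when m is small, since a minimal threshold vector can be sought among the observed coordinate values; the sketch below makes that assumption and is exponential in m, so it is meant only for illustration.

```python
import itertools
import random

def mvar_bruteforce(sample, c, p):
    """Solve min c^T z s.t. P(X <= z) >= p over the grid of observed
    coordinate values; valid for discrete distributions and small m."""
    n, m = len(sample), len(sample[0])
    coords = [sorted({x[i] for x in sample}) for i in range(m)]
    best_val, best_z = float("inf"), None
    for z in itertools.product(*coords):
        covered = sum(all(x[i] <= z[i] for i in range(m)) for x in sample)
        if covered >= p * n:  # empirical P(X <= z) >= p
            val = sum(ci * zi for ci, zi in zip(c, z))
            if val < best_val:
                best_val, best_z = val, z
    return best_val, best_z

random.seed(3)
sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(60)]
best_val, best_z = mvar_bruteforce(sample, (0.5, 0.5), 0.9)
```

Varying c over the simplex traces out different p-efficient points of the empirical distribution.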

Consider now the Multivariate Average Value-at-Risk (\operatorname{MAVaR}) defined in (4). When small outcomes are preferred, the unfavorable set of realizations of a random vector X is given by the p-level set of F(X;\cdot). Hence \operatorname{MAVaR}_{p}(X)=\mathbb{E}[\psi(X)~|~X\in\mathcal{Z}_{p}]. If X(\omega)\in\mathcal{Z}_{p}, then there exists a p-efficient point v\in\mathbb{R}^{m} such that X(\omega)\geq v. If the scalarization function \psi is monotonically nondecreasing, then P(\psi(X)\leq\psi(v))\geq p. Denote the p-quantile of \psi(X) by \eta_{X}(p). Then we observe that \eta_{X}(p)\leq\min_{v}\psi(v), where the minimum is taken over the p-efficient points v. Therefore:

𝔼[ψ(X)|X𝒵p]\displaystyle\mathbb{E}[\psi(X)~{}|~{}X\in\mathcal{Z}_{p}] =𝔼[ψ(X)|Xv]𝔼[ψ(X)|ψ(X)ηX(p)]\displaystyle=\mathbb{E}[\psi(X)~{}|~{}X\geq v]\geq\mathbb{E}[\psi(X)~{}|~{}\psi(X)\geq\eta_{X}(p)]
=infη{η+1p𝔼[(ψ(X)η)+]}=AVaRp(ψ(X))\displaystyle=\inf_{\eta}\bigg{\{}\eta+\frac{1}{p}\mathbb{E}[(\psi(X)-\eta)_{+}]\bigg{\}}=\operatorname{AVaR}_{p}(\psi(X))

for all p\in(0,1) at which the cumulative distribution function of \psi(X) is continuous. It follows that the Average Value-at-Risk of X scalarized by a monotonically nondecreasing function \psi does not exceed \operatorname{MAVaR} defined in (4). In particular, taking \varrho=\operatorname{AVaR}_{p}, we obtain \operatorname{MAVaR}_{p}(X)\geq\varrho_{S}[X] for any S\subset S_{m}^{+}.

We now turn to the Vector-valued Multivariate Average Value-at-Risk. It is calculated as one of the Pareto-efficient solutions of the following multiobjective optimization problem:

minηm\displaystyle\min_{\eta\in\mathbb{R}^{m}} η+1p𝔼[(Xη)+]\displaystyle\eta+\frac{1}{p}\mathbb{E}[(X-\eta)_{+}] (18)
s.t. P(Xη)1p.\displaystyle P(X\leq\eta)\geq 1-p.

It is well known that a feasible solution of a convex multiobjective optimization problem is Pareto-efficient if and only if it is an optimal solution of a scalarized problem whose objective function is a convex combination of the multiple objectives. Then \operatorname{VMAVaR}, being a Pareto-efficient solution of the multiobjective optimization problem (18), is also optimal for the following problem:

minvm\displaystyle\min_{v\in\mathbb{R}^{m}} cv+1p𝔼[c(Xv)+]\displaystyle c^{\top}v+\frac{1}{p}\mathbb{E}[c^{\top}(X-v)_{+}] (19)
s.t. P(Xv)1p,\displaystyle P(X\leq v)\geq 1-p,

where cmc\in\mathbb{R}^{m} is a scalarization vector taken from the simplex Sm+S_{m}^{+}.

Now for Xp(Ω,,P;m)X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}), we consider:

c^{\top}(X-v)_{+}=\sum_{i=1}^{m}\max\{0,c_{i}(X_{i}-v_{i})\}\geq\max\Big\{0,\sum_{i=1}^{m}c_{i}(X_{i}-v_{i})\Big\}=\max\{0,c^{\top}(X-v)\}

because the maximum function is subadditive. It follows that:

infvm{cv+1p𝔼[c(Xv)+]}infvm{cv+1p𝔼[(cXcv)+]}.\inf_{v\in\mathbb{R}^{m}}\Big{\{}c^{\top}v+\frac{1}{p}\mathbb{E}[c^{\top}(X-v)_{+}]\Big{\}}\geq\inf_{v\in\mathbb{R}^{m}}\Big{\{}c^{\top}v+\frac{1}{p}\mathbb{E}[(c^{\top}X-c^{\top}v)_{+}]\Big{\}}.

In the scalar-valued case (m=1m=1) the minimizer of the optimization problem defining AVaRp(Z)\operatorname{AVaR}_{p}(Z) is the VaRp(Z)\operatorname{VaR}_{p}(Z) for a random variable ZZ. In the multivariate case (m>1m>1), we established that the solution of (17) is the pp-efficient point, or VaRp(X)\operatorname{VaR}_{p}(X), corresponding to a given scalarization vector cSm+c\in S_{m}^{+}. Denoting this pp-efficient point as v(c)v(c), it follows that:

cv(c)+1p𝔼[c(Xv(c))+]cv(c)+1p𝔼[(cXcv(c))+]c^{\top}v(c)+\frac{1}{p}\mathbb{E}[c^{\top}(X-v(c))_{+}]\geq c^{\top}v(c)+\frac{1}{p}\mathbb{E}[(c^{\top}X-c^{\top}v(c))_{+}]

The relation P(X\leq v(c))\geq 1-p implies that P(c^{\top}X\leq c^{\top}v(c))\geq 1-p. Denoting the p-quantile of c^{\top}X as \eta_{X}(p;c), it follows that \eta_{X}(p;c)\leq c^{\top}v(c). Therefore:

\inf_{v\in\mathbb{R}^{m}}\Big\{c^{\top}v+\frac{1}{p}\mathbb{E}[c^{\top}(X-v)_{+}]:P(X\leq v)\geq 1-p\Big\}\geq\inf_{v\in\mathbb{R}^{m}}\Big\{c^{\top}v+\frac{1}{p}\mathbb{E}[(c^{\top}X-c^{\top}v)_{+}]\Big\}=\operatorname{AVaR}_{p}(c^{\top}X).

It follows that the Average Value-at-Risk of the scalarized random vector, which is one of the systemic risk measures arising from the constructions in section 3, is smaller than the corresponding scalarization of \operatorname{VMAVaR}.

We do not pursue further investigation on set-valued systemic measures of risk as their calculation is numerically very expensive.

5 Two-stage stochastic programming problem with systemic risk measures

Our goal is to address a situation in which the agents cooperate on completing a common task and the risk is associated (among other factors) with the successful completion of the task. Such situations are typical in robotics, as well as in energy systems, where the units cover the energy demand in a certain area.

5.1 Two-stage monotropic optimization problem with a systemic risk measure

In this section, we consider how the proposed approaches to evaluate systemic risk can be applied to a two-stage stochastic optimization problem with a monotropic structure. Specifically, we focus on a problem formulated as follows:

minx𝒳\displaystyle\min_{x\in\mathcal{X}}~{} f(x)+ϱ[Q(x;ξ)]\displaystyle~{}f(x)+\varrho[Q(x;\xi)] (20)

where Q(x;ξ)Q(x;\xi) has realizations Qs(x;ξs)Q^{s}(x;\xi^{s}) defined as the optimal value of the second-stage problem in scenario sSs\in S:

Qs(x;ξs)=min𝐲,z\displaystyle Q^{s}(x;\xi^{s})=\min_{\mathbf{y},z}~{} i=1mcigis(yi,z)\displaystyle~{}\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z) (21)
s.t. Tisx+Wisyi=his,i=1,,m\displaystyle~{}T_{i}^{s}x+W_{i}^{s}y_{i}=h^{s}_{i},~{}~{}~{}i=1,\dots,m (22)
i=1mAisyi=bs\displaystyle~{}\sum_{i=1}^{m}A^{s}_{i}y_{i}=b^{s} (23)
yi𝒴isi=1,,m\displaystyle~{}y_{i}\in\mathcal{Y}^{s}_{i}~{}~{}~{}i=1,\dots,m (24)
Bsz𝒟s\displaystyle~{}B^{s}z\in\mathcal{D}^{s} (25)

Here f:nf:\mathbb{R}^{n}\to\mathbb{R} is a continuous function that represents the cost of the first-stage decision xnx\in\mathbb{R}^{n} and 𝒳n\mathcal{X}\subset\mathbb{R}^{n} is a closed convex set. The random vector ξ\xi comprises the random data of the second-stage problem. In the second-stage problem, we would like to minimize the sum of mm cost functions gi:l×pg_{i}:\mathbb{R}^{l}\times\mathbb{R}^{p}\to\mathbb{R} for i=1,,mi=1,\dots,m that depend on two second-stage decision variables: local decision variables yily_{i}\in\mathbb{R}^{l} for i=1,,mi=1,\dots,m and the common decision variable zpz\in\mathbb{R}^{p}. The decision variables yily_{i}\in\mathbb{R}^{l} are local for every i=1,,mi=1,\dots,m, and the local constraints are represented as a closed convex set 𝒴isl\mathcal{Y}^{s}_{i}\subset\mathbb{R}^{l}. The decision variable zpz\in\mathbb{R}^{p} is common for all ii and needs global information to be calculated. The matrix BsB^{s} is of size d×pd\times p and the set 𝒟sd\mathcal{D}^{s}\subset\mathbb{R}^{d} is a closed convex set. Note that the constraints (22) linking the first-stage decision variable xx and the local second-stage decision variables yiy_{i} are defined for every ii separately, where matrices Tisk×nT^{s}_{i}\in\mathbb{R}^{k\times n}, Wisk×lW^{s}_{i}\in\mathbb{R}^{k\times l} and hiskh^{s}_{i}\in\mathbb{R}^{k} depend on the scenario ss. The constraint (23) is a coupling constraint that links the local decision variables yiy_{i}, where Aisd×lA^{s}_{i}\in\mathbb{R}^{d\times l} and bsdb^{s}\in\mathbb{R}^{d} depend on the scenario sSs\in S.

We define the total cost as the aggregation of the individual cost functions gig_{i} using some scalarization vector c+mc\in\mathbb{R}^{m}_{+} such that i=1mci=1\sum_{i=1}^{m}c_{i}=1 and we would like to develop a numerical method to solve the two-stage problem in a distributed way. Specifically, we use decomposition ideas based on the risk-averse multicut method proposed in [15] and the multi-cut methods in risk-neutral stochastic programming to solve the two-stage problem, but we also decompose the second-stage problem into mm subproblems that can be solved independently in order to allow for a distributed operation of mm units (agents).

First, we discuss how to apply the decomposition method to solve the two-stage problem. We use the multicut method to construct a piecewise linear approximation of the optimal value of the second-stage problem, and we approximate the measure of risk by subgradient inequalities based on the dual representation of coherent risk measures \varrho[Q]=\sup_{\mu\in\mathcal{A}_{\varrho}}\langle\mu,Q\rangle. To this end, we introduce an auxiliary variable \eta\in\mathbb{R}, which will contain the lower approximation of the measure of risk. Further, we denote by Q the random variable with realizations q^{s}, which represent the lower approximations of the functions Q^{s}(\cdot,\xi^{s}). Then the master problem in our method takes on the following form:

minx,η,q\displaystyle\min_{x,\eta,q}{} f(x)+η\displaystyle~{}f(x)+\eta (26)
s.t. ημτ,Q,τ=1,,t1\displaystyle~{}\eta\geq\langle\mu^{\tau},Q\rangle,~{}~{}~{}\tau=1,\dots,t-1
qsq^s,τ+gs,τ,xxτ,τ=1,,t1,s=1,,S\displaystyle~{}q^{s}\geq\hat{q}^{s,\tau}+\langle g^{s,\tau},x-x^{\tau}\rangle,~{}~{}~{}\tau=1,\dots,t-1,~{}s=1,\dots,S
x𝒳.\displaystyle~{}x\in\mathcal{X}.

The optimal value \hat{\eta}^{t} contains the value of the approximation of \varrho[Q(\hat{x}^{t};\xi)], where \hat{x}^{t} is the solution of the master problem at iteration t. Notice that the approximation \varrho^{t}[Q] of \varrho[Q(\hat{x}^{t};\xi)] is given by

η^t=ϱt[Q]=max0τt1Q,μτ\hat{\eta}^{t}=\varrho^{t}[Q]=\max_{0\leq\tau\leq t-1}\langle Q,\mu^{\tau}\rangle

with \mu^{\tau} being the probability measures from \mathcal{A}_{\varrho} calculated as subgradients in the previous iterations. We shall explain how the subgradients \mu^{\tau} are obtained in due course. The value \hat{q}^{s,\tau} is the optimal value of the second-stage problem in scenario s at iteration \tau, and g^{s,\tau} is the subgradient calculated using the optimal dual variables of the constraints (22). One could solve the second-stage problem with an objective that is a scalarization of the m cost functions directly, but we would like to decompose the second-stage problem into m subproblems Q^{s}_{i} that can be solved independently in a distributed manner.
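A small sketch of the risk approximation used in the master problem: each stored subgradient μ^τ from 𝒜_ϱ yields a cut η ≥ ⟨μ^τ, Q⟩, and the model is their pointwise maximum. Here ϱ is assumed, for illustration, to be AVaR at level α, whose subgradients reweight the probabilities of the worst scenarios; the helper names are our own.

```python
def cut_model(Q, cuts):
    """Lower approximation of rho[Q]: the maximum over stored measures
    mu^tau of <mu^tau, Q> (each mu is a probability vector on scenarios)."""
    return max(sum(m_s * q_s for m_s, q_s in zip(mu, Q)) for mu in cuts)

def avar_subgradient(Q, probs, alpha):
    """A subgradient of AVaR_alpha at Q: put weight p_s / alpha on the
    worst scenarios until total mass 1 is reached."""
    order = sorted(range(len(Q)), key=lambda s: -Q[s])
    mu, mass = [0.0] * len(Q), 0.0
    for s in order:
        take = min(probs[s] / alpha, 1.0 - mass)
        mu[s], mass = take, mass + take
        if mass >= 1.0:
            break
    return mu

probs = [0.25, 0.25, 0.25, 0.25]
Q1 = [1.0, 4.0, 2.0, 3.0]
cuts = [avar_subgradient(Q1, probs, alpha=0.5)]
# the model is exact at the point where the cut was generated:
# cut_model(Q1, cuts) equals AVaR_0.5[Q1] = (4 + 3) / 2 = 3.5
```

At any other realization vector the model remains a valid lower bound, since every stored μ^τ belongs to 𝒜_ϱ.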

Consider the second-stage problem Q^{s}(x;\xi^{s}) for a fixed first-stage decision variable x\in\mathbb{R}^{n}. To decompose the global problem into m local subproblems, we need to address two issues: (i) distribute the common decision variable z\in\mathbb{R}^{p} to the individual subproblems; (ii) decompose the coupling constraints. The common decision variable z can be distributed to the subproblems by creating a copy z_{i} for every i=1,\dots,m; we then ensure the uniqueness of z by enforcing the copies z_{i} to be equal to each other. The second-stage problem can be rewritten as:

Qs(x;ξs)=min𝐲,𝐳\displaystyle Q^{s}(x;\xi^{s})=\min_{\mathbf{y},\mathbf{z}}~{} i=1mcigis(yi,zi)\displaystyle~{}\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z_{i}) (27)
s.t. Tisx+Wisyi=his,i=1,,m\displaystyle~{}T^{s}_{i}x+W^{s}_{i}y_{i}=h^{s}_{i},~{}~{}~{}i=1,\dots,m (28)
i=1mAisyi=bs\displaystyle~{}\sum_{i=1}^{m}A^{s}_{i}y_{i}=b^{s} (29)
zi=zji,j=1,,m,\displaystyle~{}z_{i}=z_{j}~{}~{}~{}i,j=1,\dots,m, (30)
yi𝒴is,Bszi𝒟si=1,,m\displaystyle~{}y_{i}\in\mathcal{Y}^{s}_{i},~{}B^{s}z_{i}\in\mathcal{D}^{s}~{}~{}~{}i=1,\dots,m (31)

In order to distribute the coupling constraints (29), (30), we apply Lagrangian relaxation using Lagrange multipliers \lambda^{s}\in\mathbb{R}^{d} and \mu^{s}\in\mathbb{R}^{m\times m}. Then the global augmented Lagrangian \Lambda^{s}_{\kappa_{0}} associated with the second-stage problem is defined as:

Λκ0s(𝐲,𝐳)\displaystyle\Lambda^{s}_{\kappa_{0}}(\mathbf{y},\mathbf{z}) =i=1mcigis(yi,zi)+λs,i=1mAisyibs+i=1mj=1mμijs(zizj)+κ02i=1mAisyibs2\displaystyle=\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z_{i})+\langle\lambda^{s},\sum_{i=1}^{m}A^{s}_{i}y_{i}-b^{s}\rangle+\sum_{i=1}^{m}\sum_{j=1}^{m}\mu^{s}_{ij}(z_{i}-z_{j})+\frac{\kappa_{0}}{2}\bigg{\|}\sum_{i=1}^{m}A^{s}_{i}y_{i}-b^{s}\bigg{\|}^{2}
+κ02i=1mj=1jim(zizj)2\displaystyle+\frac{\kappa_{0}}{2}\sum_{i=1}^{m}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}(z_{i}-z_{j})^{2}
=i=1mcigis(yi,zi)+i=1mλs,Aisyi+i=1mj=1m(μijsμjis)ziλs,bs+κ02i=1mAisyibs2\displaystyle=\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z_{i})+\sum_{i=1}^{m}\langle\lambda^{s},A^{s}_{i}y_{i}\rangle+\sum_{i=1}^{m}\sum_{j=1}^{m}(\mu^{s}_{ij}-\mu^{s}_{ji})z_{i}-\langle\lambda^{s},b^{s}\rangle+\frac{\kappa_{0}}{2}\bigg{\|}\sum_{i=1}^{m}A^{s}_{i}y_{i}-b^{s}\bigg{\|}^{2}
+κ02i=1mj=1jim(zizj)2\displaystyle+\frac{\kappa_{0}}{2}\sum_{i=1}^{m}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}(z_{i}-z_{j})^{2}

where κ0>0\kappa_{0}>0 is a penalty coefficient. This problem can be decomposed into subproblems with a local augmented Lagrangian Λκ0s,i\Lambda_{\kappa_{0}}^{s,i} defined as:

Λκ0s,i(yi,y~,zi,z~,λ,μ)\displaystyle\Lambda^{s,i}_{\kappa_{0}}(y_{i},\tilde{y},z_{i},\tilde{z},\lambda,\mu) =cigis(yi,zi)+λs,Aisyi+j=1m(μijsμjis)zi\displaystyle=c_{i}g^{s}_{i}(y_{i},z_{i})+\langle\lambda^{s},A^{s}_{i}y_{i}\rangle+\sum_{j=1}^{m}(\mu^{s}_{ij}-\mu^{s}_{ji})z_{i}
+κ02Aisyi+j=1jimAjsy~jbs2+κ02(2j=1jim(ziz~j)2+kij=1jk,im(z~kz~j)2)\displaystyle+\frac{\kappa_{0}}{2}\bigg{\|}A^{s}_{i}y_{i}+\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}A^{s}_{j}\tilde{y}_{j}-b^{s}\bigg{\|}^{2}+\frac{\kappa_{0}}{2}\bigg{(}2\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}(z_{i}-\tilde{z}_{j})^{2}+\sum_{k\neq i}\sum_{\begin{subarray}{c}j=1\\ j\neq k,i\end{subarray}}^{m}(\tilde{z}_{k}-\tilde{z}_{j})^{2}\bigg{)}

where (y_{i},z_{i}) are the decision variables of subproblem i, and \tilde{y} and \tilde{z} contain the current decisions of the other subproblems j=1,\dots,m,~j\neq i. Note that the first penalty term can be expanded as:

Aisyi+j=1jimAjsy~jbs2=k=1d([Aisyi]k+j=1jim[Ajsy~j]kbks)2\displaystyle\|A^{s}_{i}y_{i}+\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}A_{j}^{s}\tilde{y}_{j}-b^{s}\|^{2}=\sum_{k=1}^{d}\bigg{(}[A^{s}_{i}y_{i}]_{k}+\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}[A_{j}^{s}\tilde{y}_{j}]_{k}-b_{k}^{s}\bigg{)}^{2}

This implies that, for every k such that [A^{s}_{i}y_{i}]_{k}=0, the remaining terms are constant with respect to the decision variables of subproblem i and can be omitted from its optimization problem. Hence, subproblem i needs access only to the decisions of those subproblems j=1,\dots,m,~j\neq i that are coupled with i through some constraint k. Similarly, consider the second penalty term:

\[
2\sum_{\substack{j=1\\ j\neq i}}^{m}(z_{i}-\tilde{z}_{j})^{2}+\sum_{k\neq i}\sum_{\substack{j=1\\ j\neq k,i}}^{m}(\tilde{z}_{k}-\tilde{z}_{j})^{2}.
\]

The terms in the last double summation do not include $z_{i}$; they are constants and can be excluded from the optimization problem. Hence, we can define the subproblem for every $i$ as follows:

\[
\begin{aligned}
Q^{s}_{i}(x,\tilde{y},\tilde{z},\lambda,\mu,\xi^{s})=\min_{y_{i},z_{i}}~&c_{i}g^{s}_{i}(y_{i},z_{i})+\langle\lambda,A^{s}_{i}y_{i}\rangle+\sum_{j=1}^{m}(\mu^{s}_{ij}-\mu^{s}_{ji})z_{i}\\
&+\frac{\kappa_{0}}{2}\bigg\|A^{s}_{i}y_{i}+\sum_{\substack{j=1\\ j\neq i}}^{m}A_{j}^{s}\tilde{y}_{j}-b^{s}\bigg\|^{2}+\kappa_{0}\sum_{\substack{j=1\\ j\neq i}}^{m}(z_{i}-\tilde{z}_{j})^{2} &&(32)\\
\text{s.t.}~~&T^{s}_{i}x+W^{s}_{i}y_{i}=h^{s}_{i} &&(33)\\
&y_{i}\in\mathcal{Y}^{s}_{i},\quad B^{s}z_{i}\in\mathcal{D}^{s}_{i} &&(34)
\end{aligned}
\]

For a fixed $x$, in every scenario $s\in S$, we can implement the Accelerated Distributed Augmented Lagrangian (ADAL) method; we refer to [6] for a detailed analysis of the method. The method consists of three main steps: (i) we solve every subproblem $Q^{s}_{i}$ to find the optimal primal variables $(\hat{y}_{i}^{s},\hat{z}_{i}^{s})$; (ii) we update the primal variables and check whether the coupling constraints (29), (30) are satisfied; (iii) if the constraints are not satisfied, we update the dual variables and return to step (i). The ADAL method converges to the optimal solution $(\hat{y}^{s},\hat{z}^{s},\hat{\lambda}^{s},\hat{\mu}^{s})$ in finitely many steps, and we can calculate the optimal value $\hat{Q}^{s}_{i}$ of the objective function for every subproblem $i$. Then the global objective value of the second-stage problem can be calculated as $\hat{Q}^{s}(x;\xi^{s})=\sum_{i=1}^{m}\hat{Q}^{s}_{i}(x;\xi^{s})-\langle\hat{\lambda}^{s},b^{s}\rangle$.
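As a minimal illustration of steps (i)-(iii), and not the implementation used in our experiments, the following sketch runs an ADAL-type loop on a deliberately tiny toy instance: two agents with quadratic costs $f_i(y_i)=(y_i-t_i)^2$ coupled by the single constraint $y_1+y_2=b$. The data, the closed-form local minimizers standing in for the subproblems (32)-(34), and the parameter values are all hypothetical.

```python
import numpy as np

# Toy instance (hypothetical data): min (y1-t1)^2 + (y2-t2)^2  s.t.  y1 + y2 = b.
t, b = np.array([1.0, 3.0]), 2.0
rho, tau = 2.0, 0.5        # penalty and stepsize; tau <= 1/q with q = 2 coupled agents

y, lam = np.zeros(2), 0.0
for _ in range(2000):
    # Step (i): each agent minimizes its local augmented Lagrangian
    # (y - t_i)^2 + lam*y + (rho/2)*(y + y_other - b)^2, which has a closed form.
    y_hat = np.array([(2 * t[i] - lam - rho * (y[1 - i] - b)) / (2 + rho)
                      for i in range(2)])
    # Step (ii): convex-combination update of the primal variables.
    y = y + tau * (y_hat - y)
    # Step (iii): dual update on the coupling-constraint residual.
    lam = lam + rho * tau * (y.sum() - b)

print(y, lam)  # converges to the optimum y = (0, 2) with multiplier lam = 2
```

The closed-form inner minimization replaces the generic subproblem solver; everything else mirrors the three steps described above.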

Once the second-stage problem is solved for every scenario, we construct an objective cut for every scenario $s\in S$, defined as:

\[
Q^{s}(x;\xi^{s})\geq\hat{Q}^{s}(x^{k};\xi^{s})+\langle g^{s,k},x-x^{k}\rangle,
\]

where $g^{s,k}$ is a subgradient of $Q^{s}(x;\xi^{s})$ at $x=x^{k}$ for scenario $s\in S$. Now note that

\[
\partial Q^{s}(x;\xi^{s})=\partial\bigg[\sum_{i=1}^{m}Q^{s}_{i}(x;\xi^{s})-\langle\lambda,b^{s}\rangle\bigg]=\sum_{i=1}^{m}\partial Q^{s}_{i}(x;\xi^{s}).
\]

Hence, at $x=x^{k}$, the subdifferential for scenario $s\in S$ can be calculated as $\partial Q^{s}(x^{k};\xi^{s})=\sum_{i=1}^{m}\partial Q^{s}_{i}(x^{k};\xi^{s})$. A subgradient of $Q_{i}^{s}(\cdot;\xi^{s})$ is given by $-(T^{s}_{i})^{\top}\pi^{s}_{i}$, where $\pi^{s}_{i}$ is the Lagrange multiplier associated with constraint (28) in subproblem $i$. The proposed method for solving the two-stage problem is then formulated as follows:

  • Step 0. Set $t=1$ and define an initial $\mu^{0}\in\mathcal{A}_{\varrho}$.

  • Step 1. Solve the master problem (26) and denote its optimal solution by $(x^{t},\eta^{t},q^{t})$.

  • Step 2. For every scenario $s=1,\dots,S$ apply the following method.

    • (a) Set $l=1$ and define initial Lagrange multipliers $\lambda^{s,1}$, $\mu^{s,1}$ and initial primal variables $y^{s,1},z^{s,1}$.

    • (b) Given the Lagrange multipliers $\lambda^{s,l},\mu^{s,l}$ and the decision variables $y^{s,l},z^{s,l}$ of the neighboring nodes, every node $i$ calculates its optimal solution $(\hat{y}_{i}^{s,l},\hat{z}_{i}^{s,l})$ by solving its local problem:

      \[
      \begin{aligned}
      \min_{y_{i}^{s},z_{i}^{s}}~&\Lambda^{s,i}_{\kappa_{0}}(y_{i}^{s},y^{s,l},z_{i}^{s},z^{s,l},\lambda^{s,l},\mu^{s,l}) &&(35)\\
      \text{s.t.}~~&y_{i}^{s}\in\mathcal{Y}^{s}_{i},\quad B^{s}z_{i}^{s}\in\mathcal{D}^{s}_{i}
      \end{aligned}
      \]
    • (c) Every node $i$ updates its primal variables:

      \[
      y_{i}^{s,l+1}=y_{i}^{s,l}+\kappa_{s}(\hat{y}_{i}^{s,l}-y_{i}^{s,l}),\qquad
      z_{i}^{s,l+1}=z_{i}^{s,l}+\kappa_{s}(\hat{z}_{i}^{s,l}-z_{i}^{s,l}).
      \]
    • (d) If the constraints

      \[
      \sum_{i=1}^{m}A_{i}^{s}y_{i}^{s}=b^{s},\qquad z_{i}^{s}=z_{j}^{s},\quad i,j=1,\dots,m, \qquad (36)
      \]

      are satisfied, then calculate the following quantities and go to Step 3:

      \[
      g^{s,t}=\sum_{i=1}^{m}g_{i}^{s,l}=\sum_{i=1}^{m}(-T_{i}^{s})^{\top}\pi_{i}^{s,l},\qquad
      \hat{L}^{s,t}=\sum_{i=1}^{m}\hat{\Lambda}^{s,i,l}_{\kappa_{0}}-\langle\lambda^{s,l},b^{s}\rangle,
      \]

      where $\pi_{i}^{s,l}$ is the optimal Lagrange multiplier associated with constraint (33) in subproblem $i$, and $\hat{\Lambda}^{s,i,l}_{\kappa_{0}}$ is the optimal value of the objective function in (35).

      If any of the constraints (36) are not satisfied, update their Lagrange multipliers as follows:

      \[
      \lambda^{s,l+1}=\lambda^{s,l}+\kappa_{0}^{s}\kappa_{s}\bigg(\sum_{i=1}^{m}A_{i}^{s}y_{i}^{s,l}-b^{s}\bigg),\qquad
      \mu_{ij}^{s,l+1}=\mu_{ij}^{s,l}+\kappa_{0}^{s}\kappa_{s}(z_{i}^{s,l}-z_{j}^{s,l}).
      \]

      Increase $l$ by one and return to Step (b).

  • Step 3. Calculate $\varrho^{t}=\varrho[\hat{L}^{t}]$ and $\mu^{t}\in\partial\varrho[\hat{L}^{t}]$.

  • Step 4. If $\varrho^{t}=\eta^{t}$, stop; otherwise, increase $t$ by one and go to Step 1.
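The cut quantities collected in Step 2(d) feed the master problem: the slope aggregates the per-subproblem multipliers as $g^{s,t}=\sum_i(-T_i^s)^\top\pi_i^{s,l}$, and the resulting supporting hyperplane is tight at the current iterate. A minimal numpy sketch with hypothetical data:

```python
import numpy as np

# Hypothetical data for two subproblems in one scenario (illustrative only).
T = [np.array([[1.0, 0.0]]), np.array([[0.0, 2.0]])]   # technology matrices T_i^s
pi = [np.array([3.0]), np.array([-1.0])]               # multipliers pi_i^s of (33)
Q_hat = 5.0                                            # optimal second-stage value at x_k
x_k = np.array([0.5, 1.0])

# Aggregate the per-subproblem subgradients: g = sum_i (-T_i)^T pi_i.
g = sum(-Ti.T @ pi_i for Ti, pi_i in zip(T, pi))

def cut(x):
    """Lower bound on Q^s(x) given by the supporting hyperplane at x_k."""
    return Q_hat + g @ (x - x_k)

print(g, cut(x_k))  # the cut reproduces Q_hat at x = x_k
```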

Note that the penalty parameter $\kappa^{s}_{0}$ can be chosen for every scenario $s\in S$ according to the structure of the problem. The ADAL method converges to the optimal solution in scenario $s$ if the penalty parameter $\kappa^{s}_{0}\in(0,\frac{1}{q^{s}})$, where $q^{s}$ is the maximum number of matrices $A^{s}_{i}$, $i=1,\dots,m$, having a nonzero entry in the same row, that is, the maximal number of subproblems coupled in any single constraint. Hence, $\kappa^{s}_{0}$ can be chosen close to $\frac{1}{q^{s}}$.
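A possible reading of this rule can be sketched as follows, with illustrative coupling matrices (an assumption for the example, not the experiment's data): count, for each coupling row, how many agents have a nonzero entry there, take the maximum $q^s$, and set $\kappa_0^s$ just below $1/q^s$.

```python
import numpy as np

# Hypothetical coupling matrices A_i^s for m = 3 agents, d = 2 coupling rows.
A = [np.array([[1.0, 0.0], [0.0, 0.0]]),
     np.array([[2.0, 1.0], [0.0, 3.0]]),
     np.array([[0.0, 0.0], [1.0, 1.0]])]

# q^s: the largest number of agents sharing any single coupling constraint,
# i.e. the max over rows k of how many matrices A_i^s have a nonzero k-th row.
rows_nonzero = np.array([[np.any(Ai[k] != 0) for Ai in A]
                         for k in range(A[0].shape[0])])
q = int(rows_nonzero.sum(axis=1).max())

kappa0 = 0.99 / q  # choose the penalty close to, but below, 1/q^s
print(q, kappa0)
```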

5.2 Two-stage wireless information exchange problem

In this section, we formulate a two-stage information exchange problem and implement the proposed numerical method to solve it. Consider a problem in a wireless communication network consisting of $J$ robots. We denote the team of all robots by $\mathcal{J}$. The robots collect information about the unknown environment and send the information to a set $\mathcal{K}=\{1,\dots,K_{0}\}$ of active reporting points by multi-hop communication. The active reporting points can receive information from the robots and store it. The communication links between the robots and the reporting points are subject to the risk of information loss. Therefore, the objective is to choose the optimal set of active reporting points so as to minimize the risk associated with the amount of information lost and with the proportion of the total gathered information that has not reached the reporting points. To this end, we formulate a two-stage stochastic programming problem.

The first-stage decision variables are known as the here-and-now variables. In our problem, these are binary variables $z_{k}\in\{0,1\}$ for $k\in\mathcal{K}$, where $z_{k}=1$ if the $k$-th location is selected as an active reporting point and $z_{k}=0$ otherwise. We assume that at most $K$ reporting points can be active, where $1\leq K<K_{0}$.

Once the reporting points are chosen, the spatial configuration of the robots is observed. We model the uncertainty of the spatial configuration and of the amount of information to be observed by a set $\mathcal{S}=\{1,\dots,S\}$ of scenarios. The robots gather information about the environment and either deliver it to the reporting points or exchange it with their neighbors, who then deliver it to the active reporting points. The second-stage optimization problem involves the following decision variables. The variables $T_{ij}^{s}$ stand for the amount of information sent by node $i$ to node $j$ in scenario $s\in\mathcal{S}$. The amount of information observed but not sent by robot $i$ in scenario $s$ is denoted by $y_{i}^{s}$. The proportion of information successfully delivered to the reporting points in scenario $s$ is denoted by $x^{s}$. Every robot $i$ generates information $r_{i}^{s}$ and can send it to its neighbors within some communication range. The communication links between the nodes are described by a function $R_{ij}^{s}$ that gives the proportion of the information sent by node $i\in\mathcal{J}$ that is received and correctly decoded by node $j\in\mathcal{J}\cup\mathcal{K}$. Thus, $R_{ij}^{s}T_{ij}^{s}$ is the amount of information received and correctly decoded by node $j\in\mathcal{J}\cup\mathcal{K}$ in scenario $s\in\mathcal{S}$. The set of neighbors of node $i$ in scenario $s\in\mathcal{S}$ can then be defined as the set of nodes within its communication range, $\mathcal{N}^{s}(i)=\{j\in\mathcal{J}\cup\mathcal{K}:R_{ij}^{s}>0\}$.
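The neighbor sets $\mathcal{N}^s(i)$ can be read directly off the rate values. A minimal sketch with a hypothetical rate matrix (three robots as rows; three robots plus one reporting point as columns):

```python
import numpy as np

# Hypothetical rates R_ij^s for 3 robots and 1 reporting point (last column).
R = np.array([[1.0, 0.7, 0.0, 0.4],
              [0.7, 1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0, 0.0]])

def neighbors(i, R):
    """N^s(i): nodes j (robots or reporting points) with R_ij > 0, excluding i."""
    return {j for j in range(R.shape[1]) if j != i and R[i, j] > 0}

print(neighbors(0, R))  # robot 0 can reach robot 1 and the reporting point
```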

We associate with each robot a local risk concerning the information that is neither communicated to neighbors nor delivered to the reporting points, because that information might be lost due to damage to the robot or other issues. For every $i\in\mathcal{J}$ we define:

\[
y_{i}^{s}=r_{i}^{s}+\sum_{j\in\mathcal{J}}R_{ji}^{s}T_{ji}^{s}-\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s}
\]

as the amount of information communicated neither to the neighbors nor to any of the reporting points by robot $i\in\mathcal{J}$. The systemic risk associated with the team of robots is represented by the total proportion of information not delivered to the reporting points; it is defined as $(1-x^{s})$, where $x^{s}$ is determined by:

\[
x^{s}\sum_{i\in\mathcal{J}}r_{i}^{s}=\sum_{i\in\mathcal{J}}\sum_{k\in\mathcal{K}}R_{ik}^{s}T_{ik}^{s}. \qquad (37)
\]

To implement the distributed method for the operation of the robots, we introduce a copy $x_{i}^{s}$ of the total proportion variable for each robot, together with additional constraints imposing equality among these auxiliary variables. Constraint (37) is then replaced by the following set of constraints:

\[
\begin{aligned}
&\sum_{i\in\mathcal{J}}x_{i}^{s}r_{i}^{s}=\sum_{j\in\mathcal{J}}\sum_{k\in\mathcal{K}}R_{jk}^{s}T_{jk}^{s},\\
&x_{i}^{s}=x_{j}^{s}\quad\forall\, i,j\in\mathcal{J}.
\end{aligned}
\]

Using these variables, we can express the loss function of every robot $i\in\mathcal{J}$ in scenario $s\in\mathcal{S}$ as follows:

\[
q_{i}^{s}=c_{1}y_{i}^{s}+c_{2}(1-x_{i}^{s}),
\]

where $c_{1}>0$ is the weight associated with the local risk and $c_{2}>0$ is the weight associated with the systemic risk. These are modeling parameters. We have used the choice $c_{1}+c_{2}=1$, in line with our theoretical proposal for aggregating sources of risk. The first-stage optimization problem takes on the following form:

\[
\begin{aligned}
\min_{z}~&\varrho[Q(z;\xi)]\\
\text{s.t.}~~&\sum_{k\in\mathcal{K}}z_{k}\leq K,\\
&z_{k}\in\{0,1\},\quad k\in\mathcal{K}.
\end{aligned}
\]

Here $Q(z;\xi)$ is a random variable with realizations $Q^{s}(z;\xi^{s})$, $s=1,\dots,S$, denoting the optimal values of the second-stage problem. The second-stage problem deals with the operation of the robots after the locations of the reporting points are fixed; it is formulated as follows:

\[
\begin{aligned}
Q^{s}(z;\xi^{s})=\min_{T^{s},y^{s},x^{s}}~&\sum_{i\in\mathcal{J}}w_{i}q_{i}^{s} &&(38)\\
\text{s.t.}~~&y_{i}^{s}=r_{i}^{s}+\sum_{j\in\mathcal{J}}R_{ji}^{s}T_{ji}^{s}-\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s},\quad i\in\mathcal{J} &&(39)\\
&\sum_{j\in\mathcal{J}}T_{ji}^{s}+\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s}\leq a,\quad i\in\mathcal{J} &&(40)\\
&\sum_{i\in\mathcal{J}}x_{i}^{s}r_{i}^{s}=\sum_{j\in\mathcal{J}}\sum_{k\in\mathcal{K}}R_{jk}^{s}T_{jk}^{s} &&(41)\\
&x_{i}^{s}=x_{j}^{s},\quad i,j\in\mathcal{J} &&(42)\\
&T_{ik}^{s}\leq Mz_{k},\quad i\in\mathcal{J},\ k\in\mathcal{K} &&(43)\\
&T_{ij}^{s}\geq 0,\quad i\in\mathcal{J},\ j\in\mathcal{J}\cup\mathcal{K} &&(44)\\
&y_{i}^{s}\geq 0,\quad i\in\mathcal{J}. &&(45)
\end{aligned}
\]

Here $M>0$ is a large constant, which provides a logical upper bound restricting communication to the active reporting points in constraint (43). Additionally, $w_{i}>0$ are weights associated with the loss functions of the individual robots. Here again, we propose $\sum_{i\in\mathcal{J}}w_{i}=1$, in accordance with the proposed systemic measures of risk. We notice that the second-stage problems are always feasible for any feasible first-stage decision; hence, the two-stage problem has relatively complete recourse. Furthermore, the recourse function $Q(z;\xi)$ has finitely many realizations $Q^{s}(z;\xi^{s})$, $s=1,\dots,S$, for every fixed argument $z$. This implies that we can use any coherent measure of risk $\varrho[\cdot]$ for evaluating the risk of $Q(\cdot;\xi)$; the value $\varrho[Q(z;\xi)]$ is well defined and finite for all feasible first-stage decisions $z$.
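For fixed $z$, problem (38)-(45) is a linear program. As an illustration only, the following sketch solves a hypothetical two-robot, one-reporting-point instance (all rates, weights, and constants invented for the example; scipy is an assumed dependency). Robot 2 is out of range of the reporting point ($R_{2K}=0$) and must relay through robot 1.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny hypothetical instance of (38)-(45): J = 2 robots, one active reporting
# point (z = 1), one scenario.
r = np.array([1.0, 1.0])          # information gathered by each robot
R12 = R21 = 0.8                   # robot-to-robot decoding rates
R1K = 0.9                         # robot 1 -> reporting point
a, M = 5.0, 10.0                  # capacity and big-M constants
c1, c2, w = 0.8, 0.2, np.array([0.5, 0.5])

# Variable order: [T12, T1K, T21, T2K, y1, y2, x].
c = np.array([0.0, 0.0, 0.0, 0.0, c1 * w[0], c1 * w[1], -c2])
A_eq = np.array([
    [1.0,  1.0, -R21, 0.0, 1.0, 0.0, 0.0],      # (39) flow conservation, robot 1
    [-R12, 0.0,  1.0, 1.0, 0.0, 1.0, 0.0],      # (39) flow conservation, robot 2
    [0.0, -R1K,  0.0, 0.0, 0.0, 0.0, r.sum()],  # (41) proportion of delivered info
])
b_eq = np.array([r[0], r[1], 0.0])
A_ub = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],        # (40) capacity at robot 1
    [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0],        # (40) capacity at robot 2
])
b_ub = np.array([a, a])
bounds = [(0, None), (0, M), (0, None), (0, M), (0, None), (0, None), (0, 1)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
x_prop, obj = res.x[-1], res.fun + c2   # add back the constant from c2*(1 - x)
print(x_prop, obj)
```

In this instance robot 2 relays its unit of information through robot 1, and 81% of the gathered information reaches the reporting point.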

5.3 Numerical results

We solve the problem using the distributed method proposed in Section 5.1. In the given problem, the decision variables $T_{ij}^{s},y_{i}^{s},x_{i}^{s}$ are local to every node $i\in\mathcal{J}$, and there are four coupling constraints that need to be distributed among the nodes:

  • (1) the flow conservation constraints (39);
  • (2) the transmission capacity constraints (40);
  • (3) the proportion constraint (41);
  • (4) the equality constraints enforcing the uniqueness of the proportion (42).

Since the ADAL algorithm operates on equality constraints, we introduce auxiliary variables $u_{i}^{s}$ and redefine (40) as $\sum_{j\in\mathcal{J}}T_{ji}^{s}+\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s}+u_{i}^{s}=a$. Then the coupling constraints can be rewritten using appropriate matrices and vectors stacking the decision variables of every node $i\in\mathcal{J}$. Let $v_{i}^{s}=[y_{i}^{s},T_{i1}^{s},T_{i2}^{s},T_{i3}^{s},\dots,T_{iJ}^{s},\dots,T_{i(J+K_{0})}^{s},x_{i}^{s},u_{i}^{s}]^{\top}$ be the $(J+K_{0}+3)$-dimensional vector stacking all decision variables of node $i\in\mathcal{J}$. Then the flow conservation constraint (39) can be rewritten as $\sum_{i\in\mathcal{J}}A_{i}^{s}v_{i}^{s}=\mathbf{r}^{s}$, where $\mathbf{r}^{s}=[r_{1}^{s},\dots,r_{J}^{s}]^{\top}$ is a $J$-dimensional vector and $A_{i}^{s}$ is a $J\times(J+K_{0}+3)$-dimensional matrix defined as:

\[
A_{i}^{s}=\begin{bmatrix}
1 & 1-R_{ii}^{s} & 1 & 1 & \dots & 1 & \dots & 1 & 0 & 0\\
0 & 0 & -R_{i2}^{s} & 0 & \dots & 0 & \dots & 0 & 0 & 0\\
0 & 0 & 0 & -R_{i3}^{s} & \dots & 0 & \dots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & & \ddots & & & \vdots & \vdots & \vdots\\
0 & 0 & 0 & 0 & \dots & -R_{iJ}^{s} & \dots & 0 & 0 & 0
\end{bmatrix}
\]

(displayed schematically with row $i$ listed first), where all entries of the $i$-th row are equal to 1 except for the entry in position $(i,i+1)$, corresponding to $T_{ii}^{s}$, and the last two columns, which are equal to 0; every other row $j\neq i$ has the single nonzero entry $-R_{ij}^{s}$ in the column of $T_{ij}^{s}$. Note that $R_{ii}^{s}=1$ for all $i\in\mathcal{J}$; hence the term $1-R_{ii}^{s}$ is equal to 0. Similarly, constraints (40) and (41) can be rewritten as
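The construction of $A_i^s$ and the identity $\sum_i A_i^s v_i^s=\mathbf{r}^s$ can be sketched as follows; `flow_matrix` is a hypothetical helper, checked here on random data, not the paper's implementation.

```python
import numpy as np

def flow_matrix(i, R, J, K0):
    """Sketch of A_i^s for v_i = [y_i, T_i1, ..., T_i(J+K0), x_i, u_i]."""
    A = np.zeros((J, J + K0 + 3))
    A[i, 0] = 1.0                      # coefficient of y_i in row i
    A[i, 1:J + K0 + 1] = 1.0           # outgoing transmissions T_ij
    A[i, 1 + i] = 1.0 - R[i, i]        # the self-link T_ii cancels since R_ii = 1
    for j in range(J):
        if j != i:
            A[j, 1 + j] = -R[i, j]     # node j receives R_ij * T_ij
    return A

# Verify that sum_i A_i v_i reproduces the flow-conservation left-hand side.
rng = np.random.default_rng(0)
J, K0 = 3, 1
R = rng.uniform(0.1, 1.0, (J, J + K0))
np.fill_diagonal(R[:, :J], 1.0)
T = rng.uniform(0.0, 1.0, (J, J + K0))
y = rng.uniform(0.0, 1.0, J)
v = [np.concatenate(([y[i]], T[i], [0.0, 0.0])) for i in range(J)]
lhs = sum(flow_matrix(i, R, J, K0) @ v[i] for i in range(J))
print(lhs)
```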

\[
\sum_{i\in\mathcal{J}}B_{i}^{s}v_{i}^{s}=a\mathbf{1},\qquad\sum_{i\in\mathcal{J}}(C_{i}^{s})^{\top}v_{i}^{s}=0,
\]

using appropriate matrices $B_{i}^{s}$ and vectors $C_{i}^{s}$ for every node $i\in\mathcal{J}$, where $\mathbf{1}$ is a vector of all ones. Note that, since nodes can share information only with their neighbors, one can enforce the equality of the proportion variable only between neighboring nodes and rewrite constraint (42) as follows:

\[
x_{i}^{s}=x_{j}^{s},\quad i\in\mathcal{J},\ j\in\mathcal{N}^{s}(i), \qquad (46)
\]

where $\mathcal{N}^{s}(i)$ is the set of nodes within the communication range of node $i\in\mathcal{J}$ in scenario $s\in\mathcal{S}$. If the network is connected, constraint (46) forces all $x_{i}^{s}$ to be equal to each other and thus ensures the consistency and uniqueness of the proportion variable.
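Whether the pairwise constraints (46) propagate equality to the whole team is exactly a graph-connectivity question, which a breadth-first search answers. A small sketch on hypothetical adjacency lists:

```python
from collections import deque

def is_connected(adj):
    """Breadth-first search over adjacency lists adj[i] (robot-robot links only);
    returns True iff every robot is reachable from robot 0."""
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(adj)

# A connected 4-robot chain versus a network split into two isolated pairs.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
split = {0: [1], 1: [0], 2: [3], 3: [2]}
print(is_connected(chain), is_connected(split))
```

On the `split` network, constraints (46) would only equalize the proportion variables within each component, which is why connectivity is assumed in the experiments below.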

We assume that the team of robots works on a square map given by the corner points with relative coordinates $(0,0)$ and $(2,2)$. The spatial distribution of the available information to be gathered follows a normal distribution with expected value $\mathcal{C}=(0.5,1.75)$, located in the upper left corner of the map. The network consists of 50 robots and 4 potential locations of the reporting points. We generated 200 scenarios of different spatial configurations of the robots. The four potential locations of the reporting points are fixed at the positions $(0.5,0.3)$, $(1.5,0.25)$, $(1.75,0.5)$, $(1,0.2)$. The rate function $R_{ij}^{s}$ depends on the distance between the nodes in the network and is defined as:

\[
R_{ij}^{s}=\begin{cases}
1, & \text{if }\|d_{ij}^{s}\|\leq\ell,\\
a\|d_{ij}^{s}\|^{3}+b\|d_{ij}^{s}\|^{2}+c\|d_{ij}^{s}\|+e, & \text{if }\ell<\|d_{ij}^{s}\|\leq u,\\
0, & \text{if }\|d_{ij}^{s}\|>u,
\end{cases}
\]

where $\|d_{ij}^{s}\|$ is the distance between the nodes in scenario $s\in\mathcal{S}$. We set $\ell=0.3$ and $u=0.6$, and the values $a,b,c$ and $e$ are chosen so that $R_{ij}^{s}$ is a continuous function. This function is commonly used in the literature; see, e.g., [32]. The information $r_{i}^{s}$ gathered by robot $i$ in scenario $s$ depends on the robot's position relative to the expected value $\mathcal{C}$ given above. In our experiments, $r_{i}^{s}$ is calculated as follows:

\[
r_{i}^{s}=\frac{w}{2\pi|\Sigma|^{1/2}}\,e^{-\frac{1}{2}(d_{i}-\mathcal{C})^{\top}\Sigma^{-1}(d_{i}-\mathcal{C})},
\]

where $d_{i}$ is the position of robot $i\in\mathcal{J}$, $w$ is a scaling factor, and $\Sigma$ is a covariance matrix, which we keep fixed for all experiments.
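Returning to the rate function defined above: continuity at $\ell$ and $u$ gives only two conditions for the four coefficients $a,b,c,e$. The sketch below additionally matches the derivatives at the breakpoints, $R'(\ell)=R'(u)=0$ (a common smoothness assumption, not stated in the text), which pins all four down via a small linear system.

```python
import numpy as np

# Cubic segment of the rate function on (l, u]: fit a*d^3 + b*d^2 + c*d + e.
l, u = 0.3, 0.6
M = np.array([[l**3,   l**2, l,   1.0],   # R(l)  = 1  (continuity at l)
              [u**3,   u**2, u,   1.0],   # R(u)  = 0  (continuity at u)
              [3*l**2, 2*l,  1.0, 0.0],   # R'(l) = 0  (assumed smoothness)
              [3*u**2, 2*u,  1.0, 0.0]])  # R'(u) = 0  (assumed smoothness)
a, b, c, e = np.linalg.solve(M, np.array([1.0, 0.0, 0.0, 0.0]))

def rate(d):
    """Piecewise rate function R_ij as defined in the text."""
    if d <= l:
        return 1.0
    if d <= u:
        return a*d**3 + b*d**2 + c*d + e
    return 0.0

print(rate(0.3), rate(0.45), rate(0.6))  # 1 at l, 0 at u, 1/2 at the midpoint
```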

Figure 1: Communication network of 50 robots and 4 reporting points in one scenario. The source is located in the upper left corner. The lighter color (yellow) and the darker color (purple) indicate higher and lower rates of information generation, respectively. (a) The initial spatial configuration of the robots (green) and the reporting points (blue); the lines show communication links, and their thickness indicates the rate of connection $R_{ij}$ between nodes $i$ and $j$. (b) The optimal routing decisions between nodes when the risk is applied to the total loss of all nodes; blue nodes are selected reporting points, red nodes are not selected. (c) The optimal routing decisions between nodes when the individual risks of the nodes are aggregated; blue nodes are selected reporting points, red nodes are not selected.

Comparison of aggregation methods. We solved the optimization problem using two different aggregation methods:

  • aggregate first: using the proposed multivariate measures of risk, we aggregate the individual losses of the robots with a fixed scalarization $w$: for each scenario we calculate $V^{s}=\sum_{i\in\mathcal{J}}w_{i}q_{i}^{s}$ and evaluate its risk $\varrho[V]$ by several scalar-valued measures of risk;

  • evaluate first: we evaluate the individual risk of every robot across all scenarios, $V_{i}=\varrho_{i}[q_{i}]$, and then aggregate these values into $\varrho_{S}[V]$ using the two examples of nonlinear aggregation shown in Section 4.2.1.

We solve the problem using a linear scalarization vector $w$ with equal weights $w_{i}=\frac{1}{J}$ for all $i\in\mathcal{J}$, $c=[0.8,0.2]$, and $\operatorname{AVaR}_{\alpha}(\cdot)$ for the three values $\alpha=0.1,0.2,0.3$.
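For equiprobable scenarios, the aggregate-first evaluation with $\operatorname{AVaR}_\alpha$ reduces to a one-dimensional minimization over the sample points, since the objective in (47) is piecewise linear and convex in $\eta$. A sketch with hypothetical per-scenario losses of three robots:

```python
import numpy as np

def avar(v, alpha):
    """AVaR_alpha of a discrete loss sample with equal probabilities (assumption)."""
    v = np.sort(np.asarray(v))
    # Evaluate eta -> eta + E[(V - eta)_+]/alpha at each sample point; the
    # infimum in (47) is attained at one of these kinks for alpha in (0, 1].
    vals = [eta + np.mean(np.maximum(v - eta, 0.0)) / alpha for eta in v]
    return min(vals)

# Hypothetical losses q_i^s: 4 equally likely scenarios (rows), 3 robots (columns).
q = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.1, 0.1],
              [0.9, 0.8, 0.7],
              [0.2, 0.2, 0.2]])
w = np.ones(3) / 3
V = q @ w                 # aggregate first: V^s = sum_i w_i * q_i^s
print(avar(V, 0.25))      # average of the worst 25% of the aggregated losses
```

With $\alpha=1$ the formula recovers the expected value of $V$, which is a quick sanity check on the implementation.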

Figure 2: (a) Proportion of information delivered to the reporting points using Mean-AVaR at $\alpha=0.1$. (b) Proportion of information delivered to the reporting points using Mean-Upper-Semideviation of order 1. (c) Comparison of the risk values for the two aggregation methods.

The setup of the communication network problem and the optimal solutions in one of the scenarios for both methods are shown in Fig. 1. One can notice that, depending on the aggregation method used, the set of optimal reporting points may differ. The distribution of the proportion $x$ of information delivered to the reporting points for the two methods is shown in Fig. 2. It can be seen that more information is delivered to the reporting points when we first aggregate the losses of the robots and then evaluate the risk. This observation is also reflected in the risk values of both methods: imposing a risk measure on a linear scalarization of the individual losses results in smaller values than aggregating the individual risks.

Using the optimal values of the decision variables, we can calculate $\operatorname{MAVaR}$ and $\operatorname{VMAVaR}$ to compare their values with the AVaR applied to the linear scalarization of the random cost. The following formulas were used to calculate the values:

\[
\begin{aligned}
&\operatorname{AVaR}_{\alpha}(V)=\inf_{\eta\in\mathbb{R}}\bigg\{\eta+\frac{1}{\alpha}\mathbb{E}\Big[(w^{\top}q-\eta)_{+}\Big]\bigg\} &&(47)\\
&\operatorname{MAVaR}_{\alpha}(V)=\mathbb{E}\Big[w^{\top}q~\big|~q\in\mathcal{Z}_{1-\alpha}\Big] &&(48)\\
&\operatorname{VMAVaR}_{\alpha}(V)=\inf_{\eta\in\mathbb{R}^{2}}\bigg\{w^{\top}\eta+\frac{1}{\alpha}\mathbb{E}\Big[w^{\top}(V-\eta)_{+}\Big]:\Pr[V\leq\eta]>1-\alpha\bigg\} &&(49)
\end{aligned}
\]

The values of $\operatorname{AVaR}$, $\operatorname{MAVaR}$ and $\operatorname{VMAVaR}$ are shown in Table 1. It can be seen that $\operatorname{AVaR}_{\alpha}(V)$ results in smaller values than $\operatorname{MAVaR}_{\alpha}(V)$ and $\operatorname{VMAVaR}_{\alpha}(V)$ at all confidence levels $\alpha$, as was shown theoretically in Section 4. Those measures of risk are computationally very demanding and not amenable to the type of decision problems we are considering. Hence, we only compare their values at the decision obtained via our proposed method.

$\varrho$ \ $\alpha$                 0.1      0.2      0.3
$\operatorname{AVaR}_{\alpha}$       0.1429   0.1389   0.1364
$\operatorname{MAVaR}_{\alpha}$      0.1992   0.1693   0.1634
$\operatorname{VMAVaR}_{\alpha}$     0.1740   0.1622   0.1553

Table 1: Comparison of $\operatorname{AVaR}$, $\operatorname{MAVaR}$ and $\operatorname{VMAVaR}$ values for $\alpha=0.1,0.2,0.3$.

When we solve the problem in a distributed way, we use a smaller network consisting of 20 robots and 4 reporting points in a $1.5\times 1.5$ square over 100 scenarios. It is assumed that the network is connected in all possible scenarios; that is, every node has at least one neighbor within the communication range, and all nodes are connected to the reporting points through multiple hops. This assumption is necessary for the proper calculation of the proportion of information delivered to the reporting points: if one of the nodes is isolated from the network, the rest of the group converges to a solution that does not take the isolated node's contribution into account. The problem is solved both in a centralized and in a distributed way, and the results for one of the scenarios are shown in Fig. 3. As can be seen in Fig. 3(b), the nodes converge to the centralized solution of the proportion of information delivered to the reporting points.

Figure 3: (a) Evolution of the sum of local losses $\sum_{i\in\mathcal{J}}y_{i}^{s}$ (blue) and of the proportion-constraint residual $\sum_{i\in\mathcal{J}}\big(x_{i}^{s}r_{i}^{s}-\sum_{k\in\mathcal{K}}R_{ik}^{s}T_{ik}^{s}\big)$ (red) in scenario $s$. (b) Convergence of the robots' proportion variables $x_{i}^{s}$ to the centralized solution in scenario $s$.

6 Conclusions

Our contributions can be summarized as follows. We propose a sound axiomatic approach to measures of risk for distributed systems. We show that several classes of non-trivial measures satisfying the axioms can be constructed. These measures can be calculated efficiently and are less conservative than most other systemic measures of risk. The class of measures proposed in Section 3.3 goes beyond the popular approach of first evaluating the risk of the individual agents and then aggregating.

We have devised a distributed method for solving risk-averse two-stage problems with monotropic structure, which works for any measure of risk, not only for those representable as an expected value. The construction is quite general and could serve as a template for devising other distributed methods for problems with systemic measures of risk.

We demonstrate the viability of the proposed framework on a non-trivial two-stage problem involving wireless communication. The numerical experiments confirm the theoretical observations and show the advantage of the proposed approach to risk aggregation in distributed systems.

In conclusion, the advantage of the new approach lies in its good balance of robustness to uncertainty, optimality of the loss functions involved, and efficiency of the numerical operation.

References

  • [1] Çağın Ararat and Birgit Rudloff. Dual representations for systemic risk measures. Mathematics and Financial Economics, 14:139–174, 2020.
  • [2] Philippe Artzner, Freddy Delbaen, Jean-Marc Eber, and David Heath. Coherent measures of risk. Mathematical Finance, 9(3):203–228, July 1999.
  • [3] Francesca Biagini, Jean-Pierre Fouque, Marco Frittelli, and Thilo Meyer-Brandis. A Unified Approach to Systemic Risk Measures via Acceptance Sets. 2015.
  • [4] Markus Brunnermeier and Patrick Cheridito. Measuring and Allocating Systemic Risk. Risks, 7(2), 2019.
  • [5] Christian Burgert and Ludger Rüschendorf. Consistent risk measures for portfolio vectors. Insurance: Mathematics and Economics, 38(2):289–297, 2006.
  • [6] Nikolaos Chatzipanagiotis, Darinka Dentcheva, and Michael Zavlanos. An augmented Lagrangian method for distributed optimization. Mathematical Programming, 152, 2014.
  • [7] Chen Chen, Garud Iyengar, and C. Ciamac Moallemi. An Axiomatic Approach to Systemic Risk. Management Science, 59(6):1373–1388, 2013.
  • [8] Freddy Delbaen. Coherent risk measures on general probability spaces. Advances in Finance and Stochastics, pages 1–37, March 2000.
  • [9] Darinka Dentcheva, Bogumila Lai, and Andrzej Ruszczyński. Dual methods for probabilistic optimization. Mathematical Methods of Operations Research, 60:331–346, 2004.
  • [10] Ivar Ekeland and Walter Schachermayer. Law invariant risk measures on $\mathcal{L}^{\infty}(\mathbb{R}^{d})$. Statistics & Risk Modeling, 28(3):195–225, 2011.
  • [11] Eduard Kromer, Ludger Overbeck, and Katrin Zilch. Systemic Risk Measures on General Measurable Spaces. 2016.
  • [12] Zachary Feinstein, Birgit Rudloff, and Stefan Weber. Measures of Systemic Risk. SIAM Journal on Financial Mathematics, 8(1):672–708, Jan 2017.
  • [13] Hans Föllmer and Alexander Schied. Convex measures of risk and trading constraints. Finance and Stochastics, 6:429–447, 2002.
  • [14] Hans Föllmer and Alexander Schied. Stochastic Finance: An Introduction in Discrete Time, 3rd Edition. Walter De Gruyter, 2011.
  • [15] Sıtkı Gülten and Andrzej Ruszczyński. Two-stage portfolio optimization with higher-order conditional measures of risk. Annals of Operations Research, 229:409–427, 06 2015.
  • [16] Andreas Hamel and Frank Heyde. Duality for set-valued measures of risk. SIAM Journal on Financial Mathematics, 1(1):66–95, 2010.
  • [17] Elyes Jouini, Moncef Meddeb, and Nizar Touzi. Vector-valued coherent risk measures. Finance and Stochastics, 8(4):531–552, 2004.
  • [18] M. Kijima and M. Ohnishi. Mean-risk analysis of risk aversion and wealth effects on optimal portfolios with multiple investment opportunities. Ann. Oper. Res., 45:147–163, 1993.
  • [19] Jinwook Lee and András Prékopa. Properties and calculation of multivariate risk measures: MVaR and MCVaR. Annals of Operations Research, 211:225–254, 2013.
  • [20] J. Leitner. A short note on second-order stochastic dominance preserving coherent risk measures. Mathematical Finance, 15:649–651, 2005.
  • [21] Wann-Jiun Ma, Chanwook Oh, Yang Liu, Darinka Dentcheva, and Michael M. Zavlanos. Risk-averse access point selection in wireless communication networks. IEEE Transactions on Control of Network Systems, 6(1):24–36, 2019.
  • [22] Merve Meraklı and Simge Küçükyavuz. Vector-Valued Multivariate Conditional Value-at-Risk. Operations Research Letters, 46(3):300–305, 2018.
  • [23] Nilay Noyan and Gábor Rudolf. Optimization with multivariate conditional value-at-risk constraints. Operations research, 61(4):990–1013, 2013.
  • [24] Georg Pflug and Alois Pichler. Systemic risk and copula models. Central European Journal of Operations Research, 26:465–483, 2018.
  • [25] G.Ch. Pflug and W. Römisch. Modeling, Measuring and Managing Risk. World Scientific, Singapore, 2007.
  • [26] András Prékopa. Multivariate Value at Risk and Related Topics. Annals of Operations Research, 193:49–69, 2012.
  • [27] R Tyrrell Rockafellar and Stan Uryasev. The fundamental risk quadrangle in risk management, optimization and statistical estimation. Surveys in Operations Research and Management Science, 18(1-2):33–53, 2013.
  • [28] Ludger Rüschendorf. Mathematical Risk Analysis. Springer, 2015.
  • [29] A. Ruszczyński and A. Shapiro. Optimization of risk measures. In G. Calafiore and F. Dabbene, editors, Probabilistic and Randomized Methods for Design under Uncertainty, pp. 117–158, Springer-Verlag, London, 2005.
  • [30] A. Ruszczyński and A. Shapiro. Optimization of convex risk functions. Mathematics of Operations Research, 31:433–452, 2006.
  • [31] Alexander Shapiro, Darinka Dentcheva, and Andrzej Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory. Society for Industrial & Applied Mathematics (SIAM), 2009.
  • [32] Michael M. Zavlanos, Alejandro Ribeiro, and George J. Pappas. Network integrity in mobile robotic networks. IEEE Transactions on Automatic Control, 58(1):3–18, 2013.