
On Risk Evaluation and Control of Distributed Multi-Agent Systems

Aray Almen
Department of Mathematical Sciences
Stevens Institute of Technology
Hoboken, NJ 07030, USA
aalmen@stevens.edu
&Darinka Dentcheva
Department of Mathematical Sciences
Stevens Institute of Technology
Hoboken, NJ 07030, USA
darinka.dentcheva@stevens.edu
Abstract

In this paper, we deal with risk evaluation and risk-averse optimization of complex distributed systems with general risk functionals. We postulate a novel set of axioms for the functionals evaluating the total risk of the system. We derive a dual representation for the systemic risk measures and propose a way to construct non-trivial families of measures by using either a collection of linear scalarizations or non-linear risk aggregation. The new framework facilitates risk-averse sequential decision-making by distributed methods. The proposed approach is compared theoretically and numerically to some of the systemic risk measures in the existing literature.

We formulate a two-stage decision problem with monotropic structure and a systemic measure of risk. The structure is typical for distributed systems arising in energy networks, robotics, and other practical situations. A distributed decomposition method for solving the two-stage problem is proposed and applied to a problem arising in communication networks. We use this problem to compare the methods of systemic risk evaluation. We show that the proposed risk aggregation leads to a less conservative risk evaluation and results in a substantially better solution of the problem at hand, as compared to an aggregation of the risks of the individual agents and other methods.

Keywords stochastic programming, risk of complex systems, risk measures for multivariate risk, distributed risk-averse optimization, optimal wireless information exchange

1 Introduction

Evaluation of the risk of a system consisting of multiple agents is one of the fundamental problems relevant to many fields. A crucial question is the assessment of the total risk of the system, taking into account the risk of each agent and its contribution to the total risk. Another issue arises when the risk evaluation is based on confidential or proprietary information. There is extensive literature addressing the properties of risk measures and their use in finance. Our goal is to address situations arising in robotics, energy systems, business systems, logistics problems, and similar settings. The analysis in the financial literature may not be applicable in such situations due to the heterogeneity of the sources of risk and the nature and complexity of the relations in those systems. In many systems, the source of risk is associated with a highly non-trivial aggregation of the features of its agents, which may not be available in an analytical form. For example, in automated robotic systems, the exchange of information may be limited or distorted due to the speed of operation, the distance in space between the agents, or other reasons. Another difficulty associated with the evaluation of risk arises when the risk of one agent stems from various sources of uncertainty of different nature. The question of how to aggregate those risk factors into one loss function does not have a straightforward answer.

The risk of one loss function can be evaluated using a coherent measure of risk such as the Average Value-at-Risk, the mean-semideviation, or others. More traditional (non-coherent) measures of risk such as the Value-at-Risk (VaR) are also very popular and frequently used. We refer to [14] for an extensive treatment of risk measures for scalar-valued random variables, as well as to [31], where risk-averse optimization problems are also analyzed.

The main objective of this paper is to suggest a new approach to the risk of a distributed system and to show its viability and potential in application to risk-averse decision problems for distributed multi-agent systems. While building on the developments thus far, our goal is to identify a framework that is theoretically sound but also amenable to efficient numerical computations for risk-averse optimization of large multi-agent systems. We propose a set of axioms for functionals defined on the space of random vectors. The random vector comprises risk factors from various sources, or represents the loss of each individual agent in a multi-agent system. While axioms for random vectors have been proposed earlier, our set of axioms differs from those in the literature most notably with respect to the translation equivariance condition, which we explain in due course. The resulting systemic risk measures reduce to coherent measures of risk for scalar-valued random variables when the dimension of the random vectors becomes one. We derive the dual representation of the systemic measures of risk under weaker assumptions than those known for multivariate risks. In our derivation, we establish one-to-one correspondences between the axioms and properties of the dual variables. We also propose several ways to construct systemic risk measures and analyze their properties. The important features of the proposed measures are the following: they conform to the axioms, they can be calculated efficiently, and they are amenable to distributed optimization methods.

We have formulated a risk-averse two-stage optimization problem with a structure that is typical for a system of loosely coupled subsystems. The proposed numerical method is applied to manage the risk of a distributed operation of agents. The distributed method lets each subsystem optimize its operation with minimal information exchange with the other subsystems (agents). This aspect is important for multi-agent systems where proprietary information is involved or when privacy concerns exist. The method demonstrates that distributed calculation of the systemic risk is possible without a large computational burden. We then consider a two-stage model in wireless communication networks, which extends the static model discussed in [21]. It addresses a situation in which a team of robots explores an area and each robot reports relevant information. The goal is to determine a few reporting points so that the communication is conducted most efficiently while managing the risk of losing information. We conduct several numerical experiments to compare various systemic risk measures.

Our paper is organized as follows. In section 2, we provide preliminary information on coherent measures of risk for scalar-valued random variables and survey existing methods for risk evaluation of complex systems. Section 3 contains the set of axioms, the dual representation associated with the resulting systemic risk measures, and two ways to construct such measures in practice. Section 4 provides a theoretical comparison of the new measures of risk to other notions. In particular, we discuss other sets of axioms, explore relations to two notions of multivariate Average Value-at-Risk, and pay attention to the effect of aggregating risk before versus after risk evaluation. In section 5, we formulate a risk-averse two-stage stochastic programming problem modeling wireless information exchange and seeking to locate a constrained number of information exchange points. We devise a distributed method for solving the problem and report a numerical comparison of several measures of risk, including other systemic measures. We pay attention to the comparison between the principles of aggregation for the purpose of total risk evaluation.

2 Preliminaries

2.1 Coherent risk measures

The widely accepted axiomatic framework for coherent measures of risk was proposed in [2] and further analyzed in [8], [14], [20], [29, 30], [25], and many other works. It is worth noting that another axiomatic approach was initiated in [18], and this line of thinking was developed into an entire framework in [27]. For a detailed exposition, we refer to [31] and the references therein. Let $\mathcal{L}_{p}(\Omega,\mathcal{F},P)$ be the space of real-valued random variables, defined on the probability space $(\Omega,\mathcal{F},P)$, that have finite $p$-th moments, $p\in[1,\infty)$, and are indistinguishable on events with zero probability. We shall assume that the random variables represent random costs or losses. A lower semi-continuous functional $\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\}$ is a coherent risk measure if it is convex, positively homogeneous, monotonic with respect to the a.s. comparison of random variables, and satisfies the following translation property:

\varrho[X+a]=\varrho[X]+a\quad\text{for all }X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P),\;a\in\mathbb{R}.

If $\varrho[\cdot]$ is monotonic, convex, and satisfies the translation property, then it is called a convex risk measure. Some examples of coherent measures of risk include the Average Value-at-Risk (also called Conditional Value-at-Risk) and the mean-semideviation measure, which are defined as follows. The Average Value-at-Risk at level $\alpha\in(0,1)$ for a random variable $Z$ is defined as

\operatorname{AVaR}_{\alpha}[Z]=\inf_{\eta\in\mathbb{R}}\Big\{\eta+\frac{1}{\alpha}\mathbb{E}\big[(Z-\eta)_{+}\big]\Big\}.

It is a special case of the higher-order measures of risk:

\varrho[Z]=\min_{t\in\mathbb{R}}\bigg\{t+\frac{1}{\alpha}\big\|(Z-t)_{+}\big\|_{p}\bigg\},\quad\alpha\in(0,1),

where $\|\cdot\|_{p}$ refers to the norm in $\mathcal{L}_{p}(\Omega,\mathcal{F},P)$. The mean semideviation of order $p$ is given by

\varrho[Z]=\mathbb{E}[Z]+\varkappa\big\|(Z-\mathbb{E}[Z])_{+}\big\|_{p},\quad\varkappa\in[0,1].
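For intuition, both measures admit simple sample-based estimators. The sketch below is our own minimal illustration (the function names `avar` and `mean_semideviation` are ours, not from the paper) for an empirical distribution of equally likely loss scenarios; it exploits the fact that the infimum in the extremal formula for the Average Value-at-Risk of a finite sample is attained at one of the sample points.

```python
import numpy as np

def avar(z, alpha):
    # Empirical Average Value-at-Risk at level alpha via the extremal
    # formula inf_eta { eta + E[(z - eta)_+] / alpha }.  The objective is
    # piecewise linear and convex in eta, with breakpoints at the sample
    # values, so minimizing over the sample points is exact.
    z = np.asarray(z, dtype=float)
    cands = np.unique(z)
    vals = cands + np.array(
        [np.mean(np.maximum(z - e, 0.0)) for e in cands]) / alpha
    return vals.min()

def mean_semideviation(z, kappa, p=1):
    # E[Z] + kappa * || (Z - E[Z])_+ ||_p  with kappa in [0, 1].
    z = np.asarray(z, dtype=float)
    mu = z.mean()
    return mu + kappa * np.mean(np.maximum(z - mu, 0.0) ** p) ** (1.0 / p)
```

For example, on the sample {1, 2, 3, 4} of equally likely losses, `avar` at level 0.5 returns 3.5, the average of the worst half of the outcomes, and approaches the plain mean as alpha tends to 1.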

The space $\mathcal{L}_{p}(\Omega,\mathcal{F},P)$, equipped with its norm topology, is paired with the space $\mathcal{L}_{q}(\Omega,\mathcal{F},P)$, equipped with the weak topology, where $\frac{1}{p}+\frac{1}{q}=1$. We denote $\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P)$ and $\mathcal{Z}^{*}=\mathcal{L}_{q}(\Omega,\mathcal{F},P)$. For any $Z\in\mathcal{Z}$ and $\xi\in\mathcal{Z}^{*}$, we use the bilinear form:

\langle\xi,Z\rangle=\int_{\Omega}\xi(\omega)Z(\omega)\,dP(\omega).

The following result is known as the dual representation of coherent measures of risk. A proper lower semicontinuous coherent risk measure $\varrho$ has the representation

\varrho[Z]=\sup_{\xi\in\mathcal{A}_{\varrho}}\langle\xi,Z\rangle,\quad Z\in\mathcal{Z}, (1)

where $\mathcal{A}_{\varrho}\subset\{\xi\in\mathcal{Z}^{*}~|~\xi\geq 0,~\int_{\Omega}\xi(\omega)\,P(d\omega)=1\}$ is the convex-analysis subdifferential $\partial\varrho[0]$.

Risk measures have also been defined by specifying a set of acceptable random outcomes; this set is called an acceptance set. Denoting the acceptance set by $\mathcal{K}\subset\mathcal{Z}$, the risk of a random outcome $X$ is defined as:

\varrho_{\mathcal{K}}[X]=\inf\{m\in\mathbb{R}~|~X-m\in\mathcal{K}\}. (2)

In finance, this notion of risk is interpreted as the minimum amount of capital that needs to be invested to make the final position acceptable. It is easy to verify that $\varrho_{\mathcal{K}}[\cdot]$ in (2) is a coherent measure if and only if $\mathcal{K}$ is a convex cone (cf. [13]).

2.2 Risk measures for complex systems

As risk is not additive, when we deal with distributed complex systems we need to address the question of evaluating the risk of the entire system. This risk is usually called systemic in the financial literature, and the measures proposed for its evaluation are termed systemic risk measures.

Assume that the system consists of $m$ agents. One approach to evaluating the risk of a system is to use an aggregation function $\Lambda:\mathbb{R}^{m}\to\mathbb{R}$ together with univariate risk measures. Let $X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$ be an $m$-dimensional random vector comprising the costs incurred by the system, where each component $X_{i}$ corresponds to the costs of one agent. The first approach to systemic risk is to choose a univariate risk measure $\varrho_{0}$ and apply it to the aggregated cost $\Lambda(X)$. If we prefer to use an acceptance set $\mathcal{K}$ as in (2), the systemic risk can be defined as:

\varrho[X]=\varrho_{0}[\Lambda(X)]=\inf\big\{z\in\mathbb{R}~|~\Lambda(X)-z\in\mathcal{K}\big\}. (3)

In [7], this point of view is analyzed in finite probability spaces, and it is shown that any monotonic, convex, positively homogeneous function provides a risk evaluation as in (3) as long as it is consistent with the preferences represented in the definition of $\mathcal{K}$. The point of view presented in definition (3) is further extended in [11], where the authors analyze convex risk measures defined on a general measurable space and propose examples of aggregation functions suitable for a financial system. In both studies, the structural decomposition of the systemic risk measure (3) is established when the aggregation function $\Lambda$ satisfies properties similar to the axioms postulated for risk measures. In [4], the authors consider a particular case of an aggregation function, proposing an evaluation method for the risk associated with the cumulative externalities or costs endured by financial institutions. Note that these evaluation methods rely on the choice of one aggregation function suitable for a specific problem.
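The aggregate-then-evaluate construction (3) can be sketched on a finite scenario set. In the illustration below (our own, not from the cited works), we take $\Lambda$ to be the total loss across agents and $\varrho_0$ to be the empirical Average Value-at-Risk; the function names are hypothetical.

```python
import numpy as np

def avar(z, alpha):
    # Empirical AVaR via the extremal formula; the minimum over eta
    # is attained at one of the sample points.
    z = np.asarray(z, dtype=float)
    cands = np.unique(z)
    vals = cands + np.array(
        [np.mean(np.maximum(z - e, 0.0)) for e in cands]) / alpha
    return vals.min()

def systemic_risk_aggregate_first(X, alpha,
                                  aggregate=lambda x: x.sum(axis=1)):
    # Approach (3): aggregate the agents' losses scenario by scenario
    # with Lambda (here: the total loss), then apply the univariate
    # measure rho_0 = AVaR_alpha.  X has shape (n_scenarios, m_agents)
    # with equally likely scenarios.
    X = np.asarray(X, dtype=float)
    return avar(aggregate(X), alpha)
```

For two equally likely scenarios with agent losses (1, 1) and (3, 1), the aggregated losses are 2 and 4, so the systemic risk at level 0.5 is 4, the worst aggregated outcome.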
The translation property for constant vectors is introduced in [5] for convex risk measures defined for bounded random vectors. This property differs from the one we propose here. The authors analyzed the maximal risk over a class of aggregation functions rather than using one specific function. We refer to [28] for an overview of the risk measures constructed this way. A similar approach is taken in [10], where law-invariant risk measures for bounded random vectors are investigated for the purpose of obtaining a Kusuoka representation. The axioms proposed in [5, 10] are closest to ours, and we provide a more detailed discussion in section 3.

Another approach to risk evaluation of complex systems consists of evaluating the risk of the individual agents first and aggregating the obtained values next. This method is used, for example, in [3] and [12]. Using the notion of acceptance sets, the systemic risk measure is defined in [3] in the following way:

\varrho[X]=\inf\Big\{\sum_{i=1}^{m}z_{i}~\Big|~z\in\mathbb{R}^{m},~\Lambda(X-z)\in\mathcal{K}\Big\}.

The measures of risk proposed in section 3 also accommodate this point of view. A further extension in [3] replaces the constant vector $z\in\mathbb{R}^{m}$ by a random vector $Y\in\mathcal{C}$, where $\mathcal{C}$ is a given set of admissible allocations. This formulation of the risk measure allows for scenario-dependent allocations, where the total amount $\sum_{i=1}^{m}z_{i}$ can be determined ahead of time while the individual allocations $z_{i}$ may be decided in the future, when the uncertainty is revealed. In [12], a set-valued counterpart of this approach is proposed by defining the systemic risk measure as the set of all vectors that make the outcome acceptable. Once the set of all acceptable allocations is constructed, one can derive a scalar-valued efficient allocation rule by minimizing a weighted sum of the components of the vectors in the set. Set-valued risk measures were proposed in [17]; see also [1, 16] for duality theory, including the dual representation for certain set-valued risk measures. In the vast majority of the literature, the systemic risk depends on the choice of the aggregation function $\Lambda$ and on how well it captures the interdependence between the components. To capture the dependence, an approach based on copula theory was put forward in [24]. There, it is assumed that independent operation does not carry systemic risk and, hence, the local risk can be optimized by each agent independently. The systemic risk measures are then constructed based on the copulas of the distributions.

Another line of work includes methods that use some multivariate counterpart of the univariate risk measures. The main notion here is the Multivariate Value-at-Risk ($\operatorname{MVaR}$) for random vectors, which is identified with the set of $p$-efficient points. Let $F_{X}(\cdot)$ be the right-continuous distribution function of a random vector $X$ with realizations in $\mathbb{R}^{m}$. A $p$-efficient point for $X$ is a point $v\in\mathbb{R}^{m}$ such that $F_{X}(v)\geq p$ and there is no point $z\leq v$ componentwise, $z\neq v$, that satisfies $F_{X}(z)\geq p$. This notion plays a key role in optimization problems with chance constraints (see, e.g., [31]). The Multivariate Value-at-Risk satisfies the properties of translation equivariance, positive homogeneity, and monotonicity. This notion is used to define the Average Value-at-Risk for multivariate distributions ($\operatorname{MAVaR}$) in [19, 23, 26]. Let $Z_{p}$ be the set of all points, each of which is component-wise larger than some $p$-efficient point:

Z_{p}=\bigcup_{s\in\operatorname{MVaR}_{p}(X)}\big(s+\mathbb{R}^{m}_{+}\big).

In [19], Lee and Prekopa define the $\operatorname{MAVaR}$ of a random vector $X$ at level $p$ as

\operatorname{MAVaR}_{p}(X)=\mathbb{E}\big(\Lambda(X)~|~X\in Z_{p}\big), (4)

where $\Lambda$ is assumed integrable with respect to $F_{X}$, i.e., $\mathbb{E}(\Lambda(X))$ is finite. It is shown in [19] that $\operatorname{MAVaR}$ is translation equivariant, positively homogeneous, and subadditive only when all of the components of the random vector are independent.
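For a finite distribution, the set of $p$-efficient points can be enumerated directly from the definition. The sketch below is our own illustration (names hypothetical); it assumes equally likely realizations and uses the fact that, for an empirical distribution, the distribution function changes value only at observed coordinates, so minimal points can be searched on the grid of observed coordinate values.

```python
import numpy as np
from itertools import product

def p_efficient_points(sample, p):
    # Enumerate MVaR_p(X) for an empirical distribution: candidate
    # points v on the grid of observed coordinates with F_X(v) >= p
    # that are minimal with respect to the componentwise order.
    sample = np.asarray(sample, dtype=float)
    axes = [np.unique(sample[:, i]) for i in range(sample.shape[1])]
    cdf = lambda v: np.mean(np.all(sample <= v, axis=1))  # empirical F_X
    feasible = [np.array(v) for v in product(*axes)
                if cdf(np.array(v)) >= p]
    minimal = [tuple(v) for v in feasible
               if not any(np.all(w <= v) and np.any(w < v)
                          for w in feasible)]
    return sorted(minimal)
```

For two equally likely realizations (0, 1) and (1, 0) and p = 0.5, both realizations are p-efficient, illustrating that $\operatorname{MVaR}_{p}(X)$ is in general a set rather than a single point.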

While the definition of $\operatorname{MAVaR}$ above is scalar-valued, in [22] the authors define a vector-valued Multivariate Average Value-at-Risk using the set of $p$-efficient points $\operatorname{MVaR}_{p}(X)$ and the extremal representation of the Average Value-at-Risk. First, for a given probability $p\in(0,1)$, we consider the vectors

\operatorname{MAVaR}_{p}(X;v)=v+\frac{1}{p}\mathbb{E}\big[(X-v)_{+}\big],

where $[(X-v)_{+}]_{i}=\max(0,X_{i}-v_{i})$, $i=1,\dots,m$. Then, the following vector-optimization problem is solved:

\operatorname{VMAVaR}_{p}(X)=\min\big\{\operatorname{MAVaR}_{p}(X;v)~:~v\in\operatorname{MVaR}_{p}(X)\big\}.

The vector-valued Multivariate Average Value-at-Risk is monotonic, positively homogeneous, and translation equivariant, but it is not subadditive. Note that in both $\operatorname{MVaR}$ and $\operatorname{MAVaR}$, one needs to use a scalarization function to obtain a scalar value for the risk.
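The vectors $\operatorname{MAVaR}_{p}(X;v)$ themselves are straightforward to evaluate on a finite sample. A small sketch in our own notation (the function name is hypothetical), for equally likely scenarios:

```python
import numpy as np

def mavar_vector(X, v, p):
    # Componentwise MAVaR_p(X; v) = v + E[(X - v)_+] / p for an
    # empirical distribution; X has shape (n_scenarios, m), v in R^m.
    X = np.asarray(X, dtype=float)
    v = np.asarray(v, dtype=float)
    return v + np.mean(np.maximum(X - v, 0.0), axis=0) / p
```

For instance, with two equally likely scenarios (0, 2) and (2, 0), the point v = (1, 1), and p = 0.5, each component evaluates to 1 + (1/2)/0.5 = 2.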

We shall compare our proposal to the aforementioned risk measures in section 4.

3 Axiomatic Approach to Risk Measures for Random Vectors

In this section, we propose a set of axioms for measures of risk for random vectors with realizations in $\mathbb{R}^{m}$. This framework is analogous to the axiomatic framework of coherent risk measures for scalar-valued random variables. In fact, if $m=1$, the proposed set of axioms coincides exactly with those in [31]. We denote by $\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$ the space of random vectors with realizations in $\mathbb{R}^{m}$, defined on $(\Omega,\mathcal{F},P)$. Throughout the paper, we shall consider a risk measure $\varrho$ for random vectors in $\mathcal{Z}$ to be a lower semi-continuous functional $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ with non-empty domain. We denote the $m$-dimensional vector whose components are all equal to one by $\mathbf{1}$, and the random vector with all realizations equal to $\mathbf{1}$ by $\mathbb{I}$.

Definition 1.

A lower semi-continuous functional $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is a coherent risk measure with preference to small outcomes, iff it satisfies the following axioms:

  • A1.

    Convexity: For all $X,Y\in\mathcal{Z}$ and $\alpha\in(0,1)$, we have:

    \varrho[\alpha X+(1-\alpha)Y]\leq\alpha\varrho[X]+(1-\alpha)\varrho[Y].

  • A2.

    Monotonicity: For all $X,Y\in\mathcal{Z}$, if $X_{i}\geq Y_{i}$ for all components $i=1,\dots,m$ $P$-a.s., then $\varrho[X]\geq\varrho[Y]$.

  • A3.

    Positive homogeneity: For all $X\in\mathcal{Z}$ and $t>0$, we have $\varrho[tX]=t\varrho[X]$.

  • A4.

    Translation equivariance: For all $X\in\mathcal{Z}$ and $a\in\mathbb{R}$, we have $\varrho[X+a\mathbb{I}]=\varrho[X]+a\varrho[\mathbb{I}]$.

A lower semi-continuous functional $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is a convex risk measure with preference to small outcomes, iff it satisfies axioms A1, A2, and A4.

The axioms of convexity and positive homogeneity are defined analogously to the corresponding properties of coherent risk measures for scalar random variables, while the random vectors are now compared component-wise in the monotonicity property. The main difference is the translation equivariance axiom. It postulates that if the random loss of every component increases by the same constant amount $a$, then the risk should increase by $a$ times the risk $\varrho[\mathbb{I}]$ of the deterministic unit loss. These axioms differ from the axioms previously proposed in the literature.
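To see the axioms at work, consider the candidate systemic measure $\varrho[X]=\operatorname{AVaR}_{\alpha}[\max_{i}X_{i}]$, an example of ours rather than one proposed in the paper: it is convex and monotone as a composition of AVaR with the componentwise maximum, positively homogeneous, and satisfies A4 with $\varrho[\mathbb{I}]=1$. The sketch below checks A3 and A4 numerically on an empirical sample (function names are ours).

```python
import numpy as np

def avar(z, alpha):
    # Empirical AVaR via the extremal formula, minimized over
    # the sample points (exact for piecewise-linear objectives).
    z = np.asarray(z, dtype=float)
    cands = np.unique(z)
    vals = cands + np.array(
        [np.mean(np.maximum(z - e, 0.0)) for e in cands]) / alpha
    return vals.min()

def rho_max(X, alpha=0.5):
    # Candidate systemic measure: AVaR of the worst agent's loss
    # in each scenario; X has shape (n_scenarios, m_agents).
    return avar(np.max(np.asarray(X, dtype=float), axis=1), alpha)
```

One checks, for instance, that rho_max(X + a) = rho_max(X) + a (axiom A4 with rho_max applied to the deterministic unit loss equal to 1) and rho_max(t X) = t rho_max(X) for t > 0 (axiom A3).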

3.1 Dual representation

In order to derive a dual representation of the multivariate risk measure, we pair the space of random vectors $\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$, $p\in[1,\infty)$, with the space $\mathcal{Z}^{*}=\mathcal{L}_{q}(\Omega,\mathcal{F},P;\mathbb{R}^{m})$, where $q\in(1,\infty]$ is such that $\frac{1}{p}+\frac{1}{q}=1$, with $q=\infty$ for $p=1$. For $X\in\mathcal{Z}$ and $\zeta\in\mathcal{Z}^{*}$, the bilinear form $\langle\cdot,\cdot\rangle$ on the product space $\mathcal{Z}\times\mathcal{Z}^{*}$ is defined as follows:

\langle\zeta,X\rangle=\int_{\Omega}\langle\zeta(\omega),X(\omega)\rangle\,dP(\omega).

The Fenchel conjugate function $\varrho^{*}:\mathcal{Z}^{*}\to\mathbb{R}\cup\{+\infty\}$ of the risk measure $\varrho$ is given by

\varrho^{*}[\zeta]=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,X\rangle-\varrho[X]\big\},

and the conjugate of $\varrho^{*}$ (the bi-conjugate function) is

\varrho^{**}[X]=\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]\big\}.

The Fenchel–Moreau theorem implies that if $\varrho[\cdot]$ is convex and lower semicontinuous, then $\varrho^{**}=\varrho$ and

\varrho[X]=\sup_{\zeta\in\tilde{\mathcal{A}}}\big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]\big\}, (5)

where $\tilde{\mathcal{A}}=\operatorname{dom}(\varrho^{*})$ is the domain of the conjugate function $\varrho^{*}$. Based on the Fenchel–Moreau theorem and the axioms proposed in this paper, we prove the following theorem.

Theorem 1.

Suppose $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is a convex and lower semicontinuous risk functional. Then the following holds:

  • (i)

    Property A2 is satisfied if and only if $\zeta\geq 0$ a.s. for all $\zeta$ in the domain of $\varrho^{*}$.

  • (ii)

    Property A3 is satisfied if and only if $\varrho^{*}$ is the indicator function of the set $\mathcal{A}=\operatorname{dom}(\varrho^{*})$, i.e.,

    \varrho[X]=\sup_{\zeta\in\mathcal{A}}\langle\zeta,X\rangle. (6)

  • (iii)

    Property A4 is satisfied if and only if $\varrho[\mathbb{I}]=\langle\mathbf{1},\mu_{\zeta}\rangle$ for all $\zeta\in\mathcal{A}$, where $\mu_{\zeta}=\int_{\Omega}\zeta(\omega)\,P(d\omega)$.

Proof.

Since $\varrho[\cdot]$ is convex and lower semicontinuous, and we have assumed that it has non-empty domain, the representation (5) holds by virtue of the Fenchel–Moreau theorem.

(i) Suppose $\varrho$ satisfies the monotonicity condition. Assume that $\zeta_{i}(\omega)<0$ for $\omega\in\Delta\in\mathcal{F}$ with $P(\Delta)>0$ for some component $i\in\{1,\dots,m\}$. Define $\bar{X}_{i}$ equal to the indicator function of the event $\Delta$ and $\bar{X}_{j}=0$ for $j\neq i$, $j=1,\dots,m$. Take any $X$ such that $\varrho[X]$ is finite and define $X_{t}:=X-t\bar{X}$. Then for $t\geq 0$, we have $X\succeq X_{t}$, and $\varrho[X]\geq\varrho[X_{t}]$ by monotonicity. Consequently,

\varrho^{*}[\zeta]\geq\sup_{t\in\mathbb{R}_{+}}\big\{\langle\zeta,X_{t}\rangle-\varrho[X_{t}]\big\}\geq\sup_{t\in\mathbb{R}_{+}}\big\{\langle\zeta,X\rangle-t\langle\zeta,\bar{X}\rangle-\varrho[X]\big\}
=\sup_{t\in\mathbb{R}_{+}}\Big\{\langle\zeta,X\rangle-t\int_{\Delta}\zeta_{i}(\omega)\,P(d\omega)-\varrho[X]\Big\}=+\infty.

It follows that $\varrho^{*}[\zeta]=+\infty$ for every $\zeta\in\mathcal{Z}^{*}$ with at least one negative component; thus $\zeta\notin\operatorname{dom}\varrho^{*}$. Conversely, suppose that $\zeta\in\mathcal{Z}^{*}$ has realizations in $\mathbb{R}^{m}$ with nonnegative components $P$-a.s. Then, whenever $X\succeq X^{\prime}$, we have:

\langle\zeta,X\rangle=\int_{\Omega}\langle\zeta(\omega),X(\omega)\rangle\,dP(\omega)\geq\int_{\Omega}\langle\zeta(\omega),X^{\prime}(\omega)\rangle\,dP(\omega)=\langle\zeta,X^{\prime}\rangle.

Consequently,

\varrho[X]=\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]\big\}\geq\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X^{\prime}\rangle-\varrho^{*}[\zeta]\big\}=\varrho[X^{\prime}].

Hence, the monotonicity condition holds.

(ii) Suppose the positive homogeneity property holds, i.e., $\varrho[tX]=t\varrho[X]$ for all $X\in\mathcal{Z}$ and $t>0$. Then for any fixed $t>0$, we get

\varrho^{*}[\zeta]=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,X\rangle-\varrho[X]\big\}=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,tX\rangle-\varrho[tX]\big\}=\sup_{X\in\mathcal{Z}}t\big\{\langle\zeta,X\rangle-\varrho[X]\big\}=t\varrho^{*}[\zeta].

Hence, if $\varrho^{*}[\zeta]$ is finite, then $\varrho^{*}[\zeta]=0$, as claimed. Conversely, if $\varrho[X]=\sup_{\zeta\in\operatorname{dom}\varrho^{*}}\langle\zeta,X\rangle$, then $\varrho$ is positively homogeneous as the support function of a convex set.

(iii) Suppose the translation property is satisfied, i.e., $\varrho[X+k\mathbb{I}]=\varrho[X]+k\varrho[\mathbb{I}]$ for any $X\in\mathcal{Z}$ and constant $k\in\mathbb{R}$. Then for any $k\in\mathbb{R}$ and $\zeta\in\mathcal{Z}^{*}$, we get:

\varrho^{*}[\zeta]=\sup_{X\in\mathcal{Z}}\big\{\langle\zeta,X+k\mathbb{I}\rangle-\varrho[X+k\mathbb{I}]\big\}=\sup_{X\in\mathcal{Z}}\Big\{\int_{\Omega}\langle\zeta(\omega),X(\omega)+k\mathbf{1}\rangle\,P(d\omega)-\varrho[X]-k\varrho[\mathbb{I}]\Big\}
=\sup_{X\in\mathcal{Z}}\Big\{\langle\zeta,X\rangle+k\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)-\varrho[X]-k\varrho[\mathbb{I}]\Big\}=\varrho^{*}[\zeta]+k\Big(\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)-\varrho[\mathbb{I}]\Big).

If $\varrho^{*}[\zeta]$ is finite, then $\varrho[\mathbb{I}]=\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)$.

Denoting $\mu_{\zeta}=\int_{\Omega}\zeta(\omega)\,P(d\omega)$, we obtain

\varrho[\mathbb{I}]=\Big\langle\mathbf{1},\int_{\Omega}\zeta(\omega)\,P(d\omega)\Big\rangle=\langle\mathbf{1},\mu_{\zeta}\rangle\quad\text{for all }\zeta\in\mathcal{A}. (7)

Conversely, suppose $\varrho[\mathbb{I}]=\langle\mathbf{1},\mu_{\zeta}\rangle$ for all $\zeta\in\operatorname{dom}\varrho^{*}$. Then for any $X\in\mathcal{Z}$ and $k\in\mathbb{R}$:

\varrho[X+k\mathbb{I}]=\sup_{\zeta\in\mathcal{Z}^{*}}\big\{\langle\zeta,X+k\mathbb{I}\rangle-\varrho^{*}[\zeta]\big\}=\sup_{\zeta\in\mathcal{Z}^{*}}\Big\{\int_{\Omega}\langle\zeta(\omega),X(\omega)+k\mathbf{1}\rangle\,P(d\omega)-\varrho^{*}[\zeta]\Big\}
=\sup_{\zeta\in\mathcal{Z}^{*}}\Big\{\langle\zeta,X\rangle+k\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)-\varrho^{*}[\zeta]\Big\}=\sup_{\zeta\in\mathcal{Z}^{*}}\Big\{\langle\zeta,X\rangle-\varrho^{*}[\zeta]+k\varrho[\mathbb{I}]\Big\}
=\varrho[X]+k\varrho[\mathbb{I}].

Hence, the translation property is satisfied. ∎

It follows from Theorem 1 that if a risk measure $\varrho$ is lower semicontinuous and satisfies the axioms of convexity, monotonicity, positive homogeneity, and translation equivariance, then representation (6) holds with the set $\mathcal{A}$ defined as:

\mathcal{A}=\Big\{\zeta\in\mathcal{Z}^{*}~:~\langle\mathbf{1},\mu_{\zeta}\rangle=\varrho[\mathbb{I}],~\zeta\succeq 0~\text{a.s.}\Big\},\quad\text{where }\mu_{\zeta}=\int_{\Omega}\zeta(\omega)\,dP(\omega).
Corollary 1.

If a risk measure $\varrho[\cdot]$ is coherent, then

\varrho[0]=0\quad\text{and}\quad\mathcal{A}=\partial\varrho[0].

Proof.

If $\varrho$ is positively homogeneous, then $\varrho$ is the support function of $\mathcal{A}=\operatorname{dom}(\varrho^{*})$ and $\varrho^{*}$ vanishes on $\mathcal{A}$. Then

\varrho[0]=\sup_{\zeta\in\mathcal{A}}\big\{\langle\zeta,0\rangle-\varrho^{*}[\zeta]\big\}=0.

To show the form of the set $\mathcal{A}$, recall that

\partial\varrho[0]=\{\zeta\in\mathcal{Z}^{*}:\langle\zeta,X-0\rangle\leq\varrho[X]-\varrho[0]\text{ for all }X\in\mathcal{Z}\}=\{\zeta\in\mathcal{Z}^{*}:\langle\zeta,X\rangle\leq\varrho[X]\text{ for all }X\in\mathcal{Z}\}.

Hence, for all $\zeta\in\mathcal{A}$, (6) implies that $\zeta\in\partial\varrho[0]$. On the other hand, if $\zeta\in\partial\varrho[0]$, then $\zeta\in\mathcal{A}$ by the definition of a support function. ∎

We shall consider further the following property.

  • Normalization: a coherent measure of risk $\varrho:\mathcal{Z}\to\mathbb{R}\cup\{+\infty\}$ is normalized iff $\varrho[\mathbb{I}]=1$.

Corollary 2.

For a normalized coherent measure of risk $\varrho[\cdot]$, we have $\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,P(d\omega)=1$ for all $\zeta\in\mathcal{A}$.

Proof.

It follows from equation (7) that $\langle\mathbf{1},\mu_{\zeta}\rangle=1$, as stated. This entails that, for every $\zeta\in\mathcal{A}$, $\zeta P$ can be interpreted as a probability measure on the space $\Omega\times\{1,2,\dots,m\}$. ∎

In the paper [5], the authors have adopted the following translation axiom:

T. For any constant $\alpha\in\mathbb{R}$ and any vector $e^{i}$, $i=1,\dots,m$, whose $i$-th component is 1 and all other components are zero, we have $\varrho[X+\alpha e^{i}]=\varrho[X]+\alpha$.

Theorem 2.

Assume that $\varrho$ is a proper lower-semicontinuous convex risk functional. Property T holds if and only if $\int_{\Omega}\zeta_{i}(\omega)\,dP(\omega)=1$ for all $i=1,\dots,m$ and all $\zeta\in\operatorname{dom}\varrho^{*}$. Furthermore, if property T holds, then

\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle\,dP(\omega)=\varrho[\mathbb{I}]=m,\quad\int_{\Omega}\zeta(\omega)\,dP(\omega)=\mathbf{1},\quad\text{for all }\zeta\in\operatorname{dom}\varrho^{*}; (8)

\varrho[X+a]=\varrho[X]+\varrho[a]\quad\text{for all }X\in\mathcal{Z},\;a\in\mathbb{R}^{m}. (10)
Proof.

Suppose T holds. Then for a random vector XX in the domain of ϱ\varrho and every ζ𝒵\zeta\in\mathcal{Z}^{*}, we have

ϱ[ζ]\displaystyle\varrho^{*}[\zeta]\geq supα{ζ,X+αeiϱ[X+αei]}=supα{Ωαζi(ω)P(dω)+ζ,Xϱ[X]α}\displaystyle\sup_{\alpha\in\mathbb{R}}\big{\{}\langle\zeta,X+\alpha e^{i}\rangle-\varrho[X+\alpha e^{i}]\big{\}}=\sup_{\alpha\in\mathbb{R}}\Big{\{}\int_{\Omega}\alpha\zeta_{i}(\omega)P(d\omega)+\langle\zeta,X\rangle-\varrho[X]-\alpha\Big{\}}
=supαα{Ωζi(ω)P(dω)1}+ζ,Xϱ[X].\displaystyle=\sup_{\alpha\in\mathbb{R}}\alpha\Big{\{}\int_{\Omega}\zeta_{i}(\omega)P(d\omega)-1\Big{\}}+\langle\zeta,X\rangle-\varrho[X].

It follows that ϱ[ζ]=+\varrho^{*}[\zeta]=+\infty for any ζ𝒵\zeta\in\mathcal{Z}^{*} such that Ωζi(ω)P(dω)1\int_{\Omega}\zeta_{i}(\omega)P(d\omega)\not=1. This entails that for every constant vector ama\in\mathbb{R}^{m}, the risk value is

ϱ[a]=ϱ[i=1maiei]=i=1mai.\varrho[a]=\varrho\Big{[}\sum_{i=1}^{m}a_{i}e^{i}\Big{]}=\sum_{i=1}^{m}a_{i}. (12)

The other direction is straightforward. Indeed,

ϱ[X+αei]\displaystyle\varrho[X+\alpha e^{i}] =supζ𝒵{ζ,X+αeiϱ[ζ]}=supζ𝒵{Ωαζi(ω)P(dω)+ζ,Xϱ[ζ]}\displaystyle=\sup_{\zeta\in\mathcal{Z}^{*}}\big{\{}\langle\zeta,X+\alpha e^{i}\rangle-\varrho^{*}[\zeta]\big{\}}=\sup_{\zeta\in\mathcal{Z}^{*}}\Big{\{}\int_{\Omega}\alpha\zeta_{i}(\omega)P(d\omega)+\langle\zeta,X\rangle-\varrho^{*}[\zeta]\Big{\}}
=supζ𝒵{α+ζ,Xϱ[ζ]}=ϱ[X]+α.\displaystyle=\sup_{\zeta\in\mathcal{Z}^{*}}\Big{\{}\alpha+\langle\zeta,X\rangle-\varrho^{*}[\zeta]\Big{\}}=\varrho[X]+\alpha.

Property T also implies that

Ωζ(ω)𝑑P(ω)=𝟏,Ω𝟏,ζ(ω)𝑑P(ω)=m=ϱ[𝕀].\int_{\Omega}\zeta(\omega)dP(\omega)=\mathbf{1},\quad\int_{\Omega}\langle\mathbf{1},\zeta(\omega)\rangle dP(\omega)=m=\varrho[\mathbb{I}].

Due to equation (12), for all X𝒵X\in\mathcal{Z} and aa\in\mathbb{R}, we obtain

ϱ[X+a]=ϱ[X+i=1maiei]=ϱ[X]+i=1mai=ϱ[X]+ϱ[a],\displaystyle\varrho[X+a]=\varrho[X+\sum_{i=1}^{m}a_{i}e^{i}]=\varrho[X]+\sum_{i=1}^{m}a_{i}=\varrho[X]+\varrho[a],

which completes the proof. ∎

We also observe that a particular implication of Theorem 2 is that risk measures satisfying property T are linear on constant vectors.

Corollary 3.

If a coherent measure of risk ϱ[]\varrho[\cdot] satisfies property T, then it is linear on constant vectors.

Proof.

Indeed, a special case of (12) shows that

\varrho[a+b]=\sum_{i=1}^{m}(a_{i}+b_{i})=\varrho[a]+\varrho[b]\quad\text{for all }a,b\in\mathbb{R}^{m}.

This combined with the fact that ϱ[0]=0\varrho[0]=0 and the positive homogeneity of the risk measure proves the statement. ∎

In [10], the authors have analyzed law-invariant risk measures for bounded random vectors. They have introduced a set of axioms that are closest to ours: their axioms include ours together with the two normalization properties \varrho[\mathbb{I}]=1 and \varrho[0]=0. We do not need these normalization properties to establish the dual representation for general random vectors with finite p-th moments, p\geq 1; we derive that the risk of the deterministic zero vector is zero from the dual representation. The property of strong coherence of risk measures, introduced in that paper, implies in particular that \varrho[a+b]=\varrho[a]+\varrho[b], which appears to be a strong assumption.

3.2 Risk measures obtained via sets of linear scalarizations

Suppose we have a random vector X𝒵=p(Ω,,P;m)X\in\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}) with a right-continuous distribution function F(X;)F(X;\cdot) and marginal distribution function Fi(Xi;)F_{i}(X_{i};\cdot) of each component i=1,,mi=1,\dots,m. We consider linear scalarization using vectors taken from the simplex

Sm+={cm|i=1mci=1,ci0i=1,,m}.S_{m}^{+}=\{c\in\mathbb{R}^{m}~{}|~{}\sum_{i=1}^{m}c_{i}=1,~{}c_{i}\geq 0~{}\forall i=1,\dots,m\}.

Let ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} be a lower semi-continuous risk measure. For any fixed set SSm+S\subset S_{m}^{+}, we define the risk measure

ϱS[X]=ϱ[XS],where XS(ω)=maxcScX(ω),ωΩ.\varrho_{S}[X]=\varrho[X_{S}],\quad\text{where }X_{S}(\omega)=\max_{c\in S}c^{\top}X(\omega),\;\;\omega\in\Omega. (13)

It is straightforward to see that XSp(Ω,,P)X_{S}\in\mathcal{L}_{p}(\Omega,\mathcal{F},P) and hence, the risk measure ϱS[]\varrho_{S}[\cdot] is well-defined on p(Ω,,P;m).\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}).
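As a concrete illustration of (13), the following sketch estimates the systemic risk by Monte Carlo when ϱ is taken to be the Average Value-at-Risk computed from an equally weighted sample; the finite scalarization set S, the sample generator, and all function names are illustrative assumptions, not part of the formal development.

```python
import random

def avar(sample, alpha):
    """Empirical Average Value-at-Risk at level alpha: the mean of the
    worst alpha-fraction of outcomes (large realizations are losses).
    Exact when alpha * len(sample) is an integer."""
    s = sorted(sample, reverse=True)
    k = max(1, round(alpha * len(s)))
    return sum(s[:k]) / k

def rho_S(scenarios, S, alpha):
    """Systemic risk of the form (13): scalarize each scenario by the
    worst c in the finite set S, then apply the univariate measure."""
    X_S = [max(sum(ci * xi for ci, xi in zip(c, x)) for c in S)
           for x in scenarios]
    return avar(X_S, alpha)

random.seed(0)
# two agents, 1000 scenarios of random losses
scenarios = [(random.gauss(1.0, 0.5), random.gauss(2.0, 1.0))
             for _ in range(1000)]
S = [(0.5, 0.5), (0.2, 0.8), (0.8, 0.2)]  # finite subset of the simplex
risk = rho_S(scenarios, S, alpha=0.05)
```

Enlarging S can only increase X_S pointwise, so the resulting risk value is monotone with respect to the scalarization set.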

Theorem 3.

If ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} is a coherent (convex) risk measure, then for any set SSm+S\subset S_{m}^{+}, the risk measure ϱS[X]=ϱ[XS]\varrho_{S}[X]=\varrho[X_{S}] is coherent (convex) according to Definition 1.

Proof.

For two random vectors X,Yp(Ω,,P;m)X,Y\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}) with XYX\leq Y component-wise a.s., we have cXcYc^{\top}X\leq c^{\top}Y a.s. for all cSm+c\in S_{m}^{+}. This implies that maxcScXmaxcScY\max_{c\in S}c^{\top}X\leq\max_{c\in S}c^{\top}Y a.s. and, hence, ϱ[XS]ϱ[YS]\varrho[X_{S}]\leq\varrho[Y_{S}]. Thus, the monotonicity axiom is satisfied. Given two random vectors X,Y𝒵X,Y\in\mathcal{Z} and α[0,1]\alpha\in[0,1], consider their convex combination αX+(1α)Y\alpha X+(1-\alpha)Y. Due to the convexity and monotonicity of ϱ[]\varrho[\cdot], we have

ϱS[αX+(1α)Y]\displaystyle\varrho_{S}[\alpha X+(1-\alpha)Y] =ϱ[maxcSc(αX+(1α)Y)]ϱ[αmaxcScX+(1α)maxcScY]\displaystyle=\varrho[\max_{c\in S}c^{\top}(\alpha X+(1-\alpha)Y)]\leq\varrho[\alpha\max_{c\in S}c^{\top}X+(1-\alpha)\max_{c\in S}c^{\top}Y]
αϱ[maxcScX]+(1α)ϱ[maxcScY]=αϱS[X]+(1α)ϱS[Y].\displaystyle\leq\alpha\varrho[\max_{c\in S}c^{\top}X]+(1-\alpha)\varrho[\max_{c\in S}c^{\top}Y]=\alpha\varrho_{S}[X]+(1-\alpha)\varrho_{S}[Y].

Thus, the convexity axiom is satisfied. Given a random vector X𝒵X\in\mathcal{Z} and a constant tt\in\mathbb{R}, it follows:

\varrho_{S}[X+t\mathbb{I}]=\varrho[\max_{c\in S}c^{\top}(X+t\mathbb{I})]=\varrho[\max_{c\in S}(c^{\top}X+tc^{\top}\mathbb{I})]=\varrho[\max_{c\in S}c^{\top}X+t]=\varrho[X_{S}]+t=\varrho_{S}[X]+t.

Positive homogeneity follows in a straightforward manner. ∎

If the set SS is a singleton, we obtain the following.

Corollary 4.

Let ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} be a coherent (convex) risk measure. For any vector c𝒮m+c\in\mathcal{S}_{m}^{+}, the risk measure ϱc[X]=ϱ[cX]\varrho_{c}[X]=\varrho[c^{\top}X] is coherent (convex) according to Definition 1.

Using the dual representation of the coherent risk measure ϱ\varrho for scalar-valued random variables, we obtain the following:

ϱ[cX]\displaystyle\varrho[c^{\top}X] =supξ𝒜ϱΩξ(ω)cX(ω)P(dω)=supζ𝒜~Ωζ(ω)X(ω)P(dω)with 𝒜~={ξc:ξ𝒜ϱ}\displaystyle=\sup_{\xi\in\mathcal{A}_{\varrho}}\int_{\Omega}\xi(\omega)c^{\top}X(\omega)P(d\omega)=\sup_{\zeta\in\tilde{\mathcal{A}}}\int_{\Omega}\zeta(\omega)X(\omega)P(d\omega)\quad\text{with }\tilde{\mathcal{A}}=\{\xi c:\xi\in\mathcal{A}_{\varrho}\} (14)

Additionally, a measurable selection \nu_{X}(\omega)\in\arg\max_{c\in S}c^{\top}X(\omega) exists by the Kuratowski-Ryll-Nardzewski measurable selection theorem; we shall use the notation \nu_{X}\in S for any such selection.

ϱ[maxcScX]\displaystyle\varrho[\max_{c\in S}c^{\top}X] =supξ𝒜ϱΩξ(ω)maxcScX(ω)P(dω)=supξ𝒜ϱΩξ(ω)νXX(ω)P(dω)\displaystyle=\sup_{\xi\in\mathcal{A}_{\varrho}}\int_{\Omega}\xi(\omega)\max_{c\in S}c^{\top}X(\omega)P(d\omega)=\sup_{\xi\in\mathcal{A}_{\varrho}}\int_{\Omega}\xi(\omega)\nu_{X}^{\top}X(\omega)P(d\omega)
=supζ𝒜~Ωζ(ω)X(ω)P(dω)with 𝒜~={ξνX:ξ𝒜ϱ}\displaystyle=\sup_{\zeta\in\tilde{\mathcal{A}}^{\prime}}\int_{\Omega}\zeta(\omega)X(\omega)P(d\omega)\quad\text{with }\tilde{\mathcal{A}}^{\prime}=\{\xi\nu_{X}:\xi\in\mathcal{A}_{\varrho}\}

Notice that the representations just derived have the form of the dual representation in (6); however, we have not established that \tilde{\mathcal{A}} (respectively, \tilde{\mathcal{A}}^{\prime}) coincides with the domain of the conjugate of the corresponding risk measure.

We observe the following properties of the aggregation by a single linear scalarization.

Proposition 1.

Given a coherent risk measure \varrho:\mathcal{Z}\to\bar{\mathbb{R}} and a scalarization vector c\in\mathcal{S}^{+}_{m}, for any random vector X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}), the risk of the vector measured by \varrho[c^{\top}X] does not exceed the maximal risk of its components measured by \varrho[\cdot]. Furthermore, the following relation between the aggregation methods holds: \varrho[c^{\top}X]\leq c^{\top}\varrho[X], where \varrho[X] denotes the vector (\varrho[X_{1}],\dots,\varrho[X_{m}]).

Proof.

The dual representation implies the following:

ϱ[cX]\displaystyle\varrho[c^{\top}X] =supξ𝒜Ωi=1mciξ(ω)Xi(ω)P(dω)=supξ𝒜i=1mciΩξ(ω)Xi(ω)P(dω)\displaystyle=\sup_{\xi\in\mathcal{A}}\int_{\Omega}\sum_{i=1}^{m}c_{i}\xi(\omega)X_{i}(\omega)P(d\omega)=\sup_{\xi\in\mathcal{A}}\sum_{i=1}^{m}c_{i}\int_{\Omega}\xi(\omega)X_{i}(\omega)P(d\omega)
i=1msupξ𝒜ciΩξ(ω)Xi(ω)P(dω)=i=1mciϱ[Xi]max1imϱ[Xi].\displaystyle\leq\sum_{i=1}^{m}\sup_{\xi\in\mathcal{A}}c_{i}\int_{\Omega}\xi(\omega)X_{i}(\omega)P(d\omega)=\sum_{i=1}^{m}c_{i}\varrho[X_{i}]\leq\max_{1\leq i\leq m}\varrho[X_{i}].

The penultimate relation implies the second claim of the proposition. ∎
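The chain of inequalities in Proposition 1 can be checked numerically; a minimal sketch, again taking ϱ to be the empirical Average Value-at-Risk (a choice made for illustration only):

```python
import random

def avar(sample, alpha):
    """Empirical AVaR: mean of the worst alpha-fraction (losses);
    exact when alpha * len(sample) is an integer."""
    s = sorted(sample, reverse=True)
    k = max(1, round(alpha * len(s)))
    return sum(s[:k]) / k

random.seed(1)
alpha, c = 0.1, (0.3, 0.7)
X = [(random.gauss(0.0, 1.0), random.gauss(0.5, 2.0)) for _ in range(2000)]

scalarized = avar([c[0] * x1 + c[1] * x2 for x1, x2 in X], alpha)  # rho[c^T X]
component = [avar([x[i] for x in X], alpha) for i in range(2)]     # rho[X_i]
weighted = c[0] * component[0] + c[1] * component[1]               # c^T rho[X]
# Proposition 1: scalarized <= weighted <= max(component)
```

The first inequality is the subadditivity and positive homogeneity of the coherent measure; the second holds because c lies in the simplex.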

We also show the following useful result, which implies that we can use statistical methods to estimate the systemic risk measure ϱS[X].\varrho_{S}[X].

Proposition 2.

If ϱ:p(Ω,,P){+}\varrho:\mathcal{L}_{p}(\Omega,\mathcal{F},P)\to\mathbb{R}\cup\{+\infty\} is a law-invariant risk measure, then for any set SSm+S\subset S_{m}^{+}, the systemic risk measure ϱS[X]=ϱ[XS]\varrho_{S}[X]=\varrho[X_{S}] is law-invariant.

Proof.

It is sufficient to show that for two random vectors X and Y which have the same distribution, the respective random variables X_{S} and Y_{S} have the same distribution. Since X and Y have the same distribution, for any real number r, the following relations hold:

P(X_{S}\leq r)=P(c^{\top}X\leq r\ \text{for all }c\in S)=P(c^{\top}Y\leq r\ \text{for all }c\in S)=P(Y_{S}\leq r),

which shows the equality of the distribution functions. ∎

3.3 Systemic Risk Measures Obtained via Nonlinear Scalarization

The second aggregation method that falls within the scope of our axiomatic framework is that of nonlinear scalarization. This class of risk measures cannot be obtained within the framework of aggregation by non-linear functions, and it does not fit the axiomatic approaches in [7] or in [5]. Furthermore, we shall see that this method of evaluating systemic risk makes it possible to maintain fairness between the system's participants.

We define Ωm={1,,m}\Omega_{m}=\{1,\dots,m\} and consider a probability space (Ωm,c,c)(\Omega_{m},\mathcal{F}_{c},c), where cSm+c\in S^{+}_{m} and c\mathcal{F}_{c} contains all subsets of Ωm\Omega_{m}. We view cc as a probability mass function of the space Ωm\Omega_{m}. Given an mm-dimensional random vector X𝒵=p(Ω,,P;m)X\in\mathcal{Z}=\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}) and a collection of mm univariate measures of risk ϱi:(Ω,,P)\varrho_{i}:(\Omega,\mathcal{F},P)\to\mathbb{R}, i=1,,m,i=1,\dots,m, we define the random variable XRX_{R} on the space Ωm\Omega_{m} as follows:

X_{R}(i)=\varrho_{i}[X_{i}],\quad i=1,\dots,m. (15)

Choosing a scalar measure of risk \varrho_{0}:\mathcal{L}_{p}(\Omega_{m},\mathcal{F}_{c},c)\to\mathbb{R}, we define the measure of systemic risk \varrho_{s}:\mathcal{Z}\to\mathbb{R} as follows:

ϱs[X]=ϱ0[XR].\varrho_{s}[X]=\varrho_{0}[X_{R}]. (16)

This is a nonlinear aggregation of the individual risks ϱ[Xi]\varrho[X_{i}], hence this approach falls within the category of methods that evaluate the risk of each component first and then aggregate their values. The measure ϱs[X]\varrho_{s}[X] satisfies the axioms postulated for systemic risk measures.
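A minimal sketch of the construction (15)-(16), with all ϱ_i taken to be the empirical Average Value-at-Risk and ϱ_0 an AVaR on the weighted discrete space (Ω_m, F_c, c); the levels, weights, and function names are illustrative assumptions:

```python
import random

def avar_discrete(values, probs, alpha):
    """AVaR via the Rockafellar-Uryasev formula; the minimum of the
    convex piecewise-linear function of eta is attained at a data point."""
    return min(eta + sum(p * max(v - eta, 0.0) for v, p in zip(values, probs)) / alpha
               for eta in values)

def avar_sample(sample, alpha):
    """Empirical AVaR of an equally weighted sample."""
    n = len(sample)
    return avar_discrete(sample, [1.0 / n] * n, alpha)

random.seed(2)
m, c = 3, (0.2, 0.3, 0.5)
X = [(random.gauss(0, 1), random.gauss(1, 1), random.gauss(2, 3))
     for _ in range(500)]

# (15): the discrete random variable X_R on Omega_m of individual risks
X_R = [avar_sample([x[i] for x in X], 0.1) for i in range(m)]
# (16): systemic risk as rho_0 applied to X_R with probability mass c
rho_s = avar_discrete(X_R, c, 0.25)
```

Since ϱ_0 is an AVaR, the resulting systemic risk lies between the weighted average of the individual risks and their maximum.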

Theorem 4.

Suppose the univariate measures of risk ϱi:(Ω,,P)\varrho_{i}:(\Omega,\mathcal{F},P)\to\mathbb{R}, i=1,,mi=1,\dots,m are coherent and let ϱs[]\varrho_{s}[\cdot] be defined as in (16). Then ϱs[]\varrho_{s}[\cdot] satisfies properties (A1)–(A4).

Proof.

(i) Given any X,Y𝒵X,Y\in\mathcal{Z} and α(0,1)\alpha\in(0,1), we consider the random vector Z=αX+(1α)YZ=\alpha X+(1-\alpha)Y. We have ϱi[Zi]αϱi[Xi]+(1α)ϱi[Yi]\varrho_{i}[Z_{i}]\leq\alpha\varrho_{i}[X_{i}]+(1-\alpha)\varrho_{i}[Y_{i}], i=1,,mi=1,\dots,m. Defining a random variable ZZ^{\prime} on Ωm\Omega_{m} by setting Z(i)=αϱi[Xi]+(1α)ϱi[Yi]Z^{\prime}(i)=\alpha\varrho_{i}[X_{i}]+(1-\alpha)\varrho_{i}[Y_{i}], we obtain that ZRZZ_{R}\leq Z^{\prime}. Using the monotonicity and convexity of ϱ0,\varrho_{0}, we obtain

ϱ0[ZR]ϱ0[Z]αϱ0[XR]+(1α)ϱ0[YR].\varrho_{0}[Z_{R}]\leq\varrho_{0}[Z^{\prime}]\leq\alpha\varrho_{0}[X_{R}]+(1-\alpha)\varrho_{0}[Y_{R}].

Hence ϱs[αX+(1α)Y]αϱs[X]+(1α)ϱs[Y]\varrho_{s}[\alpha X+(1-\alpha)Y]\leq\alpha\varrho_{s}[X]+(1-\alpha)\varrho_{s}[Y].

(ii) Suppose the vectors X,Y𝒵X,Y\in\mathcal{Z} satisfy XYX\leq Y a.s. This implies that XiYiX_{i}\leq Y_{i} a.s. and, hence, ϱi[Xi]ϱi[Yi]\varrho_{i}[X_{i}]\leq\varrho_{i}[Y_{i}] for all i=1,,mi=1,\dots,m by the monotonicity property of ϱi\varrho_{i}. This further implies that XRYRX_{R}\leq Y_{R}, entailing that ϱ0[XR]ϱ0[YR]\varrho_{0}[X_{R}]\leq\varrho_{0}[Y_{R}]. Thus (A2) is satisfied.

(iii) Given a random vector X𝒵X\in\mathcal{Z}, t>0t>0, we have

\varrho_{s}[tX]=\varrho_{0}[(tX)_{R}]=\varrho_{0}[t(X_{R})]=t\varrho_{0}[X_{R}]=t\varrho_{s}[X],

where we have used the positive homogeneity property of ϱi[]\varrho_{i}[\cdot] for all i=0,1,,mi=0,1,\dots,m.

(iv) Given a random vector X𝒵X\in\mathcal{Z} and a constant aa, we have (X+a𝕀)R(i)=ϱi[Xi+a]=ϱi[Xi]+a(X+a\mathbb{I})_{R}(i)=\varrho_{i}[X_{i}+a]=\varrho_{i}[X_{i}]+a. Hence ϱ0[(X+a𝕀)R]=ϱ0[XR]+a.\varrho_{0}[(X+a\mathbb{I})_{R}]=\varrho_{0}[X_{R}]+a. This shows property (A4). ∎

Examples

A. Systemic Mean-AVaR measure

Consider the case when \varrho_{0} is a convex combination of the expected value and the Average Value-at-Risk at some level \alpha, and suppose all components of X are evaluated by the same measure of risk \varrho[\cdot]. Then, for any \kappa\in[0,1] and c\in S^{+}_{m}, we have:

ϱs[X]\displaystyle\varrho_{s}[X] =ϱ0[XR]=(1κ)𝔼[XR]+κAVaRα[XR]\displaystyle=\varrho_{0}[X_{R}]=(1-\kappa)\mathbb{E}[X_{R}]+\kappa\operatorname{AVaR}_{\alpha}[X_{R}]
=(1κ)i=1mciϱ[Xi]+κinfη{η+1αi=1mci(ϱ[Xi]η)+}\displaystyle=(1-\kappa)\sum_{i=1}^{m}c_{i}\varrho[X_{i}]+\kappa\inf_{\eta\in\mathbb{R}}\Big{\{}\eta+\frac{1}{\alpha}\sum_{i=1}^{m}c_{i}(\varrho[X_{i}]-\eta)_{+}\Big{\}}

Here the infimum with respect to \eta\in\mathbb{R} is taken over a function of the individual risks of the components \varrho[X_{i}], i=1,\dots,m. Hence, this method of aggregation imposes additional penalties on those components whose risk exceeds the optimal threshold \eta.

B. Systemic Mean-Semideviation measure

Now let \varrho_{0} be a Mean-Upper-Semideviation risk measure of the first order with parameter \kappa\in[0,1], and suppose all components of X are evaluated by the same measure of risk \varrho[\cdot]. Then the measure of systemic risk can be written as:

\varrho_{s}[X]=\varrho_{0}[X_{R}]=\sum_{i=1}^{m}c_{i}\varrho[X_{i}]+\kappa\sum_{i=1}^{m}c_{i}\Big{(}\varrho[X_{i}]-\sum_{j=1}^{m}c_{j}\varrho[X_{j}]\Big{)}_{+}

This representation shows that the risk measure is an aggregation of the individual risks of the components: it compares the risk of every component with the weighted average risk of all components and penalizes the upward deviation of the individual risk from that average.

The presented method of non-linear aggregation maintains fairness within the system and encourages the components to operate at similar levels of risk.
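The examples above admit direct transcriptions. A sketch of Example B, assuming the individual risk values ϱ[X_i] have already been computed (the numbers below are hypothetical):

```python
def systemic_mean_semidev(ind_risks, c, kappa):
    """Example B: weighted average of the individual risks plus a penalty
    on their upper deviations from that average, with kappa in [0, 1]."""
    avg = sum(ci * ri for ci, ri in zip(c, ind_risks))
    penalty = sum(ci * max(ri - avg, 0.0) for ci, ri in zip(c, ind_risks))
    return avg + kappa * penalty

# hypothetical individual risk values rho[X_i] and weights c
r = [1.0, 2.5, 4.0]
c = (0.5, 0.3, 0.2)
value = systemic_mean_semidev(r, c, kappa=0.5)  # 2.05 + 0.5 * 0.525 = 2.3125
```

Only the component whose risk exceeds the weighted average (here the third, and slightly the second) contributes to the penalty, which is the fairness mechanism described above.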

4 Relations to multivariate measures of systemic risk

In this section, we compare the proposed risk measures with the multivariate notions mentioned in section 2.2.

Consider first the Multivariate Value-at-Risk (\operatorname{MVaR}), which is given as the set of p-efficient points of the respective probability distribution. The following facts are shown in [9]. For every p\in(0,1), the level set \mathcal{Z}_{p} of the distribution function of a random vector X is nonempty and closed. For a given scalarization vector c\geq 0, the p-efficient points can be generated by solving the following optimization problem:

min\displaystyle\min cz\displaystyle c^{\top}z (17)
s.t. P(Xz)p.\displaystyle P(X\leq z)\geq p.

For every c0c\geq 0 the solution set of the optimization problem (17) is nonempty and contains a pp-efficient point. Hence, given a random vector X𝒵X\in\mathcal{Z} and a scalarization vector cSm+c\in S_{m}^{+}, MVaR\operatorname{MVaR} at level p(0,1)p\in(0,1) can be calculated as:

\operatorname{MVaR}_{p}(X)=\inf\big\{c^{\top}z~|~P(X\leq z)\geq p\big\}=\inf\big\{c^{\top}z~:~z\in\mathcal{Z}_{p}\big\}.

Therefore, using linear scalarizations, one can find the pp-efficient point corresponding to any given vector cSm+c\in S_{m}^{+}.
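For a discrete (empirical) distribution, problem (17) can be solved by brute force when m is small, since a minimal threshold vector can be sought among the observed coordinate values; the sketch below makes that assumption and is exponential in m, so it is meant only for illustration.

```python
import itertools
import random

def mvar_bruteforce(sample, c, p):
    """Solve min c^T z s.t. P(X <= z) >= p over the grid of observed
    coordinate values; valid for discrete distributions and small m."""
    n, m = len(sample), len(sample[0])
    coords = [sorted({x[i] for x in sample}) for i in range(m)]
    best_val, best_z = float("inf"), None
    for z in itertools.product(*coords):
        covered = sum(all(x[i] <= z[i] for i in range(m)) for x in sample)
        if covered >= p * n:  # empirical P(X <= z) >= p
            val = sum(ci * zi for ci, zi in zip(c, z))
            if val < best_val:
                best_val, best_z = val, z
    return best_val, best_z

random.seed(3)
sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(60)]
best_val, best_z = mvar_bruteforce(sample, (0.5, 0.5), 0.9)
```

Varying c over the simplex traces out different p-efficient points of the empirical distribution.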

Consider now the Multivariate Average Value-at-Risk (\operatorname{MAVaR}) defined in (4). When small outcomes are preferred, the unfavorable set of realizations of a random vector X is given by the p-level set of F(X;\cdot). Hence \operatorname{MAVaR}_{p}(X)=\mathbb{E}[\psi(X)~|~X\in\mathcal{Z}_{p}]. If X(\omega)\in\mathcal{Z}_{p}, then there exists a p-efficient point v\in\mathbb{R}^{m} such that X(\omega)\geq v. If the scalarization function \psi is monotonically nondecreasing, then P(\psi(X)\leq\psi(v))\geq p. Denote the p-quantile of \psi(X) by \eta_{X}(p). Then we observe that \eta_{X}(p)\leq\min_{v}\psi(v), where the minimum is taken over the p-efficient points v. Therefore:

𝔼[ψ(X)|X𝒵p]\displaystyle\mathbb{E}[\psi(X)~{}|~{}X\in\mathcal{Z}_{p}] =𝔼[ψ(X)|Xv]𝔼[ψ(X)|ψ(X)ηX(p)]\displaystyle=\mathbb{E}[\psi(X)~{}|~{}X\geq v]\geq\mathbb{E}[\psi(X)~{}|~{}\psi(X)\geq\eta_{X}(p)]
=infη{η+1p𝔼[(ψ(X)η)+]}=AVaRp(ψ(X))\displaystyle=\inf_{\eta}\bigg{\{}\eta+\frac{1}{p}\mathbb{E}[(\psi(X)-\eta)_{+}]\bigg{\}}=\operatorname{AVaR}_{p}(\psi(X))

for all p\in(0,1) at which the cumulative distribution function of \psi(X) is continuous. It follows that the Average Value-at-Risk of X scalarized by a monotonically nondecreasing function \psi does not exceed \operatorname{MAVaR} defined in (4). In particular, taking \varrho=\operatorname{AVaR}_{p}, we obtain \operatorname{MAVaR}_{p}(X)\geq\varrho_{S}[X] for any S\subset S_{m}^{+}.

We now turn to the Vector-valued Multivariate Average Value-at-Risk. It is calculated as one of the Pareto-efficient solutions of the following multiobjective optimization problem:

minηm\displaystyle\min_{\eta\in\mathbb{R}^{m}} η+1p𝔼[(Xη)+]\displaystyle\eta+\frac{1}{p}\mathbb{E}[(X-\eta)_{+}] (18)
s.t. P(Xη)1p.\displaystyle P(X\leq\eta)\geq 1-p.

It is well known that a feasible solution of a convex multiobjective optimization problem is Pareto-efficient if and only if it is an optimal solution of a scalarized problem whose objective function is a convex combination of the multiple objectives. Then \operatorname{VMAVaR}, being a Pareto-efficient solution of the multiobjective optimization problem (18), is also optimal for the following problem:

minvm\displaystyle\min_{v\in\mathbb{R}^{m}} cv+1p𝔼[c(Xv)+]\displaystyle c^{\top}v+\frac{1}{p}\mathbb{E}[c^{\top}(X-v)_{+}] (19)
s.t. P(Xv)1p,\displaystyle P(X\leq v)\geq 1-p,

where cmc\in\mathbb{R}^{m} is a scalarization vector taken from the simplex Sm+S_{m}^{+}.

Now for Xp(Ω,,P;m)X\in\mathcal{L}_{p}(\Omega,\mathcal{F},P;\mathbb{R}^{m}), we consider:

c^{\top}(X-v)_{+}=\sum_{i=1}^{m}\max\{0,c_{i}(X_{i}-v_{i})\}\geq\max\Big\{0,\sum_{i=1}^{m}c_{i}(X_{i}-v_{i})\Big\}=\max\{0,c^{\top}(X-v)\}

because the maximum function is subadditive. It follows that:

infvm{cv+1p𝔼[c(Xv)+]}infvm{cv+1p𝔼[(cXcv)+]}.\inf_{v\in\mathbb{R}^{m}}\Big{\{}c^{\top}v+\frac{1}{p}\mathbb{E}[c^{\top}(X-v)_{+}]\Big{\}}\geq\inf_{v\in\mathbb{R}^{m}}\Big{\{}c^{\top}v+\frac{1}{p}\mathbb{E}[(c^{\top}X-c^{\top}v)_{+}]\Big{\}}.

In the scalar-valued case (m=1m=1) the minimizer of the optimization problem defining AVaRp(Z)\operatorname{AVaR}_{p}(Z) is the VaRp(Z)\operatorname{VaR}_{p}(Z) for a random variable ZZ. In the multivariate case (m>1m>1), we established that the solution of (17) is the pp-efficient point, or VaRp(X)\operatorname{VaR}_{p}(X), corresponding to a given scalarization vector cSm+c\in S_{m}^{+}. Denoting this pp-efficient point as v(c)v(c), it follows that:

cv(c)+1p𝔼[c(Xv(c))+]cv(c)+1p𝔼[(cXcv(c))+]c^{\top}v(c)+\frac{1}{p}\mathbb{E}[c^{\top}(X-v(c))_{+}]\geq c^{\top}v(c)+\frac{1}{p}\mathbb{E}[(c^{\top}X-c^{\top}v(c))_{+}]

The relation P(X\leq v(c))\geq 1-p implies that P(c^{\top}X\leq c^{\top}v(c))\geq 1-p. Denoting the p-quantile of c^{\top}X as \eta_{X}(p;c), it follows that \eta_{X}(p;c)\leq c^{\top}v(c). Therefore:

\inf_{v\in\mathbb{R}^{m}}\Big\{c^{\top}v+\frac{1}{p}\mathbb{E}[c^{\top}(X-v)_{+}]:P(X\leq v)\geq 1-p\Big\}\geq\inf_{v\in\mathbb{R}^{m}}\Big\{c^{\top}v+\frac{1}{p}\mathbb{E}[(c^{\top}X-c^{\top}v)_{+}]\Big\}=\operatorname{AVaR}_{p}(c^{\top}X).

It follows that the Average Value-at-Risk of the scalarized random vector, which is one of the systemic risk measures arising from the constructions in section 3, is smaller than the corresponding scalarization of \operatorname{VMAVaR}.

We do not pursue further investigation on set-valued systemic measures of risk as their calculation is numerically very expensive.

5 Two-stage stochastic programming problem with systemic risk measures

Our goal is to address a situation in which the agents cooperate on completing a common task and the risk is associated (among other factors) with the successful completion of the task. Such situations are typical in robotics, as well as in energy systems, where the units cover the energy demand in a certain area.

5.1 Two-stage monotropic optimization problem with a systemic risk measure

In this section, we consider how the proposed approaches to evaluate systemic risk can be applied to a two-stage stochastic optimization problem with a monotropic structure. Specifically, we focus on a problem formulated as follows:

minx𝒳\displaystyle\min_{x\in\mathcal{X}}~{} f(x)+ϱ[Q(x;ξ)]\displaystyle~{}f(x)+\varrho[Q(x;\xi)] (20)

where Q(x;ξ)Q(x;\xi) has realizations Qs(x;ξs)Q^{s}(x;\xi^{s}) defined as the optimal value of the second-stage problem in scenario sSs\in S:

Qs(x;ξs)=min𝐲,z\displaystyle Q^{s}(x;\xi^{s})=\min_{\mathbf{y},z}~{} i=1mcigis(yi,z)\displaystyle~{}\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z) (21)
s.t. Tisx+Wisyi=his,i=1,,m\displaystyle~{}T_{i}^{s}x+W_{i}^{s}y_{i}=h^{s}_{i},~{}~{}~{}i=1,\dots,m (22)
i=1mAisyi=bs\displaystyle~{}\sum_{i=1}^{m}A^{s}_{i}y_{i}=b^{s} (23)
yi𝒴isi=1,,m\displaystyle~{}y_{i}\in\mathcal{Y}^{s}_{i}~{}~{}~{}i=1,\dots,m (24)
Bsz𝒟s\displaystyle~{}B^{s}z\in\mathcal{D}^{s} (25)

Here f:nf:\mathbb{R}^{n}\to\mathbb{R} is a continuous function that represents the cost of the first-stage decision xnx\in\mathbb{R}^{n} and 𝒳n\mathcal{X}\subset\mathbb{R}^{n} is a closed convex set. The random vector ξ\xi comprises the random data of the second-stage problem. In the second-stage problem, we would like to minimize the sum of mm cost functions gi:l×pg_{i}:\mathbb{R}^{l}\times\mathbb{R}^{p}\to\mathbb{R} for i=1,,mi=1,\dots,m that depend on two second-stage decision variables: local decision variables yily_{i}\in\mathbb{R}^{l} for i=1,,mi=1,\dots,m and the common decision variable zpz\in\mathbb{R}^{p}. The decision variables yily_{i}\in\mathbb{R}^{l} are local for every i=1,,mi=1,\dots,m, and the local constraints are represented as a closed convex set 𝒴isl\mathcal{Y}^{s}_{i}\subset\mathbb{R}^{l}. The decision variable zpz\in\mathbb{R}^{p} is common for all ii and needs global information to be calculated. The matrix BsB^{s} is of size d×pd\times p and the set 𝒟sd\mathcal{D}^{s}\subset\mathbb{R}^{d} is a closed convex set. Note that the constraints (22) linking the first-stage decision variable xx and the local second-stage decision variables yiy_{i} are defined for every ii separately, where matrices Tisk×nT^{s}_{i}\in\mathbb{R}^{k\times n}, Wisk×lW^{s}_{i}\in\mathbb{R}^{k\times l} and hiskh^{s}_{i}\in\mathbb{R}^{k} depend on the scenario ss. The constraint (23) is a coupling constraint that links the local decision variables yiy_{i}, where Aisd×lA^{s}_{i}\in\mathbb{R}^{d\times l} and bsdb^{s}\in\mathbb{R}^{d} depend on the scenario sSs\in S.

We define the total cost as the aggregation of the individual cost functions gig_{i} using some scalarization vector c+mc\in\mathbb{R}^{m}_{+} such that i=1mci=1\sum_{i=1}^{m}c_{i}=1 and we would like to develop a numerical method to solve the two-stage problem in a distributed way. Specifically, we use decomposition ideas based on the risk-averse multicut method proposed in [15] and the multi-cut methods in risk-neutral stochastic programming to solve the two-stage problem, but we also decompose the second-stage problem into mm subproblems that can be solved independently in order to allow for a distributed operation of mm units (agents).

First, we discuss how to apply the decomposition method to solve the two-stage problem. We use the multicut method to construct a piecewise linear approximation of the optimal value of the second-stage problem, and we approximate the measure of risk by subgradient inequalities based on the dual representation of coherent risk measures \varrho[Q]=\sup_{\mu\in\mathcal{A}_{\varrho}}\langle\mu,Q\rangle. To this end, we introduce an auxiliary variable \eta\in\mathbb{R}, which will contain the lower approximation of the measure of risk. Further, we denote by Q the random variable with realizations q^{s}, which represent the lower approximations of the functions Q^{s}(\cdot,\xi^{s}). Then the master problem in our method takes on the following form:

minx,η,q\displaystyle\min_{x,\eta,q}{} f(x)+η\displaystyle~{}f(x)+\eta (26)
s.t. ημτ,Q,τ=1,,t1\displaystyle~{}\eta\geq\langle\mu^{\tau},Q\rangle,~{}~{}~{}\tau=1,\dots,t-1
qsq^s,τ+gs,τ,xxτ,τ=1,,t1,s=1,,S\displaystyle~{}q^{s}\geq\hat{q}^{s,\tau}+\langle g^{s,\tau},x-x^{\tau}\rangle,~{}~{}~{}\tau=1,\dots,t-1,~{}s=1,\dots,S
x𝒳.\displaystyle~{}x\in\mathcal{X}.

The optimal value \hat{\eta}^{t} contains the value of the approximation of \varrho[Q(\hat{x}^{t};\xi)], where \hat{x}^{t} is the solution of the master problem at iteration t. Notice that the approximation \varrho^{t}[Q] of \varrho[Q(\hat{x}^{t};\xi)] is given by

η^t=ϱt[Q]=max0τt1Q,μτ\hat{\eta}^{t}=\varrho^{t}[Q]=\max_{0\leq\tau\leq t-1}\langle Q,\mu^{\tau}\rangle

with \mu^{\tau} being the probability measures from \mathcal{A}_{\varrho} calculated as subgradients in the previous iterations. We shall explain how the subgradients \mu^{\tau} are obtained in due course. The value \hat{q}^{s,\tau} is the optimal value of the second-stage problem in scenario s at iteration \tau, and g^{s,\tau} is the subgradient calculated using the optimal dual variables of the constraints (22). One could solve the second-stage problem with an objective that is a scalarization of the m cost functions directly, but we would like to decompose the second-stage problem into m subproblems Q^{s}_{i} that can be solved independently in a distributed manner.
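A small sketch of the risk approximation used in the master problem: each stored subgradient μ^τ from 𝒜_ϱ yields a cut η ≥ ⟨μ^τ, Q⟩, and the model is their pointwise maximum. Here ϱ is assumed, for illustration, to be AVaR at level α, whose subgradients reweight the probabilities of the worst scenarios; the helper names are our own.

```python
def cut_model(Q, cuts):
    """Lower approximation of rho[Q]: the maximum over stored measures
    mu^tau of <mu^tau, Q> (each mu is a probability vector on scenarios)."""
    return max(sum(m_s * q_s for m_s, q_s in zip(mu, Q)) for mu in cuts)

def avar_subgradient(Q, probs, alpha):
    """A subgradient of AVaR_alpha at Q: put weight p_s / alpha on the
    worst scenarios until total mass 1 is reached."""
    order = sorted(range(len(Q)), key=lambda s: -Q[s])
    mu, mass = [0.0] * len(Q), 0.0
    for s in order:
        take = min(probs[s] / alpha, 1.0 - mass)
        mu[s], mass = take, mass + take
        if mass >= 1.0:
            break
    return mu

probs = [0.25, 0.25, 0.25, 0.25]
Q1 = [1.0, 4.0, 2.0, 3.0]
cuts = [avar_subgradient(Q1, probs, alpha=0.5)]
# the model is exact at the point where the cut was generated:
# cut_model(Q1, cuts) equals AVaR_0.5[Q1] = (4 + 3) / 2 = 3.5
```

At any other realization vector the model remains a valid lower bound, since every stored μ^τ belongs to 𝒜_ϱ.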

Consider the second-stage problem Q^{s}(x;\xi^{s}) for a fixed first-stage decision variable x\in\mathbb{R}^{n}. To decompose the global problem into m local subproblems, we need to address two issues: (i) distribute the common decision variable z\in\mathbb{R}^{p} to the individual subproblems; (ii) decompose the coupling constraints. The common decision variable z can be distributed to the subproblems by creating a copy z_{i} for every i=1,\dots,m; we then ensure the uniqueness of z by enforcing the copies z_{i} to be equal to each other. The second-stage problem can be rewritten as:

Qs(x;ξs)=min𝐲,𝐳\displaystyle Q^{s}(x;\xi^{s})=\min_{\mathbf{y},\mathbf{z}}~{} i=1mcigis(yi,zi)\displaystyle~{}\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z_{i}) (27)
s.t. Tisx+Wisyi=his,i=1,,m\displaystyle~{}T^{s}_{i}x+W^{s}_{i}y_{i}=h^{s}_{i},~{}~{}~{}i=1,\dots,m (28)
i=1mAisyi=bs\displaystyle~{}\sum_{i=1}^{m}A^{s}_{i}y_{i}=b^{s} (29)
zi=zji,j=1,,m,\displaystyle~{}z_{i}=z_{j}~{}~{}~{}i,j=1,\dots,m, (30)
yi𝒴is,Bszi𝒟si=1,,m\displaystyle~{}y_{i}\in\mathcal{Y}^{s}_{i},~{}B^{s}z_{i}\in\mathcal{D}^{s}~{}~{}~{}i=1,\dots,m (31)

In order to distribute the coupling constraints (29), (30), we apply Lagrangian relaxation using Lagrange multipliers \lambda^{s}\in\mathbb{R}^{d} and \mu^{s}\in\mathbb{R}^{m\times m}. Then the global augmented Lagrangian \Lambda^{s}_{\kappa_{0}} associated with the second-stage problem is defined as:

Λκ0s(𝐲,𝐳)\displaystyle\Lambda^{s}_{\kappa_{0}}(\mathbf{y},\mathbf{z}) =i=1mcigis(yi,zi)+λs,i=1mAisyibs+i=1mj=1mμijs(zizj)+κ02i=1mAisyibs2\displaystyle=\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z_{i})+\langle\lambda^{s},\sum_{i=1}^{m}A^{s}_{i}y_{i}-b^{s}\rangle+\sum_{i=1}^{m}\sum_{j=1}^{m}\mu^{s}_{ij}(z_{i}-z_{j})+\frac{\kappa_{0}}{2}\bigg{\|}\sum_{i=1}^{m}A^{s}_{i}y_{i}-b^{s}\bigg{\|}^{2}
+κ02i=1mj=1jim(zizj)2\displaystyle+\frac{\kappa_{0}}{2}\sum_{i=1}^{m}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}(z_{i}-z_{j})^{2}
=i=1mcigis(yi,zi)+i=1mλs,Aisyi+i=1mj=1m(μijsμjis)ziλs,bs+κ02i=1mAisyibs2\displaystyle=\sum_{i=1}^{m}c_{i}g^{s}_{i}(y_{i},z_{i})+\sum_{i=1}^{m}\langle\lambda^{s},A^{s}_{i}y_{i}\rangle+\sum_{i=1}^{m}\sum_{j=1}^{m}(\mu^{s}_{ij}-\mu^{s}_{ji})z_{i}-\langle\lambda^{s},b^{s}\rangle+\frac{\kappa_{0}}{2}\bigg{\|}\sum_{i=1}^{m}A^{s}_{i}y_{i}-b^{s}\bigg{\|}^{2}
+κ02i=1mj=1jim(zizj)2\displaystyle+\frac{\kappa_{0}}{2}\sum_{i=1}^{m}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}(z_{i}-z_{j})^{2}

where κ0>0\kappa_{0}>0 is a penalty coefficient. This problem can be decomposed into subproblems with a local augmented Lagrangian Λκ0s,i\Lambda_{\kappa_{0}}^{s,i} defined as:

Λκ0s,i(yi,y~,zi,z~,λ,μ)\displaystyle\Lambda^{s,i}_{\kappa_{0}}(y_{i},\tilde{y},z_{i},\tilde{z},\lambda,\mu) =cigis(yi,zi)+λs,Aisyi+j=1m(μijsμjis)zi\displaystyle=c_{i}g^{s}_{i}(y_{i},z_{i})+\langle\lambda^{s},A^{s}_{i}y_{i}\rangle+\sum_{j=1}^{m}(\mu^{s}_{ij}-\mu^{s}_{ji})z_{i}
+κ02Aisyi+j=1jimAjsy~jbs2+κ02(2j=1jim(ziz~j)2+kij=1jk,im(z~kz~j)2)\displaystyle+\frac{\kappa_{0}}{2}\bigg{\|}A^{s}_{i}y_{i}+\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}A^{s}_{j}\tilde{y}_{j}-b^{s}\bigg{\|}^{2}+\frac{\kappa_{0}}{2}\bigg{(}2\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}(z_{i}-\tilde{z}_{j})^{2}+\sum_{k\neq i}\sum_{\begin{subarray}{c}j=1\\ j\neq k,i\end{subarray}}^{m}(\tilde{z}_{k}-\tilde{z}_{j})^{2}\bigg{)}

where (y_{i},z_{i}) are the decision variables of subproblem i, and \tilde{y} and \tilde{z} contain the current decisions of the other subproblems j=1,\dots,m,~j\neq i. Note that the first penalty term can be expanded as:

Aisyi+j=1jimAjsy~jbs2=k=1d([Aisyi]k+j=1jim[Ajsy~j]kbks)2\displaystyle\|A^{s}_{i}y_{i}+\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}A_{j}^{s}\tilde{y}_{j}-b^{s}\|^{2}=\sum_{k=1}^{d}\bigg{(}[A^{s}_{i}y_{i}]_{k}+\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{m}[A_{j}^{s}\tilde{y}_{j}]_{k}-b_{k}^{s}\bigg{)}^{2}

This implies that, for every k such that [A^{s}_{i}y_{i}]_{k}=0, the remaining terms are constant with respect to the decision variables of subproblem i and can be omitted from its optimization problem. Hence, subproblem i needs access only to the decisions of those subproblems j=1,\dots,m,~j\neq i that are coupled with i through some constraint k. Similarly, consider the second penalty term:

\[
2\sum_{\substack{j=1\\ j\neq i}}^{m}(z_{i}-\tilde{z}_{j})^{2}+\sum_{k\neq i}\sum_{\substack{j=1\\ j\neq k,i}}^{m}(\tilde{z}_{k}-\tilde{z}_{j})^{2}.
\]

The terms in the last double summation do not include $z_{i}$; they are constants and can be excluded from the optimization problem. Hence, we can define the subproblem for every $i$ as follows:

\[
\begin{aligned}
Q^{s}_{i}(x,\tilde{y},\tilde{z},\lambda,\mu,\xi^{s})=\min_{y_{i},z_{i}}~&c_{i}g^{s}_{i}(y_{i},z_{i})+\langle\lambda,A^{s}_{i}y_{i}\rangle+\sum_{j=1}^{m}(\mu^{s}_{ij}-\mu^{s}_{ji})z_{i}\\
&+\frac{\kappa_{0}}{2}\bigg\|A^{s}_{i}y_{i}+\sum_{\substack{j=1\\ j\neq i}}^{m}A_{j}^{s}\tilde{y}_{j}-b^{s}\bigg\|^{2}+\kappa_{0}\sum_{\substack{j=1\\ j\neq i}}^{m}(z_{i}-\tilde{z}_{j})^{2} &&(32)\\
\text{s.t.}~~&T^{s}_{i}x+W^{s}_{i}y_{i}=h^{s}_{i} &&(33)\\
&y_{i}\in\mathcal{Y}^{s}_{i},\quad B^{s}z_{i}\in\mathcal{D}^{s}_{i} &&(34)
\end{aligned}
\]

For a fixed $x$, in every scenario $s\in S$, we can implement the Accelerated Distributed Augmented Lagrangian (ADAL) method; we refer to [6] for a detailed analysis of the method. The method consists of three main steps: (i) we solve every subproblem $Q^{s}_{i}$ to find the optimal primal variables $(\hat{y}_{i}^{s},\hat{z}_{i}^{s})$; (ii) we update the primal variables and check whether the coupling constraints (29), (30) are satisfied; (iii) if the constraints are not satisfied, we update the dual variables and return to step (i). The ADAL method converges to the optimal solution $(\hat{y}^{s},\hat{z}^{s},\hat{\lambda}^{s},\hat{\mu}^{s})$ in finitely many steps, and we can calculate the optimal value $\hat{Q}^{s}_{i}$ of the objective function for every subproblem $i$. Then the global objective value of the second-stage problem can be calculated as $\hat{Q}^{s}(x;\xi^{s})=\sum_{i=1}^{m}\hat{Q}^{s}_{i}(x;\xi^{s})-\langle\hat{\lambda}^{s},b^{s}\rangle$.
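As a minimal illustration of steps (i)-(iii), and not the implementation used in our experiments, the following sketch runs an ADAL-type loop on a deliberately tiny toy instance: two agents with quadratic costs $f_i(y_i)=(y_i-t_i)^2$ coupled by the single constraint $y_1+y_2=b$. The data, the closed-form local minimizers standing in for the subproblems (32)-(34), and the parameter values are all hypothetical.

```python
import numpy as np

# Toy instance (hypothetical data): min (y1-t1)^2 + (y2-t2)^2  s.t.  y1 + y2 = b.
t, b = np.array([1.0, 3.0]), 2.0
rho, tau = 2.0, 0.5        # penalty and stepsize; tau <= 1/q with q = 2 coupled agents

y, lam = np.zeros(2), 0.0
for _ in range(2000):
    # Step (i): each agent minimizes its local augmented Lagrangian
    # (y - t_i)^2 + lam*y + (rho/2)*(y + y_other - b)^2, which has a closed form.
    y_hat = np.array([(2 * t[i] - lam - rho * (y[1 - i] - b)) / (2 + rho)
                      for i in range(2)])
    # Step (ii): convex-combination update of the primal variables.
    y = y + tau * (y_hat - y)
    # Step (iii): dual update on the coupling-constraint residual.
    lam = lam + rho * tau * (y.sum() - b)

print(y, lam)  # converges to the optimum y = (0, 2) with multiplier lam = 2
```

The closed-form inner minimization replaces the generic subproblem solver; everything else mirrors the three steps described above.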

Once the second-stage problem is solved for every scenario, we construct an objective cut for every scenario $s\in S$, defined as:

\[
Q^{s}(x;\xi^{s})\geq\hat{Q}^{s}(x^{k};\xi^{s})+\langle g^{s,k},x-x^{k}\rangle,
\]

where $g^{s,k}$ is a subgradient of $Q^{s}(x;\xi^{s})$ at $x=x^{k}$ for scenario $s\in S$. Now note that

\[
\partial Q^{s}(x;\xi^{s})=\partial\bigg[\sum_{i=1}^{m}Q^{s}_{i}(x;\xi^{s})-\langle\lambda,b^{s}\rangle\bigg]=\sum_{i=1}^{m}\partial Q^{s}_{i}(x;\xi^{s}).
\]

Hence, at $x=x^{k}$, the subdifferential for scenario $s\in S$ can be calculated as $\partial Q^{s}(x^{k};\xi^{s})=\sum_{i=1}^{m}\partial Q^{s}_{i}(x^{k};\xi^{s})$. A subgradient of $Q_{i}^{s}(\cdot;\xi^{s})$ is given by $-(T^{s}_{i})^{\top}\pi^{s}_{i}$, where $\pi^{s}_{i}$ is the Lagrange multiplier associated with constraint (28) in subproblem $i$. The proposed method for solving the two-stage problem is then formulated as follows:

  • Step 0. Set $t=1$ and define an initial $\mu^{0}\in\mathcal{A}_{\varrho}$.

  • Step 1. Solve the master problem (26) and denote its optimal solution by $(x^{t},\eta^{t},q^{t})$.

  • Step 2. For every scenario $s=1,\dots,S$ apply the following method.

    • (a) Set $l=1$ and define initial Lagrange multipliers $\lambda^{s,1}$, $\mu^{s,1}$ and initial primal variables $y^{s,1},z^{s,1}$.

    • (b) Given the Lagrange multipliers $\lambda^{s,l},\mu^{s,l}$ and the decision variables $y^{s,l},z^{s,l}$ of the neighboring nodes, every node $i$ calculates its optimal solution $(\hat{y}_{i}^{s,l},\hat{z}_{i}^{s,l})$ by solving its local problem:

      \[
      \begin{aligned}
      \min_{y_{i}^{s},z_{i}^{s}}~&\Lambda^{s,i}_{\kappa_{0}}(y_{i}^{s},y^{s,l},z_{i}^{s},z^{s,l},\lambda^{s,l},\mu^{s,l}) &&(35)\\
      \text{s.t.}~~&y_{i}^{s}\in\mathcal{Y}^{s}_{i},\quad B^{s}z_{i}^{s}\in\mathcal{D}^{s}_{i}
      \end{aligned}
      \]
    • (c) Every node $i$ updates its primal variables:

      \[
      y_{i}^{s,l+1}=y_{i}^{s,l}+\kappa_{s}(\hat{y}_{i}^{s,l}-y_{i}^{s,l}),\qquad
      z_{i}^{s,l+1}=z_{i}^{s,l}+\kappa_{s}(\hat{z}_{i}^{s,l}-z_{i}^{s,l}).
      \]
    • (d) If the constraints

      \[
      \sum_{i=1}^{m}A_{i}^{s}y_{i}^{s}=b^{s},\qquad z_{i}^{s}=z_{j}^{s},\quad i,j=1,\dots,m, \qquad (36)
      \]

      are satisfied, then calculate the following quantities and go to Step 3:

      \[
      g^{s,t}=\sum_{i=1}^{m}g_{i}^{s,l}=\sum_{i=1}^{m}(-T_{i}^{s})^{\top}\pi_{i}^{s,l},\qquad
      \hat{L}^{s,t}=\sum_{i=1}^{m}\hat{\Lambda}^{s,i,l}_{\kappa_{0}}-\langle\lambda^{s,l},b^{s}\rangle,
      \]

      where $\pi_{i}^{s,l}$ is the optimal Lagrange multiplier associated with constraint (33) in subproblem $i$, and $\hat{\Lambda}^{s,i,l}_{\kappa_{0}}$ is the optimal value of the objective function in (35).

      If any of the constraints (36) are not satisfied, update their Lagrange multipliers as follows:

      \[
      \lambda^{s,l+1}=\lambda^{s,l}+\kappa_{0}^{s}\kappa_{s}\bigg(\sum_{i=1}^{m}A_{i}^{s}y_{i}^{s,l}-b^{s}\bigg),\qquad
      \mu_{ij}^{s,l+1}=\mu_{ij}^{s,l}+\kappa_{0}^{s}\kappa_{s}(z_{i}^{s,l}-z_{j}^{s,l}).
      \]

      Increase $l$ by one and return to Step (b).

  • Step 3. Calculate $\varrho^{t}=\varrho[\hat{L}^{t}]$ and $\mu^{t}\in\partial\varrho[\hat{L}^{t}]$.

  • Step 4. If $\varrho^{t}=\eta^{t}$, stop; otherwise, increase $t$ by one and go to Step 1.
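The cut quantities collected in Step 2(d) feed the master problem: the slope aggregates the per-subproblem multipliers as $g^{s,t}=\sum_i(-T_i^s)^\top\pi_i^{s,l}$, and the resulting supporting hyperplane is tight at the current iterate. A minimal numpy sketch with hypothetical data:

```python
import numpy as np

# Hypothetical data for two subproblems in one scenario (illustrative only).
T = [np.array([[1.0, 0.0]]), np.array([[0.0, 2.0]])]   # technology matrices T_i^s
pi = [np.array([3.0]), np.array([-1.0])]               # multipliers pi_i^s of (33)
Q_hat = 5.0                                            # optimal second-stage value at x_k
x_k = np.array([0.5, 1.0])

# Aggregate the per-subproblem subgradients: g = sum_i (-T_i)^T pi_i.
g = sum(-Ti.T @ pi_i for Ti, pi_i in zip(T, pi))

def cut(x):
    """Lower bound on Q^s(x) given by the supporting hyperplane at x_k."""
    return Q_hat + g @ (x - x_k)

print(g, cut(x_k))  # the cut reproduces Q_hat at x = x_k
```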

Note that the penalty parameter $\kappa^{s}_{0}$ can be chosen for every scenario $s\in S$ according to the structure of the problem. The ADAL method converges to the optimal solution in scenario $s$ if the penalty parameter $\kappa^{s}_{0}\in(0,\frac{1}{q^{s}})$, where $q^{s}$ is the maximum number of matrices $A^{s}_{i}$, $i=1,\dots,m$, having a nonzero entry in the same row, that is, the maximal number of subproblems coupled in any single constraint. Hence, $\kappa^{s}_{0}$ can be chosen close to $\frac{1}{q^{s}}$.
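A possible reading of this rule can be sketched as follows, with illustrative coupling matrices (an assumption for the example, not the experiment's data): count, for each coupling row, how many agents have a nonzero entry there, take the maximum $q^s$, and set $\kappa_0^s$ just below $1/q^s$.

```python
import numpy as np

# Hypothetical coupling matrices A_i^s for m = 3 agents, d = 2 coupling rows.
A = [np.array([[1.0, 0.0], [0.0, 0.0]]),
     np.array([[2.0, 1.0], [0.0, 3.0]]),
     np.array([[0.0, 0.0], [1.0, 1.0]])]

# q^s: the largest number of agents sharing any single coupling constraint,
# i.e. the max over rows k of how many matrices A_i^s have a nonzero k-th row.
rows_nonzero = np.array([[np.any(Ai[k] != 0) for Ai in A]
                         for k in range(A[0].shape[0])])
q = int(rows_nonzero.sum(axis=1).max())

kappa0 = 0.99 / q  # choose the penalty close to, but below, 1/q^s
print(q, kappa0)
```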

5.2 Two-stage wireless information exchange problem

In this section, we formulate a two-stage information exchange problem and implement the proposed numerical method to solve it. Consider a problem in a wireless communication network consisting of $J$ robots. We denote the team of all robots by $\mathcal{J}$. The robots collect information about the unknown environment and send the information to a set $\mathcal{K}=\{1,\dots,K_{0}\}$ of active reporting points by multi-hop communication. The active reporting points can receive information from the robots and store it. The communication links between the robots and the reporting points are subject to the risk of information loss. Therefore, the objective is to choose the optimal set of active reporting points so as to minimize the risk associated with the amount of information lost and with the proportion of the total gathered information that has not reached the reporting points. To this end, we formulate a two-stage stochastic programming problem.

The first-stage decision variables are known as the here-and-now variables. In our problem, these are binary variables $z_{k}\in\{0,1\}$ for $k\in\mathcal{K}$, where $z_{k}=1$ if the $k$-th location is selected as an active reporting point and $z_{k}=0$ otherwise. We assume that at most $K$ reporting points can be active, where $1\leq K<K_{0}$.

Once the reporting points are chosen, the spatial configuration of the robots is observed. We model the uncertainty of the spatial configuration and of the amount of information to be observed by a set $\mathcal{S}=\{1,\dots,S\}$ of scenarios. The robots gather information about the environment and either deliver it to the reporting points or exchange it with their neighbors, who then deliver it to the active reporting points. The second-stage optimization problem involves the following decision variables. The variables $T_{ij}^{s}$ stand for the amount of information sent by node $i$ to node $j$ in scenario $s\in\mathcal{S}$. The amount of information observed but not sent by robot $i$ in scenario $s$ is denoted by $y_{i}^{s}$. The proportion of information successfully delivered to the reporting points in scenario $s$ is denoted by $x^{s}$. Every robot $i$ generates information $r_{i}^{s}$ and can send it to its neighbors within some communication range. The communication links between the nodes are described by a function $R_{ij}^{s}$ that gives the proportion of the information sent by node $i\in\mathcal{J}$ that is received and correctly decoded by node $j\in\mathcal{J}\cup\mathcal{K}$. Thus, $R_{ij}^{s}T_{ij}^{s}$ is the amount of information received and correctly decoded by node $j\in\mathcal{J}\cup\mathcal{K}$ in scenario $s\in\mathcal{S}$. The set of neighbors of node $i$ in scenario $s\in\mathcal{S}$ can then be defined as the set of nodes within its communication range, $\mathcal{N}^{s}(i)=\{j\in\mathcal{J}\cup\mathcal{K}:R_{ij}^{s}>0\}$.
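The neighbor sets $\mathcal{N}^s(i)$ can be read directly off the rate values. A minimal sketch with a hypothetical rate matrix (three robots as rows; three robots plus one reporting point as columns):

```python
import numpy as np

# Hypothetical rates R_ij^s for 3 robots and 1 reporting point (last column).
R = np.array([[1.0, 0.7, 0.0, 0.4],
              [0.7, 1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0, 0.0]])

def neighbors(i, R):
    """N^s(i): nodes j (robots or reporting points) with R_ij > 0, excluding i."""
    return {j for j in range(R.shape[1]) if j != i and R[i, j] > 0}

print(neighbors(0, R))  # robot 0 can reach robot 1 and the reporting point
```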

We associate with each robot a local risk concerning the information that is neither communicated to neighbors nor delivered to the reporting points, because that information might be lost due to damage to the robot or other issues. For every $i\in\mathcal{J}$ we define:

\[
y_{i}^{s}=r_{i}^{s}+\sum_{j\in\mathcal{J}}R_{ji}^{s}T_{ji}^{s}-\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s}
\]

as the amount of information communicated neither to the neighbors nor to any of the reporting points by robot $i\in\mathcal{J}$. The systemic risk associated with the team of robots is represented by the total proportion of information not delivered to the reporting points; it is defined as $(1-x^{s})$, where $x^{s}$ is determined by:

\[
x^{s}\sum_{i\in\mathcal{J}}r_{i}^{s}=\sum_{i\in\mathcal{J}}\sum_{k\in\mathcal{K}}R_{ik}^{s}T_{ik}^{s}. \qquad (37)
\]

To implement the distributed method for the operation of the robots, we introduce a copy $x_{i}^{s}$ of the total proportion variable for each robot, together with additional constraints imposing equality among these auxiliary variables. Constraint (37) is then replaced by the following set of constraints:

\[
\begin{aligned}
&\sum_{i\in\mathcal{J}}x_{i}^{s}r_{i}^{s}=\sum_{j\in\mathcal{J}}\sum_{k\in\mathcal{K}}R_{jk}^{s}T_{jk}^{s},\\
&x_{i}^{s}=x_{j}^{s}\quad\forall\, i,j\in\mathcal{J}.
\end{aligned}
\]

Using these variables, we can express the loss function of every robot $i\in\mathcal{J}$ in scenario $s\in\mathcal{S}$ as follows:

\[
q_{i}^{s}=c_{1}y_{i}^{s}+c_{2}(1-x_{i}^{s}),
\]

where $c_{1}>0$ is the weight associated with the local risk and $c_{2}>0$ is the weight associated with the systemic risk. These are modeling parameters. We have used the choice $c_{1}+c_{2}=1$, in line with our theoretical proposal for aggregating sources of risk. The first-stage optimization problem takes on the following form:

\[
\begin{aligned}
\min_{z}~&\varrho[Q(z;\xi)]\\
\text{s.t.}~~&\sum_{k\in\mathcal{K}}z_{k}\leq K,\\
&z_{k}\in\{0,1\},\quad k\in\mathcal{K}.
\end{aligned}
\]

Here $Q(z;\xi)$ is a random variable with realizations $Q^{s}(z;\xi^{s})$, $s=1,\dots,S$, denoting the optimal values of the second-stage problem. The second-stage problem deals with the operation of the robots after the locations of the reporting points are fixed; it is formulated as follows:

\[
\begin{aligned}
Q^{s}(z;\xi^{s})=\min_{T^{s},y^{s},x^{s}}~&\sum_{i\in\mathcal{J}}w_{i}q_{i}^{s} &&(38)\\
\text{s.t.}~~&y_{i}^{s}=r_{i}^{s}+\sum_{j\in\mathcal{J}}R_{ji}^{s}T_{ji}^{s}-\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s},\quad i\in\mathcal{J} &&(39)\\
&\sum_{j\in\mathcal{J}}T_{ji}^{s}+\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s}\leq a,\quad i\in\mathcal{J} &&(40)\\
&\sum_{i\in\mathcal{J}}x_{i}^{s}r_{i}^{s}=\sum_{j\in\mathcal{J}}\sum_{k\in\mathcal{K}}R_{jk}^{s}T_{jk}^{s} &&(41)\\
&x_{i}^{s}=x_{j}^{s},\quad i,j\in\mathcal{J} &&(42)\\
&T_{ik}^{s}\leq Mz_{k},\quad i\in\mathcal{J},\ k\in\mathcal{K} &&(43)\\
&T_{ij}^{s}\geq 0,\quad i\in\mathcal{J},\ j\in\mathcal{J}\cup\mathcal{K} &&(44)\\
&y_{i}^{s}\geq 0,\quad i\in\mathcal{J}. &&(45)
\end{aligned}
\]

Here $M>0$ is a large constant, which provides a logical upper bound restricting communication to the active reporting points in constraint (43). Additionally, $w_{i}>0$ are weights associated with the loss functions of the individual robots. Here again, we propose $\sum_{i\in\mathcal{J}}w_{i}=1$, in accordance with the proposed systemic measures of risk. We notice that the second-stage problems are always feasible for any feasible first-stage decision; hence, the two-stage problem has relatively complete recourse. Furthermore, the recourse function $Q(z;\xi)$ has finitely many realizations $Q^{s}(z;\xi^{s})$, $s=1,\dots,S$, for every fixed argument $z$. This implies that we can use any coherent measure of risk $\varrho[\cdot]$ for evaluating the risk of $Q(\cdot;\xi)$; the value $\varrho[Q(z;\xi)]$ is well defined and finite for all feasible first-stage decisions $z$.
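For fixed $z$, problem (38)-(45) is a linear program. As an illustration only, the following sketch solves a hypothetical two-robot, one-reporting-point instance (all rates, weights, and constants invented for the example; scipy is an assumed dependency). Robot 2 is out of range of the reporting point ($R_{2K}=0$) and must relay through robot 1.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny hypothetical instance of (38)-(45): J = 2 robots, one active reporting
# point (z = 1), one scenario.
r = np.array([1.0, 1.0])          # information gathered by each robot
R12 = R21 = 0.8                   # robot-to-robot decoding rates
R1K = 0.9                         # robot 1 -> reporting point
a, M = 5.0, 10.0                  # capacity and big-M constants
c1, c2, w = 0.8, 0.2, np.array([0.5, 0.5])

# Variable order: [T12, T1K, T21, T2K, y1, y2, x].
c = np.array([0.0, 0.0, 0.0, 0.0, c1 * w[0], c1 * w[1], -c2])
A_eq = np.array([
    [1.0,  1.0, -R21, 0.0, 1.0, 0.0, 0.0],      # (39) flow conservation, robot 1
    [-R12, 0.0,  1.0, 1.0, 0.0, 1.0, 0.0],      # (39) flow conservation, robot 2
    [0.0, -R1K,  0.0, 0.0, 0.0, 0.0, r.sum()],  # (41) proportion of delivered info
])
b_eq = np.array([r[0], r[1], 0.0])
A_ub = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],        # (40) capacity at robot 1
    [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0],        # (40) capacity at robot 2
])
b_ub = np.array([a, a])
bounds = [(0, None), (0, M), (0, None), (0, M), (0, None), (0, None), (0, 1)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
x_prop, obj = res.x[-1], res.fun + c2   # add back the constant from c2*(1 - x)
print(x_prop, obj)
```

In this instance robot 2 relays its unit of information through robot 1, and 81% of the gathered information reaches the reporting point.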

5.3 Numerical results

We solve the problem using the distributed method proposed in Section 5.1. In the given problem, the decision variables $T_{ij}^{s},y_{i}^{s},x_{i}^{s}$ are local to every node $i\in\mathcal{J}$, and there are four coupling constraints that need to be distributed among the nodes:

  • (1) the flow conservation constraints (39);
  • (2) the transmission capacity constraints (40);
  • (3) the proportion constraint (41);
  • (4) the equality constraints enforcing the uniqueness of the proportion (42).

Since the ADAL algorithm operates on equality constraints, we introduce auxiliary variables $u_{i}^{s}$ and redefine (40) as $\sum_{j\in\mathcal{J}}T_{ji}^{s}+\sum_{j\in\mathcal{J}\cup\mathcal{K}}T_{ij}^{s}+u_{i}^{s}=a$. Then the coupling constraints can be rewritten using appropriate matrices and vectors stacking the decision variables of every node $i\in\mathcal{J}$. Let $v_{i}^{s}=[y_{i}^{s},T_{i1}^{s},T_{i2}^{s},T_{i3}^{s},\dots,T_{iJ}^{s},\dots,T_{i(J+K_{0})}^{s},x_{i}^{s},u_{i}^{s}]^{\top}$ be the $(J+K_{0}+3)$-dimensional vector stacking all decision variables of node $i\in\mathcal{J}$. Then the flow conservation constraint (39) can be rewritten as $\sum_{i\in\mathcal{J}}A_{i}^{s}v_{i}^{s}=\mathbf{r}^{s}$, where $\mathbf{r}^{s}=[r_{1}^{s},\dots,r_{J}^{s}]^{\top}$ is a $J$-dimensional vector and $A_{i}^{s}$ is a $J\times(J+K_{0}+3)$-dimensional matrix defined as:

\[
A_{i}^{s}=\begin{bmatrix}
1 & 1-R_{ii}^{s} & 1 & 1 & \dots & 1 & \dots & 1 & 0 & 0\\
0 & 0 & -R_{i2}^{s} & 0 & \dots & 0 & \dots & 0 & 0 & 0\\
0 & 0 & 0 & -R_{i3}^{s} & \dots & 0 & \dots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & & \ddots & & & \vdots & \vdots & \vdots\\
0 & 0 & 0 & 0 & \dots & -R_{iJ}^{s} & \dots & 0 & 0 & 0
\end{bmatrix}
\]

(displayed schematically with row $i$ listed first), where all entries of the $i$-th row are equal to 1 except for the entry in position $(i,i+1)$, corresponding to $T_{ii}^{s}$, and the last two columns, which are equal to 0; every other row $j\neq i$ has the single nonzero entry $-R_{ij}^{s}$ in the column of $T_{ij}^{s}$. Note that $R_{ii}^{s}=1$ for all $i\in\mathcal{J}$; hence the term $1-R_{ii}^{s}$ is equal to 0. Similarly, constraints (40) and (41) can be rewritten as
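The construction of $A_i^s$ and the identity $\sum_i A_i^s v_i^s=\mathbf{r}^s$ can be sketched as follows; `flow_matrix` is a hypothetical helper, checked here on random data, not the paper's implementation.

```python
import numpy as np

def flow_matrix(i, R, J, K0):
    """Sketch of A_i^s for v_i = [y_i, T_i1, ..., T_i(J+K0), x_i, u_i]."""
    A = np.zeros((J, J + K0 + 3))
    A[i, 0] = 1.0                      # coefficient of y_i in row i
    A[i, 1:J + K0 + 1] = 1.0           # outgoing transmissions T_ij
    A[i, 1 + i] = 1.0 - R[i, i]        # the self-link T_ii cancels since R_ii = 1
    for j in range(J):
        if j != i:
            A[j, 1 + j] = -R[i, j]     # node j receives R_ij * T_ij
    return A

# Verify that sum_i A_i v_i reproduces the flow-conservation left-hand side.
rng = np.random.default_rng(0)
J, K0 = 3, 1
R = rng.uniform(0.1, 1.0, (J, J + K0))
np.fill_diagonal(R[:, :J], 1.0)
T = rng.uniform(0.0, 1.0, (J, J + K0))
y = rng.uniform(0.0, 1.0, J)
v = [np.concatenate(([y[i]], T[i], [0.0, 0.0])) for i in range(J)]
lhs = sum(flow_matrix(i, R, J, K0) @ v[i] for i in range(J))
print(lhs)
```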

\[
\sum_{i\in\mathcal{J}}B_{i}^{s}v_{i}^{s}=a\mathbf{1},\qquad\sum_{i\in\mathcal{J}}(C_{i}^{s})^{\top}v_{i}^{s}=0,
\]

using appropriate matrices $B_{i}^{s}$ and vectors $C_{i}^{s}$ for every node $i\in\mathcal{J}$, where $\mathbf{1}$ is a vector of all ones. Note that, since nodes can share information only with their neighbors, one can enforce the equality of the proportion variable only between neighboring nodes and rewrite constraint (42) as follows:

\[
x_{i}^{s}=x_{j}^{s},\quad i\in\mathcal{J},\ j\in\mathcal{N}^{s}(i), \qquad (46)
\]

where $\mathcal{N}^{s}(i)$ is the set of nodes within the communication range of node $i\in\mathcal{J}$ in scenario $s\in\mathcal{S}$. If the network is connected, constraint (46) forces all $x_{i}^{s}$ to be equal to each other and thus ensures the consistency and uniqueness of the proportion variable.
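Whether the pairwise constraints (46) propagate equality to the whole team is exactly a graph-connectivity question, which a breadth-first search answers. A small sketch on hypothetical adjacency lists:

```python
from collections import deque

def is_connected(adj):
    """Breadth-first search over adjacency lists adj[i] (robot-robot links only);
    returns True iff every robot is reachable from robot 0."""
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(adj)

# A connected 4-robot chain versus a network split into two isolated pairs.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
split = {0: [1], 1: [0], 2: [3], 3: [2]}
print(is_connected(chain), is_connected(split))
```

On the `split` network, constraints (46) would only equalize the proportion variables within each component, which is why connectivity is assumed in the experiments below.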

We assume that the team of robots works on a square map given by the corner points with relative coordinates $(0,0)$ and $(2,2)$. The spatial distribution of the available information to be gathered follows a normal distribution with expected value $\mathcal{C}=(0.5,1.75)$, located in the upper left corner of the map. The network consists of 50 robots and 4 potential locations of the reporting points. We generated 200 scenarios of different spatial configurations of the robots. The four potential locations of the reporting points are fixed at the positions $(0.5,0.3)$, $(1.5,0.25)$, $(1.75,0.5)$, $(1,0.2)$. The rate function $R_{ij}^{s}$ depends on the distance between the nodes in the network and is defined as:

\[
R_{ij}^{s}=\begin{cases}
1, & \text{if }\|d_{ij}^{s}\|\leq\ell,\\
a\|d_{ij}^{s}\|^{3}+b\|d_{ij}^{s}\|^{2}+c\|d_{ij}^{s}\|+e, & \text{if }\ell<\|d_{ij}^{s}\|\leq u,\\
0, & \text{if }\|d_{ij}^{s}\|>u,
\end{cases}
\]

where $\|d_{ij}^{s}\|$ is the distance between the nodes in scenario $s\in\mathcal{S}$. We set $\ell=0.3$ and $u=0.6$, and the values $a,b,c$ and $e$ are chosen so that $R_{ij}^{s}$ is a continuous function. This function is commonly used in the literature; see, e.g., [32]. The information $r_{i}^{s}$ gathered by robot $i$ in scenario $s$ depends on the robot's position relative to the expected value $\mathcal{C}$ given above. In our experiments, $r_{i}^{s}$ is calculated as follows:

\[
r_{i}^{s}=\frac{w}{2\pi|\Sigma|^{1/2}}\,e^{-\frac{1}{2}(d_{i}-\mathcal{C})^{\top}\Sigma^{-1}(d_{i}-\mathcal{C})},
\]

where $d_{i}$ is the position of robot $i\in\mathcal{J}$, $w$ is a scaling factor, and $\Sigma$ is a covariance matrix, which we keep fixed for all experiments.
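Returning to the rate function defined above: continuity at $\ell$ and $u$ gives only two conditions for the four coefficients $a,b,c,e$. The sketch below additionally matches the derivatives at the breakpoints, $R'(\ell)=R'(u)=0$ (a common smoothness assumption, not stated in the text), which pins all four down via a small linear system.

```python
import numpy as np

# Cubic segment of the rate function on (l, u]: fit a*d^3 + b*d^2 + c*d + e.
l, u = 0.3, 0.6
M = np.array([[l**3,   l**2, l,   1.0],   # R(l)  = 1  (continuity at l)
              [u**3,   u**2, u,   1.0],   # R(u)  = 0  (continuity at u)
              [3*l**2, 2*l,  1.0, 0.0],   # R'(l) = 0  (assumed smoothness)
              [3*u**2, 2*u,  1.0, 0.0]])  # R'(u) = 0  (assumed smoothness)
a, b, c, e = np.linalg.solve(M, np.array([1.0, 0.0, 0.0, 0.0]))

def rate(d):
    """Piecewise rate function R_ij as defined in the text."""
    if d <= l:
        return 1.0
    if d <= u:
        return a*d**3 + b*d**2 + c*d + e
    return 0.0

print(rate(0.3), rate(0.45), rate(0.6))  # 1 at l, 0 at u, 1/2 at the midpoint
```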

Figure 1: Communication network of 50 robots and 4 reporting points in one scenario. The source is located in the upper left corner. The lighter color (yellow) and the darker color (purple) indicate higher and lower rates of information generation, respectively. (a) The initial spatial configuration of the robots (green) and the reporting points (blue); the lines show communication links, and their thickness indicates the rate of connection $R_{ij}$ between nodes $i$ and $j$. (b) The optimal routing decisions between nodes when the risk is applied to the total loss of all nodes; blue nodes are selected reporting points, red nodes are not selected. (c) The optimal routing decisions between nodes when the individual risks of the nodes are aggregated; blue nodes are selected reporting points, red nodes are not selected.

Comparison of aggregation methods. We solved the optimization problem using two different aggregation methods:

  • aggregate first: using the proposed multivariate measures of risk, we aggregate the individual losses of the robots with a fixed scalarization $w$: for each scenario we calculate $V^{s}=\sum_{i\in\mathcal{J}}w_{i}q_{i}^{s}$ and evaluate its risk $\varrho[V]$ by several scalar-valued measures of risk;

  • evaluate first: we evaluate the individual risk of every robot across all scenarios, $V_{i}=\varrho_{i}[q_{i}]$, and then aggregate these values into $\varrho_{S}[V]$ using the two examples of nonlinear aggregation shown in Section 4.2.1.

We solve the problem using a linear scalarization vector $w$ with equal weights $w_{i}=\frac{1}{J}$ for all $i\in\mathcal{J}$, $c=[0.8,0.2]$, and $\operatorname{AVaR}_{\alpha}(\cdot)$ for the three values $\alpha=0.1,0.2,0.3$.
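For equiprobable scenarios, the aggregate-first evaluation with $\operatorname{AVaR}_\alpha$ reduces to a one-dimensional minimization over the sample points, since the objective in (47) is piecewise linear and convex in $\eta$. A sketch with hypothetical per-scenario losses of three robots:

```python
import numpy as np

def avar(v, alpha):
    """AVaR_alpha of a discrete loss sample with equal probabilities (assumption)."""
    v = np.sort(np.asarray(v))
    # Evaluate eta -> eta + E[(V - eta)_+]/alpha at each sample point; the
    # infimum in (47) is attained at one of these kinks for alpha in (0, 1].
    vals = [eta + np.mean(np.maximum(v - eta, 0.0)) / alpha for eta in v]
    return min(vals)

# Hypothetical losses q_i^s: 4 equally likely scenarios (rows), 3 robots (columns).
q = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.1, 0.1],
              [0.9, 0.8, 0.7],
              [0.2, 0.2, 0.2]])
w = np.ones(3) / 3
V = q @ w                 # aggregate first: V^s = sum_i w_i * q_i^s
print(avar(V, 0.25))      # average of the worst 25% of the aggregated losses
```

With $\alpha=1$ the formula recovers the expected value of $V$, which is a quick sanity check on the implementation.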

Figure 2: (a) Proportion of information delivered to the reporting points using Mean-AVaR at $\alpha=0.1$. (b) Proportion of information delivered to the reporting points using Mean-Upper-Semideviation of order 1. (c) Comparison of the risk values for the two aggregation methods.

The setup of the communication network problem and the optimal solutions in one of the scenarios for both methods are shown in Fig. 1. One can notice that, depending on the aggregation method used, the set of optimal reporting points may differ. The distribution of the proportion $x$ of information delivered to the reporting points for the two methods is shown in Fig. 2. It can be seen that more information is delivered to the reporting points when we first aggregate the losses of the robots and then evaluate the risk. This observation is also reflected in the risk values of both methods: imposing a risk measure on a linear scalarization of the individual losses results in smaller values than aggregating the individual risks.

Using the optimal values of the decision variables, we can calculate $\operatorname{MAVaR}$ and $\operatorname{VMAVaR}$ to compare their values with the AVaR applied to the linear scalarization of the random cost. The following formulas were used to calculate the values:

\[
\begin{aligned}
&\operatorname{AVaR}_{\alpha}(V)=\inf_{\eta\in\mathbb{R}}\bigg\{\eta+\frac{1}{\alpha}\mathbb{E}\Big[(w^{\top}q-\eta)_{+}\Big]\bigg\} &&(47)\\
&\operatorname{MAVaR}_{\alpha}(V)=\mathbb{E}\Big[w^{\top}q~\big|~q\in\mathcal{Z}_{1-\alpha}\Big] &&(48)\\
&\operatorname{VMAVaR}_{\alpha}(V)=\inf_{\eta\in\mathbb{R}^{2}}\bigg\{w^{\top}\eta+\frac{1}{\alpha}\mathbb{E}\Big[w^{\top}(V-\eta)_{+}\Big]:\Pr[V\leq\eta]>1-\alpha\bigg\} &&(49)
\end{aligned}
\]

The values of $\operatorname{AVaR}$, $\operatorname{MAVaR}$ and $\operatorname{VMAVaR}$ are shown in Table 1. It can be seen that $\operatorname{AVaR}_{\alpha}(V)$ results in smaller values than $\operatorname{MAVaR}_{\alpha}(V)$ and $\operatorname{VMAVaR}_{\alpha}(V)$ at all confidence levels $\alpha$, as was shown theoretically in Section 4. Those measures of risk are computationally very demanding and not amenable to the type of decision problems we are considering. Hence, we only compare their values at the decision obtained via our proposed method.

$\varrho$ \ $\alpha$                 0.1      0.2      0.3
$\operatorname{AVaR}_{\alpha}$       0.1429   0.1389   0.1364
$\operatorname{MAVaR}_{\alpha}$      0.1992   0.1693   0.1634
$\operatorname{VMAVaR}_{\alpha}$     0.1740   0.1622   0.1553

Table 1: Comparison of $\operatorname{AVaR}$, $\operatorname{MAVaR}$ and $\operatorname{VMAVaR}$ values for $\alpha=0.1,0.2,0.3$.

When we solve the problem in a distributed way, we use a smaller network consisting of 20 robots and 4 reporting points in a $1.5\times 1.5$ square over 100 scenarios. It is assumed that the network is connected in all possible scenarios; that is, every node has at least one neighbor within the communication range, and all nodes are connected to the reporting points through multiple hops. This assumption is necessary for the proper calculation of the proportion of information delivered to the reporting points: if one of the nodes is isolated from the network, the rest of the group converges to a solution that does not take the isolated node's contribution into account. The problem is solved both in a centralized and in a distributed way, and the results for one of the scenarios are shown in Fig. 3. As can be seen in Fig. 3(b), the nodes converge to the centralized solution of the proportion of information delivered to the reporting points.

Figure 3: (a) Evolution of the sum of local losses $\sum_{i\in\mathcal{J}}y_{i}^{s}$ (blue) and of the proportion-constraint residual $\sum_{i\in\mathcal{J}}\big(x_{i}^{s}r_{i}^{s}-\sum_{k\in\mathcal{K}}R_{ik}^{s}T_{ik}^{s}\big)$ (red) in scenario $s$. (b) Convergence of the robots' proportion variables $x_{i}^{s}$ to the centralized solution in scenario $s$.

6 Conclusions

Our contributions can be summarized as follows. We propose a sound axiomatic approach to measures of risk for distributed systems. We show that several classes of non-trivial measures satisfying the axioms can be constructed. These measures can be calculated efficiently and are less conservative than most other systemic measures of risk. The class of measures proposed in Section 3.3 goes beyond the popular approach of first evaluating the risk of the individual agents and then aggregating.

We have devised a distributed method for solving risk-averse two-stage problems with monotropic structure, which works for any measure of risk, not only for those representable as an expected value. The construction is quite general and could serve as a template for devising other distributed methods for problems with systemic measures of risk.

We demonstrate the viability of the proposed framework on a non-trivial two-stage problem involving wireless communication. The numerical experiments confirm the theoretical observations and show the advantage of the proposed approach to risk aggregation in distributed systems.

In conclusion, the advantage of the new approach lies in its good balance of robustness to uncertainty, optimality of the loss functions involved, and efficiency of the numerical operation.

References

  • [1] Çağın Ararat and Birgit Rudloff. Dual representations for systemic risk measures. Mathematics and Financial Economics, 14:139–174, 2020.
  • [2] Philippe Artzner, Freddy Delbaen, Jean-Marc Eber, and David Heath. Coherent measures of risk. Mathematical Finance, 9(3):203–228, July 1999.
  • [3] Francesca Biagini, Jean-Pierre Fouque, Marco Frittelli, and Thilo Meyer-Brandis. A Unified Approach to Systemic Risk Measures via Acceptance Sets. 2015.
  • [4] Markus Brunnermeier and Patrick Cheridito. Measuring and Allocating Systemic Risk. Risks, 7(2), 2019.
  • [5] Christian Burgert and Ludger Rüschendorf. Consistent risk measures for portfolio vectors. Insurance: Mathematics and Economics, 38(2):289–297, 2006.
  • [6] Nikolaos Chatzipanagiotis, Darinka Dentcheva, and Michael Zavlanos. An augmented Lagrangian method for distributed optimization. Mathematical Programming, 152, 2014.
  • [7] Chen Chen, Garud Iyengar, and C. Ciamac Moallemi. An Axiomatic Approach to Systemic Risk. Management Science, 59(6):1373–1388, 2013.
  • [8] Freddy Delbaen. Coherent risk measures on general probability spaces. Advances in Finance and Stochastics, pages 1–37, March 2000.
  • [9] Darinka Dentcheva, Bogumila Lai, and Andrzej Ruszczyński. Dual methods for probabilistic optimization. Mathematical Methods of Operations Research, 60:331–346, 2004.
  • [10] Ivar Ekeland and Walter Schachermayer. Law invariant risk measures on $\mathcal{L}^{\infty}(\mathbb{R}^{d})$. Statistics & Risk Modeling, 28(3):195–225, 2011.
  • [11] Eduard Kromer, Ludger Overbeck, and Katrin Zilch. Systemic Risk Measures on General Measurable Spaces. 2016.
  • [12] Zachary Feinstein, Birgit Rudloff, and Stefan Weber. Measures of Systemic Risk. SIAM Journal on Financial Mathematics, 8(1):672–708, Jan 2017.
  • [13] Hans Föllmer and Alexander Schied. Convex measures of risk and trading constraints. Finance and Stochastics, 6:429–447, 2002.
  • [14] Hans Föllmer and Alexander Schied. Stochastic Finance: An Introduction in Discrete Time, 3rd Edition. Walter De Gruyter, 2011.
  • [15] Sıtkı Gülten and Andrzej Ruszczyński. Two-stage portfolio optimization with higher-order conditional measures of risk. Annals of Operations Research, 229:409–427, 06 2015.
  • [16] Andreas Hamel and Frank Heyde. Duality for set-valued measures of risk. SIAM Journal on Financial Mathematics, 1(1):66–95, 2010.
  • [17] Elyes Jouini, Moncef Meddeb, and Nizar Touzi. Vector-valued coherent risk measures. Finance and Stochastics, 8(4):531–552, 2004.
  • [18] M. Kijima and M. Ohnishi. Mean-risk analysis of risk aversion and wealth effects on optimal portfolios with multiple investment opportunities. Ann. Oper. Res., 45:147–163, 1993.
  • [19] Jinwook Lee and András Prékopa. Properties and calculation of multivariate risk measures: MVaR and MCVaR. Annals of Operations Research, 211:225–254, 2013.
  • [20] J. Leitner. A short note on second-order stochastic dominance preserving coherent risk measures. Mathematical Finance, 15:649–651, 2005.
  • [21] Wann-Jiun Ma, Chanwook Oh, Yang Liu, Darinka Dentcheva, and Michael M. Zavlanos. Risk-averse access point selection in wireless communication networks. IEEE Transactions on Control of Network Systems, 6(1):24–36, 2019.
  • [22] Merve Meraklı and Simge Küçükyavuz. Vector-Valued Multivariate Conditional Value-at-Risk. Operations Research Letters, 46(3):300–305, 2018.
  • [23] Nilay Noyan and Gábor Rudolf. Optimization with multivariate conditional value-at-risk constraints. Operations research, 61(4):990–1013, 2013.
  • [24] Georg Pflug and Alois Pichler. Systemic risk and copula models. Central European Journal of Operations Research, 26:465–483, 2018.
  • [25] G.Ch. Pflug and W. Römisch. Modeling, Measuring and Managing Risk. World Scientific, Singapore, 2007.
  • [26] András Prékopa. Multivariate Value at Risk and Related Topics. Annals of Operations Research, 193:49–69, 2012.
  • [27] R Tyrrell Rockafellar and Stan Uryasev. The fundamental risk quadrangle in risk management, optimization and statistical estimation. Surveys in Operations Research and Management Science, 18(1-2):33–53, 2013.
  • [28] Ludger Rüschendorf. Mathematical Risk Analysis. Springer, 2015.
  • [29] A. Ruszczyński and A. Shapiro. Optimization of risk measures. In G. Calafiore and F. Dabbene, editors, Probabilistic and Randomized Methods for Design under Uncertainty, pp. 117–158, Springer-Verlag, London, 2005.
  • [30] A. Ruszczyński and A. Shapiro. Optimization of convex risk functions. Mathematics of Operations Research, 31:433–452, 2006.
  • [31] Alexander Shapiro, Darinka Dentcheva, and Andrzej Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory. Society for Industrial & Applied Mathematics (SIAM), 2009.
  • [32] Michael M. Zavlanos, Alejandro Ribeiro, and George J. Pappas. Network integrity in mobile robotic networks. IEEE Transactions on Automatic Control, 58(1):3–18, 2013.