
Variance Contracts

Yichun Chi China Institute for Actuarial Science, Central University of Finance and Economics, Beijing 102206, China. Email: yichun@cufe.edu.cn.    Xun Yu Zhou Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA. Email: xz2574@columbia.edu.    Sheng Chao Zhuang Department of Finance, University of Nebraska-Lincoln, NE, USA. Email: szhuang3@unl.edu.
Abstract

We study the design of an optimal insurance contract in which the insured maximizes her expected utility and the insurer limits the variance of his risk exposure while maintaining the principle of indemnity and charging the premium according to the expected value principle. We derive the optimal policy semi-analytically, which is coinsurance above a deductible when the variance bound is binding. This policy automatically satisfies the incentive-compatible condition, which is crucial to rule out ex post moral hazard. We also find that the deductible is absent if and only if the contract pricing is actuarially fair. Focusing on the actuarially fair case, we carry out comparative statics on the effects of the insured’s initial wealth and the variance bound on insurance demand. Our results indicate that the expected coverage is always larger for a wealthier insured, implying that the underlying insurance is a normal good, which supports certain recent empirical findings. Moreover, as the variance constraint tightens, the insured who is prudent cedes less losses, while the insurer is exposed to less tail risk.

Key-words: Insurance design; expected value principle; variance; incentive compatibility; comparative statics.

1 Introduction

Insurance is an efficient mechanism to facilitate risk reallocation between two parties. Borch (1960) was the first to study the insurance contract design problem and to prove that given a fixed premium, a stop-loss (or deductible) insurance policy (i.e., full coverage above a deductible) achieves the smallest variance of the insured's share of payment. [Footnote 1: Borch (1960) presents the problem in a reinsurance setting, in which the ceding insurer corresponds to the insured in an insurance setting.] Arrow (1963) assumes that the premium is calculated by the expected value principle (i.e., the insurance cost is proportional to the expected indemnity) and imposes the principle of indemnity (i.e., the insurer's reimbursement is non-negative and smaller than the loss). Under these specifics, Arrow (1963) shows that the stop-loss insurance is Pareto optimal between a risk-neutral insurer and a risk-averse insured. This is a foundational result that has earned the name of Arrow's theorem of the deductible in the literature. Mossin (1968) further proves that the deductible is strictly positive if and only if the insurance price is actuarially unfair (i.e., the safety loading is strictly positive).

In Arrow (1963), the insurer is assumed to be risk-neutral. This is based on the assumption that the insurer has a sufficiently large number of independent and homogeneous insureds, such that his risk, by the law of large numbers, is sufficiently diversified to be nearly zero. This kind of theoretically ideal situation hardly occurs in practice, even if the insurer does indeed have a huge number of clients. Moreover, it does not apply to tailor-made contracts for insuring one-off events (e.g., the shipment of a highly valuable painting). Theorem 2 in Arrow (1971) stipulates that, when the insurer is risk-averse and insurance cost is absent, an optimal contract must involve coinsurance. Raviv (1979) extends this result to include nonlinear insurance costs and shows that an optimal policy involves both a deductible and coinsurance. Much of the recent research in this area has focused on the insurer's tail risk exposure. Cummins and Mahul (2004) and Zhou et al. (2010) extend Arrow's model by introducing an exogenous upper bound on indemnity and thereby limiting the insurer's liability with respect to catastrophic losses. From a regulatory perspective, Zhou and Wu (2008) propose a model in which the insurer's expected loss above a prescribed level is controlled, and they conclude that an optimal policy is generally piecewise linear. Doherty et al. (2015) investigate the case in which losses are nonverifiable and deduce that a contract with a deductible and an endogenous upper limit is optimal.

As important as tail risks are for both parties in insurance contracts, in practice insurers are also concerned with other parts of the loss distribution. Kaye (2005), in a Casualty Actuarial Society report, writes

“Different stakeholders have different levels of interest in different parts of the distribution - the perspective of the decision-maker is important. Regulators and rating agencies will be focused on the extreme downside where the very existence of the company is in doubt. On the other hand, management and investors will have a greater interest in more near-term scenarios towards the middle of the distribution and will focus on the likelihood of making a profit as well as a loss” (p. 4).

Assuming the insurer to be risk-averse with a concave utility function indeed takes the whole risk distribution into consideration, as studied in Arrow (1971) and Raviv (1979). However, there are notable drawbacks to the utility function approach. The notion of utility is opaque for many non-specialists, and the benefit–risk tradeoff is only implicit in the utility function. Moreover, one can rarely obtain optimal policies analytically under a general utility, which hinders post-optimality analyses such as comparative statics. For instance, Raviv (1979) derives a differential equation satisfied by the optimal indemnity, which takes a rather complex form depending on the utility function used.

By contrast, variance, as a measure of risk originally put forth in Markowitz's pioneering work (Markowitz, 1952), is also related to the whole distribution, yet it is more intuitive and transparent. Borch (1960) designs a contract that aims to minimize the variance of the insured's liability. Kaluszka (2001) extends Borch (1960)'s work by incorporating a variance-related premium principle and shows that the optimal contract to minimize the variance of the insured's payment can be stop-loss, quota share (i.e., the insurer covers a constant proportion of the loss) or a combination of the two. Vajda (1962) studies the problem from the insurer's perspective, and shows that a quota share policy minimizes the variance of indemnity in an actuarially fair contract. However, his result depends critically on limiting the admissible contracts to be such that the ratio between indemnity and loss increases as the loss increases, a feature that enables the derivation of a solution through rather simple calculus. [Footnote 2: Vajda (1962) claims that this feature "agrees with the spirit of (re)insurance, at least in most cases" (p. 259). However, for a larger loss, it is indeed in the spirit of insurance that the insurer should pay more, but it is not clear why he should be responsible for a higher proportion. Interestingly, our results will show that the optimal policies of our model possess this property if the insured is prudent; see Corollary 3.9.]

In this paper, we revisit the work of Arrow (1963) by imposing a variance constraint on the insurer's risk exposure. Unlike Vajda (1962), we consider the general actuarially unfair case and remove the restriction that the proportion of the insurer's payment increases with the size of the loss. The presence of the variance constraint causes substantial technical challenges in solving the problem. In the literature, there are generally two approaches to studying variants of Arrow's model: sample-wise optimization and stochastic orders. However, the former fails to work for our problem due to the nonlinearity of the variance constraint, and the latter is not readily applicable either because the presence of the variance constraint invalidates the claim that any admissible contract is dominated by a stop-loss one. The first contribution of this paper is methodological: we develop a new approach by combining the techniques of stochastic orders, calculus of variations and Lagrangian duality to derive optimal insurance policies. The solutions are semi-analytical in the sense that they can be computed by solving some algebraic equations (as opposed to differential equations in Raviv 1979).

Because the expected value premium principle ensures the expected profit of the insurer, our model is essentially a mean–variance model à la Markowitz for the insurer. Our second contribution is actuarial: we show that the optimal contract is coinsurance above a deductible when the variance constraint is binding. Moreover, the deductible disappears if and only if the insurance price is actuarially fair, consistent with Mossin's Theorem (Mossin, 1968). These results are qualitatively similar to those of Raviv (1979), who uses a concave utility function for the insurer. A natural question is why one would bother to study the mean–variance version of a problem that would generate contracts with similar characteristics to its expected utility counterpart. This question can be answered in the same way as in the field of financial portfolio selection, where there is an enormously large body of study on the Markowitz mean–variance model along with its popularity in practice, despite the existence of the equally well-studied expected utility maximization models. In other words, expected utility and mean–variance are two different frameworks, and, as argued earlier, the latter reflects a more transparent and explicit return–risk tradeoff, which usually leads to explicit solutions.

Our optimal policies involve coinsurance, which is widely utilized in the insurance industry. As pointed out by Raviv (1979), risk aversion on the part of the insurer could be a cause for coinsurance, but other attributes such as the nonlinearity of the insurance cost function could also lead to coinsurance. Another explanation for coinsurance is to mitigate the moral hazard risk; see Holmström (1979) and Dionne and St-Michel (1991). From the insured's perspective, Doherty and Schlesinger (1990) argue that default risk of the insurer can motivate the insured to choose coinsurance. Picard (2000) also shows that coinsurance is optimal in order to reduce the risk premium paid to the auditor. In this paper, we prove that optimal policies can turn from full insurance to coinsurance as the variance bound tightens, thereby providing a novel yet simple reason for the prevalent feature of coinsurance in insurance theory and practice: a variance bound on the insurer's risk exposure.

Intriguingly, our optimal insurance policies automatically satisfy the so-called incentive-compatible condition that both the insured and the insurer pay more for a larger loss (or, equivalently, the marginal indemnity is between 0 and 1). [Footnote 3: The incentive-compatible condition is termed the no-sabotage condition in Carlier and Dana (2003).] In Arrow (1963)'s setting, the optimal contract – the stop-loss one – turns out to be incentive-compatible; however, this is generally untrue. Gollier (1996) considers an insured facing an additional background risk that is not insurable. Under the expected value principle, he discovers that the optimal insurance, which relies heavily on the dependence between the background risk and the loss, may render the marginal indemnity strictly larger than 1. Bernard et al. (2015) generalize the insured's risk preference from expected utility to rank-dependent utility involving probability distortion (weighting), and also find that the optimal indemnity may decrease when the loss increases. In both of these papers, the derived optimal contracts would incentivize the insured to misreport the actual losses, leading to ex post moral hazard. Equally absurd would be the case in which the insurer pays less for a larger loss. To address this issue, Huberman et al. (1983) propose the incentive-compatible condition as a hard constraint on admissible insurance policies, in addition to the principle of indemnity. Xu et al. (2019) add this constraint to the model of Bernard et al. (2015), painstakingly developing a completely different approach in order to overcome the difficulty arising out of this additional constraint and deriving qualitatively very different contracts. On the other hand, Raviv (1979) discovers that his optimal solution is incentive-compatible, assuming that the loss has a strictly positive probability density function.
Carlier and Dana (2005) use a Hardy–Littlewood rearrangement argument to prove that any optimal contract is dominated by an incentive-compatible contract, establishing the optimality of the latter. However, their approach relies heavily on the assumption that the loss is non-atomic. Both of these studies rule out the important and practical case in which the loss is atomic at 0. By contrast, in the presence of the variance constraint, we show that the optimal policy is naturally incentive-compatible even without the corresponding hard constraint or the assumption of an atom-less loss.

Our final contribution is a comparative statics analysis examining the respective effects of the insured's wealth and the variance constraint on insurance demand under actuarially fair pricing. Our results indicate that the presence of a variance bound fundamentally changes the insurance strategy – it makes the insured's wealth relevant and it changes the way in which the two parties share the risk. In particular, the expected coverage is always larger for a wealthier insured who has strictly decreasing absolute prudence (DAP), rendering the insurance product a normal good. This finding provides some theoretical foundation for the empirical observations of Millo (2016) and Armantier et al. (2018). Moreover, we show that the insurer has less downside risk when contracting with a wealthier insured with strictly DAP. [Footnote 4: Menezes et al. (1980) introduce the notion of downside risk to compare two risks with the same mean and variance. The formal definition is given in Section 2.2.] This result reconciles with the well-documented phenomenon that more economically advanced regions or countries have higher insurance densities and penetrations. On the other hand, we establish that the variance bound significantly changes a prudent insured's risk transfer decision – she would consistently transfer more losses as the variance bound loosens. A corollary of this result is, rather surprisingly, that the insurer can reduce the tail risk by simply tightening the variance constraint. This suggests that our variance contracts do, after all, address the issue of tail exposure.

The rest of the paper proceeds as follows. In Section 2 we formulate the problem and present some preliminaries about risk preferences. In Section 3 we develop the solution approach and present the optimal insurance contracts. In Section 4 we conduct a comparative analysis by examining the effects of the insured’s initial wealth and the variance constraint on insurance demand. Section 5 concludes the paper. Some auxiliary results and all proofs are relegated to the appendices.

2 Problem Formulation and Preliminaries

2.1 Variance contracts formulation

An insured endowed with an initial wealth $w_0$ faces an insurable loss $X$, which is a non-negative, essentially bounded random variable defined on a probability space $(\Omega,\mathscr{F},\mathbb{P})$ with the cumulative distribution function (c.d.f.) $F_X(x):=\mathbb{P}\{X\leqslant x\}$ and the essential supremum $\mathcal{M}<\infty$. An insurance contract design problem is to partition $X$ into two parts, $I(X)$ and $X-I(X)$, where $I(X)$ (the indemnity) is the portion of the loss that is ceded to the insurer ("he") and $R_I(X):=X-I(X)$ (the retention) is the portion borne by the insured ("she"). $I$ and $R_I$ are also called the insured's ceded and retained loss functions, respectively. It is natural to require a contract to satisfy the principle of indemnity, namely that the indemnity be non-negative and no greater than the loss. Thus, the feasible set of indemnity functions is

$\mathfrak{C}:=\left\{I:0\leqslant I(x)\leqslant x,\,\forall x\in[0,\mathcal{M}]\right\}.$

As the insurer covers part of the loss for the insured, he is compensated by collecting the premium from her. Following many studies in the literature, we assume that the insurer calculates the premium using the expected value principle. Specifically, the premium on making a non-negative random payment $Y$ is charged as

$\pi(Y)=(1+\rho)\,\mathbb{E}[Y],$

where $\rho\geqslant 0$ is the so-called safety loading coefficient. His risk exposure under a contract $I$ for a loss $X$ is hence

$e_I(X)=I(X)-\pi(I(X)).$

The insurer may evaluate this risk using different measures for different purposes, as Kaye (2005) notes. In this paper, we assume that the insurer has sufficient regulatory capital and therefore focuses on the volatility of the underwriting risk. Specifically, he uses the variance to measure the risk and requires

$var[e_I(X)]\equiv var[I(X)]\leqslant\nu$

for some prescribed $\nu>0$.

On the other hand, denote by $W_I(X)$ the insured's final wealth under contract $I$ upon its expiration, namely

$W_I(X)=w_0-X+I(X)-\pi(I(X)).$

The insured's risk preference is characterized by a von Neumann–Morgenstern utility function $U$ satisfying $U'>0$ and $U''<0$.

Our optimal contracting problem is, therefore,

$\max_{I\in\mathfrak{C}}\ \mathbb{E}[U(W_I(X))]\quad\text{subject to}\quad var[e_I(X)]\leqslant\nu.$   (2.1)

Note that this model reduces to Arrow (1963)'s model

$\max_{I\in\mathfrak{C}}\ \mathbb{E}[U(W_I(X))]$

by setting the upper bound $\nu$ to be $\mathbb{E}[X^2]$. This is because $var[e_I(X)]=var[I(X)]\leqslant\mathbb{E}[I(X)^2]\leqslant\mathbb{E}[X^2]$ for all $I\in\mathfrak{C}$, since $0\leqslant I(X)\leqslant X$.

In Problem (2.1), the insured's benefit–risk consideration is captured by the utility function $U$, whereas the insurer's return–risk tradeoff is reflected by the "mean" (the expected value principle) and the "variance" (the variance bound). One may interpret the problem as one faced by an insurer who likes to design a contract with the best interest of a representative insured in mind, so as to remain marketable and competitive, while maintaining the desired profitability and variance control in the mean–variance sense. [Footnote 5: Representative insureds in different wealth classes or different regions may have different "typical" levels of initial wealth. Moreover, when the economy grows, a representative insured's initial wealth may change substantially. As shown in Subsection 4.1, the change in the insured's initial wealth may affect her demand for insurance.] Problem (2.1) can also model a tailor-made contract design for insuring a one-off event from an insured's perspective. The insured aims to maximize her expected utility while accommodating the insurer's participation constraint reflected by the mean and variance specifications.
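The ingredients of Problem (2.1) are straightforward to evaluate numerically for any candidate contract. The sketch below uses entirely hypothetical inputs (a simulated lognormal loss, a CARA utility, and illustrative values of $w_0$, $\rho$ and $\nu$) to compute the premium under the expected value principle, the insured's expected utility, and the insurer's variance of risk exposure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: simulated loss X (lognormal, capped at M = 30),
# initial wealth w0, safety loading rho, and variance bound nu.
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=100_000), 30.0)
w0, rho, nu = 20.0, 0.2, 4.0

def U(x):
    # CARA utility, chosen only for illustration; the paper allows any U' > 0, U'' < 0.
    return -np.exp(-0.1 * x)

def evaluate(I):
    """Return (expected utility, var[e_I(X)]) for an indemnity function I."""
    ind = I(X)
    premium = (1.0 + rho) * ind.mean()      # expected value principle pi(I(X))
    wealth = w0 - X + ind - premium         # insured's final wealth W_I(X)
    return U(wealth).mean(), ind.var()      # note var[e_I(X)] = var[I(X)]

# A deductible contract I(x) = (x - d)_+ with an arbitrary d = 5 is
# admissible for Problem (2.1) iff its variance does not exceed nu.
eu_d, var_d = evaluate(lambda x: np.maximum(x - 5.0, 0.0))
```

Any indemnity in $\mathfrak{C}$ can be passed to `evaluate`; for instance, no-insurance (`lambda x: np.zeros_like(x)`) always satisfies the variance constraint, while full insurance (`lambda x: x`) leaves the insured with a deterministic wealth.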

2.2 Absolute risk aversion and prudence

The Arrow–Pratt measure of absolute risk aversion (Pratt, 1964; Arrow, 1965), defined as

$\mathcal{A}(x):=-\frac{U''(x)}{U'(x)},$

captures the dependence of the level of risk aversion on the agent's wealth $x$. If $\mathcal{A}(x)$ is decreasing [Footnote 6: Throughout the paper, the terms "increasing" and "decreasing" mean "non-decreasing" and "non-increasing," respectively.] in $x$, then the insured's risk preference is said to exhibit decreasing absolute risk aversion (DARA). The effect of an insured's initial wealth on the insurance demand under Arrow (1963)'s model has been widely studied in the literature. It is found that a wealthier DARA insured purchases a deductible insurance with a higher deductible. For a survey on how insureds' wealth impacts insurance, see e.g., Gollier (2001, 2013).

While risk aversion ($U''<0$) captures an insured's propensity for avoiding risk, prudence (i.e., $U'''>0$) reflects her tendency to take precautions against future risk. Many commonly used utility functions, including those with hyperbolic absolute risk aversion (HARA) and mixed risk aversion, are prudent. [Footnote 7: A utility function is called HARA if the reciprocal of the Arrow–Pratt measure of absolute risk aversion is a linear function, i.e., $\mathcal{A}_U(x)=-\frac{U''(x)}{U'(x)}=\frac{1}{px+q}$ for some $p\geqslant 0$ and $q$. It includes exponential, logarithmic and power utility functions as special cases. For further discussion of HARA, see Gollier (2001). A utility function is said to be of mixed risk aversion if $(-1)^n U^{(n)}(x)\leqslant 0$ for all $x$ and $n=1,2,3,\cdots$, where $U^{(n)}$ denotes the $n$th derivative of $U$.] Based on an experiment with a large number of subjects, Noussair et al. (2014) observe that the majority of individuals' decisions are consistent with prudence. Eeckhoudt and Kimball (1992) and Gollier (1996) take into account the insured's prudence in designing optimal insurance policies. The degree of absolute prudence is defined as

$\mathcal{P}(x):=-\frac{U'''(x)}{U''(x)}$   (2.2)

for a three-times differentiable utility function $U$. If $\mathcal{P}(x)$ is strictly decreasing in $x$, then the insured is said to exhibit strictly decreasing absolute prudence (DAP). Kimball (1990) shows that DAP characterizes the notion that wealthier people are less sensitive to future risks. Moreover, DAP implies DARA, as noted in Proposition 21 of Gollier (2001).

A term related to prudence is third-degree stochastic dominance (TSD), which was introduced by Whitmore (1970). A non-negative random variable $Z_1$ is said to dominate another non-negative random variable $Z_2$ in TSD if

$\mathbb{E}[Z_1]\geqslant\mathbb{E}[Z_2]\quad\text{and}\quad\int_0^x\int_0^y\big(F_{Z_2}(z)-F_{Z_1}(z)\big)\,\mathrm{d}z\,\mathrm{d}y\geqslant 0\ \text{ for all }x\geqslant 0.$

Equivalently, $Z_1$ dominates $Z_2$ in TSD if and only if $\mathbb{E}[u(Z_1)]\geqslant\mathbb{E}[u(Z_2)]$ for all functions $u$ satisfying $u'>0$, $u''<0$ and $u'''>0$. TSD has been widely employed for decision making in finance and insurance. For instance, Gotoh and Konno (2000) use it to study mean–variance optimal portfolio problems. If $Z_1$ dominates $Z_2$ in TSD and they have the same mean and variance, then $Z_1$ is said to have less downside risk than $Z_2$. In fact, the latter is equivalent to $\mathbb{E}[u(Z_1)]\geqslant\mathbb{E}[u(Z_2)]$ for any function $u$ with $u'''>0$; see Menezes et al. (1980).
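The TSD condition above can be checked numerically from samples by approximating the double integral of $F_{Z_2}-F_{Z_1}$ on a grid. A minimal sketch (the function name and the test distributions are our own illustrative choices, not from the paper):

```python
import numpy as np

def dominates_tsd(z1, z2, grid_n=4000):
    """Numerically test whether sample z1 dominates sample z2 in TSD by
    checking E[Z1] >= E[Z2] and the double integral of F_{Z2} - F_{Z1}
    on a grid (empirical-c.d.f. approximation)."""
    hi = max(z1.max(), z2.max())
    x = np.linspace(0.0, hi, grid_n)
    dx = x[1] - x[0]
    F1 = np.searchsorted(np.sort(z1), x, side="right") / z1.size
    F2 = np.searchsorted(np.sort(z2), x, side="right") / z2.size
    inner = np.cumsum(F2 - F1) * dx     # int_0^y (F2 - F1) dz
    outer = np.cumsum(inner) * dx       # int_0^x int_0^y (F2 - F1) dz dy
    return z1.mean() >= z2.mean() - 1e-9 and outer.min() >= -1e-4

# Example: a degenerate Z1 = 2 versus a uniform Z2 on [0, 4] -- same mean,
# Z2 riskier, so Z1 should dominate Z2 in TSD (Z1 has less downside risk).
z1 = np.full(10_000, 2.0)
z2 = np.linspace(0.0, 4.0, 10_000)
```

The tolerances (`1e-9`, `-1e-4`) absorb discretization error in the Riemann sums; tighter grids allow tighter tolerances.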

3 Optimal Contracts

In this section, we present our approach to solving Problem (2.1).

First, consider Problem (2.1) without the variance constraint:

$\max_{I\in\mathfrak{C}}\ \mathbb{E}[U(W_I(X))].$   (3.1)

This is the classical Arrow (1963)'s model, for which the optimal contract is a deductible one of the form $(x-d^*)_+$ for some non-negative deductible $d^*$, where $(x)_+:=\max\{x,0\}$. This contract automatically satisfies the incentive-compatible condition. Moreover, Chi (2019) (see Theorem 4.2 therein) was the first to derive an analytical form of the optimal deductible level $d^*$. More precisely, define

$VaR_{\frac{1}{1+\rho}}(X):=\inf\left\{x\in[0,\mathcal{M}]:F_X(x)\geqslant\frac{\rho}{1+\rho}\right\}$

and

$\varphi(d):=\frac{\mathbb{E}[U'(W_{(x-d)_+}(X))]}{U'(w_0-d-\pi((X-d)_+))},\quad 0\leqslant d<\mathcal{M},$

where $\inf\emptyset:=\mathcal{M}$ by convention. Then the optimal $d^*$ is

$d^*=\sup\left\{VaR_{\frac{1}{1+\rho}}(X)\leqslant d<\mathcal{M}:\,\varphi(d)\geqslant\frac{1}{1+\rho}\right\}\vee VaR_{\frac{1}{1+\rho}}(X),$   (3.2)

where $\sup\emptyset:=0$ and $x\vee y:=\max\{x,y\}$. [Footnote 8: The number $d^*$ can be computed easily numerically, because $\varphi(d)$ is decreasing over $[VaR_{\frac{1}{1+\rho}}(X),\mathcal{M})$; see Chi (2019).] This leads immediately to the following proposition.

Proposition 3.1.

If $\nu\geqslant var[(X-d^*)_+]$, then $I(x)=(x-d^*)_+$ is the optimal solution to Problem (2.1).

Intuitively, if the variance bound $\nu$ is set sufficiently high, then the variance constraint in Problem (2.1) is redundant and the problem reduces to the classical Arrow (1963)'s problem. Proposition 3.1 tells exactly and explicitly what the bound should be for the variance constraint to be binding.

Therefore, it suffices to solve Problem (2.1) for the case in which $\nu<var[(X-d^*)_+]$, which we now set as an assumption.

Assumption 3.1.

The variance bound $\nu$ satisfies $\nu<var[(X-d^*)_+]$.
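The deductible $d^*$ of (3.2) can be computed by simple bisection, since $\varphi(d)$ is decreasing over $[VaR_{\frac{1}{1+\rho}}(X),\mathcal{M})$. A sketch under assumed parameters (simulated lognormal loss, CARA utility; all values illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 30.0
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=200_000), M)  # hypothetical loss
w0, rho, a = 20.0, 0.2, 0.1      # assumed wealth, loading, CARA coefficient

def U_prime(z):
    return a * np.exp(-a * z)    # U'(z) for U(z) = -exp(-a z)

def phi(d):
    # phi(d) from the text, with W_{(x-d)_+}(X) = w0 - min(X, d) - pi((X-d)_+).
    prem = (1.0 + rho) * np.maximum(X - d, 0.0).mean()
    return U_prime(w0 - np.minimum(X, d) - prem).mean() / U_prime(w0 - d - prem)

var_level = np.quantile(X, rho / (1.0 + rho))    # approximates VaR_{1/(1+rho)}(X)

# Bisect for the last d in [VaR, M) with phi(d) >= 1/(1+rho), as in (3.2).
lo, hi = var_level, M
if phi(lo) < 1.0 / (1.0 + rho):
    d_star = var_level           # sup over the empty set is 0; take the max with VaR
else:
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) >= 1.0 / (1.0 + rho) else (lo, mid)
    d_star = lo
```

With these parameters the threshold crossing is interior, and one can then check Assumption 3.1 by comparing $var[(X-d^*)_+]$ against the chosen $\nu$.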

The main thrust of our solution method is to first restrict the analysis to a fixed level of expected indemnity and then optimize over that level. To this end, we need to first identify the range in which the optimal expected indemnity can possibly lie. Noting that $var[(X-d)_+]$ is strictly decreasing and continuous in $d$ over $[\mathrm{ess\,inf}\,X,\mathcal{M})$, we define

$d_L:=\inf\{d\geqslant d^*:var[(X-d)_+]\leqslant\nu\}\quad\text{and}\quad m_L:=\mathbb{E}[(X-d_L)_+].$

Intuitively, the insurer would demand a deductible higher than Arrow's level $d^*$ due to the additional risk control reflected by the variance constraint, and $d_L$ is the smallest deductible that makes this constraint binding.
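Because $var[(X-d)_+]$ is strictly decreasing and continuous in $d$, $d_L$ and $m_L$ can be found by bisection. A sketch with a simulated loss and assumed values of $\nu$ and $d^*$ (both illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 30.0
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=200_000), M)  # hypothetical loss
nu, d_star = 2.0, 1.5        # assumed variance bound and Arrow deductible

def tail_var(d):
    return np.maximum(X - d, 0.0).var()      # var[(X - d)_+]

# Bisection: maintain tail_var(lo) > nu (Assumption 3.1 at lo = d*) and
# tail_var(hi) <= nu (trivially at hi = M), shrinking to the crossing point.
lo, hi = d_star, M
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if tail_var(mid) > nu else (lo, mid)
d_L = hi
m_L = np.maximum(X - d_L, 0.0).mean()
```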

Lemma 3.2.

Under Assumption 3.1, for Problem (2.1), any admissible insurance policy $I$ with $\mathbb{E}[I(X)]\leqslant m_L$ is no better than the deductible contract $I_L$, where $I_L(x)=(x-d_L)_+$.

Therefore, we can rule out any contract whose expected indemnity is strictly smaller than $m_L$; in other words, $m_L$ is a lower bound of the optimal expected indemnity. In particular, no-insurance (i.e., $I^*(x)\equiv 0$) is never optimal under Assumption 3.1.

Next, we are to derive an upper bound of the optimal expected indemnity. Consider a loss-capped contract $X\wedge k$, where $x\wedge y:=\min\{x,y\}$ and $k\geqslant 0$, which pays the actual loss up to the cap $k$. [Footnote 9: A loss-capped contract is also called "full insurance up to a (policy) limit" or "full insurance with a cap."] Define

$K_U:=\inf\{k\geqslant 0:var[X\wedge k]\geqslant\nu\}\quad\text{and}\quad m_U:=\mathbb{E}[X\wedge K_U].$

In the above, $K_U$ is well-defined because $X\wedge k-\mathbb{E}[X\wedge k]$ is increasing in $k$ in the sense of convex order, according to Lemma A.2 in Chi (2012). [Footnote 10: A random variable $Y$ is said to be greater than a random variable $Z$ in the sense of convex order, denoted as $Z\leqslant_{cx}Y$, if $\mathbb{E}[Y]=\mathbb{E}[Z]$ and $\mathbb{E}[(Z-d)_+]\leqslant\mathbb{E}[(Y-d)_+]$ for all $d\in\mathbb{R}$, provided that the expectations exist. Obviously, $Z\leqslant_{cx}Y$ implies $var[Z]\leqslant var[Y]$.] Clearly, both $K_U$ and $m_U$ depend on the variance bound $\nu$. Since $var[X]\geqslant var[(X-d^*)_+]>\nu$, we have

$K_U<\mathcal{M},\quad m_U<\mathbb{E}[X]\quad\text{and}\quad var[X\wedge K_U]=\nu.$
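$K_U$ and $m_U$ can be computed the same way as $d_L$, bisecting on the increasing map $k\mapsto var[X\wedge k]$ (again with a simulated loss and an assumed $\nu$, both illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 30.0
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=200_000), M)  # hypothetical loss
nu = 2.0                                                             # assumed bound

def cap_var(k):
    return np.minimum(X, k).var()    # var[X ∧ k]

# var[X ∧ k] increases in k from 0 to var[X] (which exceeds nu here),
# so the smallest k with var[X ∧ k] >= nu is found by bisection.
lo, hi = 0.0, M
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if cap_var(mid) < nu else (lo, mid)
K_U = hi
m_U = np.minimum(X, K_U).mean()
```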
Lemma 3.3.

For any $I\in\mathfrak{C}$ with $var[I(X)]\leqslant\nu$, we must have $\mathbb{E}[I(X)]\leqslant m_U$. Moreover, if $I\in\mathfrak{C}$ satisfies $var[I(X)]\leqslant\nu$ and $\mathbb{E}[I(X)]=m_U$, then $I(X)=X\wedge K_U$ almost surely.

This lemma stipulates that $m_U$ is an upper bound of the optimal expected indemnity. Moreover, any admissible contract achieving this upper bound is equivalent to the loss-capped contract $X\wedge K_U$. An immediate corollary of the lemma is $m_L\leqslant m_U$, noting that $var[(X-d_L)_+]=\nu$.

The following result identifies the case $m_L=m_U$ as a trivial one.

Proposition 3.4.

If $m_L=m_U$, then the loss $X$ must follow a Bernoulli distribution with values 0 and $d_L+K_U$. Moreover, under Assumption 3.1, the optimal contract of Problem (2.1) is

$I^*(0)=0\quad\text{and}\quad I^*(d_L+K_U)=K_U.$

In what follows, we consider the general and interesting case in which $m_L<m_U$. For $m\in(m_L,m_U)$, define

$\mathfrak{C}_m:=\left\{I\in\mathfrak{C}:var[I(X)]\leqslant\nu,\,\mathbb{E}[I(X)]=m\right\}.$

We now focus on the following optimization problem

$\max_{I\in\mathfrak{C}_m}\ \mathbb{E}[U(W_I(X))],$   (3.3)

which is a "cross section" of the original problem (2.1) in which the expected indemnity is fixed at $m$.

For $\lambda\in\mathbb{R}$ and $\beta\geqslant 0$, denote

$I_{\lambda,\beta}(x):=\sup\big\{y\in[0,x]:U'(w_0-x+y-(1+\rho)m)-\lambda-2\beta y\geqslant 0\big\},\quad x\in[0,\mathcal{M}].$   (3.4)

Actually, $I_{\lambda,\beta}$ is a contract that coinsures above a deductible or coinsures following full insurance, depending on the relative values of $\lambda$ and $U'(w_0-(1+\rho)m)$. To see this, when $\lambda\geqslant U'(w_0-(1+\rho)m)$, we have

$I_{\lambda,\beta}(x)=\begin{cases}0, & 0\leqslant x\leqslant w_0-(1+\rho)m-(U')^{-1}(\lambda),\\ f_{\lambda,\beta}(x), & w_0-(1+\rho)m-(U')^{-1}(\lambda)<x\leqslant\mathcal{M},\end{cases}$   (3.5)

and when $\lambda<U'(w_0-(1+\rho)m)$, we have

$I_{\lambda,\beta}(x)=\begin{cases}x, & 0\leqslant x\leqslant\frac{U'(w_0-(1+\rho)m)-\lambda}{2\beta},\\ f_{\lambda,\beta}(x), & \frac{U'(w_0-(1+\rho)m)-\lambda}{2\beta}<x\leqslant\mathcal{M},\end{cases}$   (3.6)

where $f_{\lambda,\beta}(x)$ satisfies the following equation in $y$: [Footnote 11: It can be shown easily that this equation has a unique solution.]

$U'(w_0-x+y-(1+\rho)m)-\lambda-2\beta y=0.$   (3.7)

Moreover, it is easy to see that $0\leqslant f_{\lambda,\beta}(x)\leqslant x$ either when $\lambda\geqslant U'(w_0-(1+\rho)m)$ and $w_0-(1+\rho)m-(U')^{-1}(\lambda)<x\leqslant\mathcal{M}$, or when $\lambda<U'(w_0-(1+\rho)m)$ and $\frac{U'(w_0-(1+\rho)m)-\lambda}{2\beta}<x\leqslant\mathcal{M}$. Furthermore,

$f'_{\lambda,\beta}(x)=\frac{-U''(w_0-x+f_{\lambda,\beta}(x)-(1+\rho)m)}{2\beta-U''(w_0-x+f_{\lambda,\beta}(x)-(1+\rho)m)}\in(0,1].$   (3.8)
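Equation (3.7) is a one-dimensional root-finding problem, since its left-hand side is strictly decreasing in $y$. A sketch with an assumed CARA utility and purely illustrative multipliers $\lambda$ and $\beta$, which also lets one verify the slope formula (3.8) numerically:

```python
import numpy as np

w0, rho, m, a = 20.0, 0.2, 2.0, 0.1    # assumed parameters; CARA utility
lam, beta = 0.005, 0.01                # assumed multipliers lambda and beta

def U_prime(z):
    return a * np.exp(-a * z)          # U'(z) for U(z) = -exp(-a z)

def U_dprime(z):
    return -a * a * np.exp(-a * z)     # U''(z)

def g(y, x):
    # Left-hand side of (3.7); strictly decreasing in y, hence a unique root.
    return U_prime(w0 - x + y - (1.0 + rho) * m) - lam - 2.0 * beta * y

def f(x):
    """Solve (3.7) for y = f_{lambda,beta}(x) by bisection on [0, x]."""
    lo, hi = 0.0, x
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid, x) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Here lambda < U'(w0 - (1+rho)m), so by (3.6) full insurance applies up to
# x0 = (U'(w0 - (1+rho)m) - lambda) / (2 beta), and f kicks in beyond x0.
x0 = (U_prime(w0 - (1.0 + rho) * m) - lam) / (2.0 * beta)
y5 = f(5.0)
```

Comparing a central difference of `f` with the closed-form slope in (3.8) confirms that the marginal indemnity lies in $(0,1]$, i.e., the contract is incentive-compatible.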

The following result indicates that there exists an optimal solution to Problem (3.3) that is of the form $I_{\lambda,\beta}(x)$ and binds both the mean and variance constraints.

Proposition 3.5.

Suppose Assumption 3.1 holds and mL<mUm_{L}<m_{U}. Then there exist λm\lambda^{*}_{m}\in{\mathbb{R}} and βm>0\beta^{*}_{m}>0 such that Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}} satisfies

𝔼[Iλm,βm(X)]=mandvar[Iλm,βm(X)]=ν,{\mathbb{E}}[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=m\qquad\text{and}\qquad var[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=\nu, (3.9)

and is an optimal solution to Problem (3.3).

Combining Lemma 3.2, Lemma 3.3 and Proposition 3.5 yields that we can always find an optimal contract in one of the following three types: a deductible one of the form IL(x)=(xdL)+I_{L}(x)=(x-d_{L})_{+}, a loss-capped one of the form IU(x)=xKUI_{U}(x)=x\wedge K_{U} and a general one of the form Iλm,βm(x)I_{\lambda^{*}_{m},\beta^{*}_{m}}(x). In other words, the optimal solutions of the following maximization problem

maxI{IL,IU,Iλm,βmform(mL,mU)}𝔼[U(WI(X))],\max_{I\in\big{\{}I_{L},\ I_{U},\ I_{\lambda^{*}_{m},\beta^{*}_{m}}\,\text{for}\,m\in(m_{L},m_{U})\big{\}}}{\mathbb{E}}[U(W_{I}(X))], (3.10)

where mL<mUm_{L}<m_{U}, also solve Problem (2.1).

Note that ILI_{L}, IUI_{U} and Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}} all satisfy the incentive-compatible condition (see (3.8)); hence, so does at least one of the optimal contracts II^{*} of (2.1). That is, I(0)=0I^{*}(0)=0 and 0I(x)10\leqslant{I^{*}}^{\prime}(x)\leqslant 1 almost everywhere.121212As will be evident in the sequel, the values of II^{\prime} on a set with zero Lebesgue measure have no impact on II. Therefore, we will often omit the phrase “almost everywhere” in statements regarding the marginal indemnity function I{I}^{\prime} throughout this paper. Therefore, it suffices to solve the following maximization problem

maxI𝒞𝔼[U(WI(X))]subject tovar[eI(X)]ν,\begin{array}[]{ll}\underset{I\in\mathcal{IC}}{\text{max}}&\ \ \ {\mathbb{E}}[U(W_{I}(X))]\\ \mbox{subject to}&\ \ \ var\left[e_{I}(X)\right]\leqslant\nu,\end{array} (3.11)

where

𝒞:={I:I(0)=0, 0I(x)1,x[0,]},\mathcal{IC}:=\left\{I:I(0)=0,\,0\leqslant I^{\prime}(x)\leqslant 1,\,\forall x\in[0,\mathcal{M}]\right\}\subsetneq\mathfrak{C}, (3.12)

to obtain an optimal contract for Problem (2.1).

Notice that 𝒞\mathcal{IC} is a convex set on which 𝔼[U(WI(X))]{\mathbb{E}}[U(W_{I}(X))] is strictly concave. Using the convexity of the variance and applying arguments similar to those in the proof of Proposition 3.1 in Chi and Wei (2020), we obtain the following proposition:

Proposition 3.6.
  • (i)

    There exist optimal solutions to Problem (3.11).

  • (ii)

    Assume either ρ>0\rho>0 or {X<ϵ}>0\mathbb{P}\left\{X<\epsilon\right\}>0 for all ϵ>0\epsilon>0. Then there exists a unique solution to Problem (3.11) in the sense that I1(X)=I2(X)I_{1}(X)=I_{2}(X) almost surely for any two solutions I1I_{1} and I2I_{2}.

Note that the assumptions in Proposition 3.6-(ii) are satisfied in most situations of practical interest, because either the insurer naturally sets a positive safety loading, or no loss (or an arbitrarily small one) occurs with positive probability, or both. On the other hand, since any optimal solution to Problem (3.11) also solves Problem (2.1), Proposition 3.6-(i) establishes the existence of optimal solutions to the latter.131313It is difficult to prove the existence of solutions to Problem (2.1) directly because, under the principle of indemnity alone, its feasible set is not compact. Moreover, the argument proving Proposition 3.1 in Chi and Wei (2020) can be used to show that Proposition 3.6-(ii) holds true for Problem (2.1) as well. Finally, now that we have the existence and uniqueness of the optimal solutions for both Problems (3.11) and (2.1), we conclude that these two problems are indeed equivalent under the assumptions of Proposition 3.6-(ii).

While the analysis of Problem (2.1) is simplified to Problem (3.10), it remains challenging to solve this problem because λm\lambda^{*}_{m} and βm\beta^{*}_{m} are implicit functions of mm. Before attacking this problem, we introduce a useful result that provides a general qualitative structure for the optimal indemnity function in Problem (3.11) or, equivalently, Problem (2.1).

Proposition 3.7.

Under Assumption 3.1, if II^{*} is a solution to Problem (3.11), then there exists β>0\beta^{*}>0 such that

I(x)={1,ΦI(x)>0,cI(x),ΦI(x)=0,0,ΦI(x)<0,{I^{*}}^{\prime}(x)=\left\{\begin{array}[]{ll}1,&\Phi_{I^{*}}(x)>0,\\ c_{I^{*}}(x),&\Phi_{I^{*}}(x)=0,\\ 0,&\Phi_{I^{*}}(x)<0,\end{array}\right. (3.13)

for some function cIc_{I^{*}} taking values in [0,1][0,1], where

ΦI(x):=𝔼[U(WI(X))2βI(X)|X>x]((1+ρ)𝔼[U(WI(X))]2β𝔼[I(X)]),x[0,)\Phi_{I}(x):={\mathbb{E}}\big{[}U^{\prime}(W_{I}(X))-2\beta^{*}I(X)|X>x\big{]}-\big{(}(1+\rho){\mathbb{E}}\big{[}U^{\prime}(W_{I}(X))\big{]}-2\beta^{*}{\mathbb{E}}[I(X)]\big{)},\;x\in[0,\mathcal{M}) (3.14)

for I𝒞I\in\mathcal{IC}.

Note that (3.13) does not entail an explicit expression of I{I^{*}}^{\prime} because its right hand side also depends on II^{*} as well as on an unknown parameter β\beta^{*}. While deriving the optimal solution II^{*} directly from (3.13) seems challenging, the equation reveals the important property that I{I^{*}}^{\prime} must take a value of either 0 or 1, except at point(s) xx where ΦI(x)=0\Phi_{I^{*}}(x)=0.141414From the control theory perspective, (3.13) corresponds to an optimal control problem in which I{I}^{\prime} is taken as the control variable. Moreover, the optimal control turns out to be of the so-called “bang-bang” type, whose values depend on the sign of the discriminant function ΦI\Phi_{I}. This type of optimal control problem arises when the Hamiltonian depends linearly on control and the control is constrained between an upper bound and a lower bound. It is usually hard to solve for optimal control when the discriminant function is complex, which is the case here. This property will in turn help us to decide whether the optimal contract is of the form (xdL)+,xKU(x-d_{L})_{+},\ x\wedge K_{U}, or Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}}.

The following theorem presents a complete solution to Problem (2.1).

Theorem 3.8.

Suppose Assumption 3.1 holds and the c.d.f. FXF_{X} is strictly increasing on (0,)(0,\mathcal{M}). We have the following conclusions:

  • (i)

    If ρ=0\rho=0, then the optimal indemnity function is II^{*}, where I(x)I^{*}(x) solves the following equation in yy for all x(0,]x\in(0,\mathcal{M}]:

    U(w0x+ym)2βyU(w0m)=0,y(0,x),U^{\prime}(w_{0}-x+y-m^{*})-2\beta^{*}y-U^{\prime}(w_{0}-m^{*})=0,\,\;y\in(0,x), (3.15)

    with the parameters m(mL,mU)m^{*}\in(m_{L},m_{U}) and β>0\beta^{*}>0 determined by

    𝔼[I(X)]=mandvar[I(X)]=ν.{\mathbb{E}}[I^{*}(X)]=m^{*}\quad\text{and}\quad var[I^{*}(X)]=\nu. (3.16)
  • (ii)

    If ρ>0\rho>0, then the optimal indemnity function is

    I(x)={0,0xd~,f(x),d~<x,I^{*}(x)=\left\{\begin{array}[]{ll}0,&0\leqslant x\leqslant\tilde{d},\\ f^{*}(x),&\tilde{d}<x\leqslant\mathcal{M},\end{array}\right. (3.17)

    where f(x)f^{*}(x) satisfies f(d~)=0f^{*}(\tilde{d})=0 and solves the following equation in yy:

    U(w0(1+ρ)mx+y)U(w0(1+ρ)md~)\displaystyle U^{\prime}(w_{0}-(1+\rho)m^{*}-x+y)-U^{\prime}(w_{0}-(1+\rho)m^{*}-\tilde{d}) (3.18)
    =ymρ(U(w0(1+ρ)md~)(1+ρ)𝔼[U(w0(1+ρ)mXd~)]),y(0,x),\displaystyle\quad=\frac{y}{m^{*}\rho}\left(U^{\prime}(w_{0}-(1+\rho)m^{*}-\tilde{d})-(1+\rho){\mathbb{E}}[U^{\prime}(w_{0}-(1+\rho)m^{*}-X\wedge\tilde{d})]\right),\,\;y\in(0,x),

    and d~(VaR11+ρ(X),)\tilde{d}\in(VaR_{\frac{1}{1+\rho}}(X),\mathcal{M}) and m(mL,mU)m^{*}\in(m_{L},m_{U}) are determined by (3.16).

Theorem 3.8 provides a complete solution to Problem (2.1). It indicates that the optimal contract can be neither a pure deductible contract of the form (xdL)+(x-d_{L})_{+} nor a pure loss-capped one of the form xKUx\wedge K_{U}; it can only be of the form Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}} in (3.5) (rather than (3.6)). The optimal policies can be computed by solving a system of three algebraic equations, so the result is semi-analytic.
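To make Theorem 3.8-(i) concrete, the system (3.15)–(3.16) can be solved by nested bisection. The sketch below is purely illustrative: it assumes an exponential utility with marginal utility U'(c) = exp(-a*c), fair pricing (rho = 0), a discrete uniform loss on {1, ..., 10}, and a binding variance bound nu = 4; none of these numbers come from the paper.

```python
import math

# Illustrative assumptions: exponential utility with U'(c) = exp(-a*c),
# fair pricing (rho = 0), discrete uniform loss X on {1, ..., 10}.
a, w0, nu = 0.1, 20.0, 4.0
xs = list(range(1, 11))
p = 1.0 / len(xs)
EX = sum(x * p for x in xs)                      # E[X] = 5.5; var[X] = 8.25 > nu

def indemnity(x, m, beta, tol=1e-11):
    # Solve (3.15) for y: the left-hand side is strictly decreasing in y,
    # positive at y = 0 and negative at y = x, so bisection applies.
    rhs0 = math.exp(-a * (w0 - m))               # U'(w0 - m)
    g = lambda y: math.exp(-a * (w0 - x + y - m)) - 2 * beta * y - rhs0
    lo, hi = 0.0, x
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def solve_m(beta, tol=1e-9):
    # First equation of (3.16): bisect on m, since E[I(X)] - m changes
    # sign between m = 0 and m = E[X].
    lo, hi = 0.0, EX
    while hi - lo > tol:
        m = 0.5 * (lo + hi)
        excess = sum(indemnity(x, m, beta) * p for x in xs) - m
        lo, hi = (m, hi) if excess > 0 else (lo, m)
    return 0.5 * (lo + hi)

def indemnity_variance(beta):
    m = solve_m(beta)
    return sum((indemnity(x, m, beta) - m) ** 2 * p for x in xs), m

# Second equation of (3.16): bisect on beta; the variance of I*(X)
# decreases from var[X] (beta -> 0) toward 0 (beta -> infinity).
lo, hi = 1e-6, 5.0
for _ in range(50):
    beta = 0.5 * (lo + hi)
    v, m = indemnity_variance(beta)
    lo, hi = (lo, beta) if v < nu else (beta, hi)
beta_star = 0.5 * (lo + hi)
v_star, m_star = indemnity_variance(beta_star)
I_star = {x: indemnity(x, m_star, beta_star) for x in xs}

# Genuine coinsurance without a deductible, as the theorem asserts for rho = 0
assert all(0.0 < I_star[x] < x for x in xs)
assert abs(v_star - nu) < 1e-6
print(f"m* = {m_star:.4f}, beta* = {beta_star:.5f}, var[I*(X)] = {v_star:.4f}")
```

On this toy example one can also observe numerically that the ratio of indemnity to loss increases with the loss, the feature stated in Corollary 3.9 (an exponential utility is prudent).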

Actuarially, Theorem 3.8 reveals how the variance bound impacts the contract. When the bound ν\nu is sufficiently low so that it is binding (hence the model does not degenerate into the classical model of Arrow (1963)), the optimal policy is always genuine coinsurance if there is no safety loading. Here, by "genuine" we mean the strict inequalities 0<I(x)<x0<I^{*}(x)<x for all x(0,]x\in(0,\mathcal{M}], namely both the insurer and the insured pay positive portions of the loss incurred. When the safety loading coefficient is positive, the optimal contract demands genuine coinsurance above a positive deductible. Thus, the variance bound replaces the full-coverage portion of Arrow's contract with coinsurance. Our contracts are qualitatively similar to those of Raviv (1979), in which a utility function takes the place of the variance bound; however, ours are quantitatively different from Raviv (1979)'s.

On the other hand, the deductible d~\tilde{d} is positive if and only if the safety loading coefficient is positive. So the existence of the deductible is completely determined by the loading coefficient in the insurance premium. This result is consistent with Mossin’s Theorem (Mossin 1968).

Corollary 3.9.

Under the assumptions of Theorem 3.8, if the insured is prudent, then the ratio of the optimal indemnity to the loss increases as the loss increases.

So, with a prudent insured, the insurer pays more not only absolutely but also relatively as the loss increases. Vajda (1962) restricts his study of a variance contracting problem to policies with this feature of the insurer covering proportionally more for larger losses. Corollary 3.9 uncovers this feature ex post in our optimal policies, provided that the insured is prudent.

4 Comparative Statics

Thanks to the semi-analytic results derived in the previous section, we are able to analyze the impacts of the insured’s initial wealth and the variance bound on the insurance demand.

We make the following assumptions for our comparative statics analysis:

Assumption 4.1.
  • (i)

    FXF_{X} is strictly increasing on (0,)(0,\mathcal{M}).

  • (ii)

    The insurance is fairly priced, i.e., ρ=0\rho=0.

Assumption 4.1-(i) is standard in the literature and accommodates most distributions used by actuaries, such as the exponential, lognormal, gamma, and Pareto distributions. Assumption 4.1-(ii) is not necessarily plausible in practice, but it is meaningful in theory, as it describes a state of competitive equilibrium in which insurers break even and insurance policies are actuarially fair for representative insureds (see e.g., Rothschild and Stiglitz, 1976; Viscusi, 1979). It is important to carry out comparative statics analyses in such a "fair" state in order to rule out any impact emanating from an unfair price. Such an assumption is indeed often imposed when conducting comparative statics in the insurance economics literature. For example, the comparative statics results of Ehrlich and Becker (1972) and Viscusi (1979) deal exclusively with actuarially fair situations. Many recent studies, such as Eeckhoudt et al. (2003), Huang and Tzeng (2006) and Teh (2017), also impose this assumption for their comparative statics analyses.

Finally, we will assume ν<var[X]\nu<var[X] throughout this section, as otherwise the variance constraint is redundant and the optimal solution is trivially full insurance.

4.1 Impact of the insured’s initial wealth

In this subsection we examine the impact of the insured’s initial wealth on insurance demand. We first recall the notion of one function up-crossing another. A function g1g_{1} is said to up-cross a function g2g_{2}, both defined on {\mathbb{R}}, if there exists z0z_{0}\in{\mathbb{R}} such that

{g1(x)g2(x),x<z0,g1(x)g2(x),xz0.\left\{\begin{array}[]{ll}g_{1}(x)\leqslant g_{2}(x),&\ x<z_{0},\\ g_{1}(x)\geqslant g_{2}(x),&\ x\geqslant z_{0}.\end{array}\right.

Moreover, g1g_{1} is said to up-cross g2g_{2} twice if there exist z0<z1z_{0}<z_{1} such that

{g1(x)g2(x),x<z0,g1(x)g2(x),z0x<z1,g1(x)g2(x),xz1.\left\{\begin{array}[]{ll}g_{1}(x)\leqslant g_{2}(x),&\ x<z_{0},\\ g_{1}(x)\geqslant g_{2}(x),&\ z_{0}\leqslant x<z_{1},\\ g_{1}(x)\leqslant g_{2}(x),&\ x\geqslant z_{1}.\end{array}\right.
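On a discretized domain, these crossing patterns can be checked mechanically. The helper below (a hypothetical utility, not part of the paper) condenses the sign pattern of g1 - g2 along a grid: the pattern [-1, 1] corresponds to a single up-crossing and [-1, 1, -1] to the two-crossing pattern just defined.

```python
def sign_pattern(g1, g2, grid):
    """Condensed sign pattern of g1 - g2 along the grid (ties skipped).

    [-1, 1] means g1 up-crosses g2 once; [-1, 1, -1] matches the
    'up-crosses twice' pattern defined above."""
    signs = []
    for x in grid:
        d = g1(x) - g2(x)
        if d != 0:
            s = 1 if d > 0 else -1
            if not signs or signs[-1] != s:
                signs.append(s)
    return signs

grid = [i / 10 for i in range(31)]                  # 0.0, 0.1, ..., 3.0
# x - 1 up-crosses the zero function once, at z0 = 1
assert sign_pattern(lambda x: x - 1, lambda x: 0.0, grid) == [-1, 1]
# -(x - 1)(x - 2) up-crosses the zero function twice, at z0 = 1 and z1 = 2
assert sign_pattern(lambda x: -(x - 1) * (x - 2), lambda x: 0.0, grid) == [-1, 1, -1]
```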

Consider two initial wealth levels w1<w2w_{1}<w_{2} and denote the corresponding optimal contracts by I1I_{1}^{*} and I2I_{2}^{*} and the associated parameters by β1\beta^{*}_{1} and β2\beta_{2}^{*}, respectively, which are determined by Theorem 3.8. Recall that ρ=0\rho=0; so the insurer’s risk exposure functions are

eIi(x)=Ii(x)𝔼[Ii(X)]=Ii(x)mi,i=1,2,e_{I_{i}^{*}}(x)=I_{i}^{*}(x)-{\mathbb{E}}[I_{i}^{*}(X)]=I_{i}^{*}(x)-m_{i}^{*},\;\;i=1,2, (4.1)

where mi:=𝔼[Ii(X)]m_{i}^{*}:={\mathbb{E}}[I_{i}^{*}(X)]. Taking expectations on (3.15) yields

U(wimi)=𝔼[U(wiX+eIi(X))]2βimi,U^{\prime}(w_{i}-m_{i}^{*})={\mathbb{E}}[U^{\prime}(w_{i}-X+e_{I_{i}^{*}}(X))]-2\beta_{i}^{*}m_{i}^{*},

which in turn implies, for i=1,2i=1,2,

U(wix+eIi(x))2βieIi(x)𝔼[U(wiX+eIi(X))]=0,\displaystyle U^{\prime}(w_{i}-x+e_{I_{i}^{*}}(x))-2\beta_{i}^{*}e_{I_{i}^{*}}(x)-{\mathbb{E}}[U^{\prime}(w_{i}-X+e_{I_{i}^{*}}(X))]=0, (4.2)
𝔼[eIi(X)]=0and𝔼[(eIi(X))2]=var[Ii(X)]=ν.\displaystyle{\mathbb{E}}[e_{I_{i}^{*}}(X)]=0\quad\text{and}\ \ \ {\mathbb{E}}[(e_{I_{i}^{*}}(X))^{2}]=var[I_{i}^{*}(X)]=\nu. (4.3)

Note that the insurer's profit with the contract IiI_{i}^{*} is 𝔼[Ii(X)]Ii(X)eIi(X){\mathbb{E}}[I_{i}^{*}(X)]-I_{i}^{*}(X)\equiv-e_{I_{i}^{*}}(X), i=1,2i=1,2. The following theorem establishes the impact of the initial wealth on the insurance contract.

Theorem 4.1.

In addition to Assumption 4.1, we assume that ν<var[X]\nu<var[X] and the insured's utility function UU exhibits strict DAP. Then, the insurer's risk exposure function with the larger initial wealth, eI2(x)e_{I_{2}^{*}}(x), up-crosses the risk exposure function with the smaller initial wealth, eI1(x)e_{I_{1}^{*}}(x), twice. Moreover, the insurer's profit, eI2(X)-e_{I_{2}^{*}}(X), has less downside risk when contracting with the wealthier insured.

Refer to caption
Figure 4.1: Comparison of two insurer’s risk exposure functions with w1<w2w_{1}<w_{2}.

Figure 4.1 illustrates graphically the first part of Theorem 4.1. The actuarial implication is that when the insured becomes wealthier, the insurer's risk exposure is lower for large or small losses and is higher for moderate losses. This can be explained intuitively as follows. Even if the insurance pricing is actuarially fair, the insureds are unable to transfer all the risk to the insurer due to the variance bound. However, the wealthier insured is more tolerant of large losses due to the DAP; hence, the insurer's risk exposure is lower for large losses when contracting with the wealthier insured. Due to the requirement that the insurer's expected risk exposure be always zero, the insurer's risk exposure with the wealthier insured must be higher for moderate losses. Now, should the insurer's risk exposure with the wealthier insured also be higher for small losses, then overall it would be strictly less spread out than that with the less wealthy insured, implying a strictly smaller variance for the former; this would contradict the fact that both risk exposures have the same variance ν\nu. Hence, the insurer's risk exposure must be lower for small losses with the wealthier insured.

The second part of the theorem, on the other hand, suggests that a variance minding insurer prefers to provide insurance to a wealthier insured due to the smaller downside risk. Such a finding may shed light on why insurers underwrite relatively more business in developed countries or, in a same country, engage more business when the economy improves.151515For example, Hofmann (2015), an industry report from the insurance company Zurich, shows that both insurance densities (premiums per capita) and insurance penetrations (premiums as a percent of GDP) of advanced economies are much higher than those of emerging economies. This report also demonstrates that insurance markets in both advanced and emerging economies experience rapid growth when the economies grow.

Corollary 4.2.

Under the assumptions of Theorem 4.1, we have the following conclusions:

  • (i)

    𝔼[I1(X)]<𝔼[I2(X)]{\mathbb{E}}[I_{1}^{*}(X)]<{\mathbb{E}}[I_{2}^{*}(X)] and β2<β1\beta_{2}^{*}<\beta_{1}^{*}.

  • (ii)

    Either I1(x)<I2(x)I^{*}_{1}(x)<I^{*}_{2}(x) x>0\forall x>0, or I1I^{*}_{1} up-crosses I2I^{*}_{2}.

Refer to caption
Figure 4.2: Comparison of two optimal indemnity functions with w1<w2w_{1}<w_{2}.

In part (ii) of this corollary, while the case I1(x)<I2(x)I^{*}_{1}(x)<I^{*}_{2}(x) x>0\forall x>0 is a special case of I1I^{*}_{1} up-crossing I2I^{*}_{2}, we state it separately to highlight its possibility. In Arrow's (1963) classical model, full insurance is optimal when insurance pricing is actuarially fair, a conclusion that is independent of the insured's wealth. Zhou et al. (2010) show that this conclusion remains intact when an exogenous upper limit is imposed on the insurer's risk exposure. Our result shows that adding a variance bound fundamentally changes the insurance demand – it makes the insured's wealth level relevant, and it changes the way in which the two parties share the risk. Specifically, Corollary 4.2 suggests that a wealthier insured with DAP would either demand more coverage across the board or retain more of the larger losses and cede more of the smaller ones (see Figure 4.2). Either way, the expected coverage is always larger for the wealthier insured. Recall that insurance is called a normal (inferior) good if wealthier people purchase more (less) insurance coverage; see Mossin (1968), Schlesinger (1981) and Gollier (2001). Millo (2016) argues that nonlife insurance is a normal good by empirically testing whether income elasticity is significantly greater than one. Armantier et al. (2018) use micro-level survey data on households' insurance coverage to conclude that insurance is a normal good, thereby providing a better understanding of the relationship between insurance demand and economic development. These studies, however, are purely empirical. 
To the best of our knowledge, ours is the first theoretical result regarding insurance as a normal good under the insurer’s variance constraint, confirming these empirical findings.161616Mossin (1968), Schlesinger (1981) and Gollier (2001) show that a wealthier insured with a DARA preference will cede less risk under unfair insurance pricing; hence, insurance is an inferior good in the corresponding economy. Their results degenerate into full insurance when the pricing is fair, and thus insurance demand is independent of the insured’s wealth. Consequently, our results do not contradict theirs.

4.2 Impact of the variance bound

In this subsection we keep the insured’s initial wealth unchanged and analyze the impact of the variance bound on her demand for insurance. Consider two variance bounds with 0<ν1<ν2<var[X]0<\nu_{1}<\nu_{2}<var[X] and denote the corresponding optimal indemnity functions by I1I_{1}^{*} and I2I_{2}^{*} and the parameters by β1\beta^{*}_{1} and β2\beta_{2}^{*}, respectively. Thus, the insurer’s risk exposures, eIi(x)=Ii(x)𝔼[Ii(X)]e_{I_{i}^{*}}(x)=I_{i}^{*}(x)-{\mathbb{E}}[I_{i}^{*}(X)], i=1,2,i=1,2, satisfy

U(w0x+eIi(x))2βieIi(x)𝔼[U(w0X+eIi(X))]=0\displaystyle U^{\prime}(w_{0}-x+e_{I_{i}^{*}}(x))-2\beta_{i}^{*}e_{I_{i}^{*}}(x)-{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{i}^{*}}(X))]=0 (4.4)

and

𝔼[eIi(X)]=0and𝔼[(eIi(X))2]=var[eIi(X)]=νi.\displaystyle{\mathbb{E}}[e_{I_{i}^{*}}(X)]=0\quad\text{and}\ \ \ {\mathbb{E}}[(e_{I_{i}^{*}}(X))^{2}]=var[e_{I_{i}^{*}}(X)]=\nu_{i}. (4.5)

The following theorem illustrates how the insurer’s risk exposure responds to the change in the variance bound.

Theorem 4.3.

Under Assumption 4.1, the insurer’s risk exposure function with the larger variance bound, eI2e_{I_{2}^{*}}, up-crosses that with the smaller variance bound, eI1e_{I_{1}^{*}}.

Under fair insurance pricing, this theorem indicates that, as the variance bound decreases, the insurer is exposed to less risk for a larger XX and to more risk for a smaller XX. This result has a rather significant implication in terms of the insurer's tail risk management. A variance constraint by its very definition does not control the tail risk directly. However, Theorem 4.3 suggests that the insurer can reduce the risk exposure for larger losses simply by tightening the variance constraint.171717It follows from Lemma A.3 that the insurer with a more relaxed variance constraint suffers more underwriting risk in the sense of convex order, i.e., eI1(X)cxeI2(X)e_{I^{*}_{1}}(X)\leqslant_{cx}e_{I^{*}_{2}}(X). This further justifies our formulation of the variance contracting model.

Corollary 4.4.

Under the assumption of Theorem 4.3, for any 0<ν1<ν2<var[X]0<\nu_{1}<\nu_{2}<var[X], we have the following conclusions:

  • (i)

    𝔼[I1(X)]<𝔼[I2(X)]{\mathbb{E}}[I^{*}_{1}(X)]<{\mathbb{E}}[I^{*}_{2}(X)] and β2<β1\beta_{2}^{*}<\beta_{1}^{*};

  • (ii)

    If the insured’s utility function satisfies U′′′0U^{\prime\prime\prime}\geqslant 0, then I1(x)<I2(x)I_{1}^{*}(x)<I_{2}^{*}(x) x>0.\forall x>0.

Refer to caption
Figure 4.3: Comparison of two optimal indemnity functions with ν1<ν2\nu_{1}<\nu_{2}.

Corollary 4.4-(i) can be easily interpreted: an insurer with a tighter variance bound offers less expected coverage. As a complement to Theorem 4.3, Corollary 4.4-(ii) establishes a direct characterization of the insured’s optimal risk transfer with regard to the change in the variance bound: A prudent insured consistently cedes more losses when the variance bound increases (see Figure 4.3). In other words, if the insurance contract is priced fairly, the insured will transfer as much risk to the insurer as the latter’s risk tolerance allows.

5 Concluding Remarks

In this paper, we have revisited the classical model of Arrow (1963) by adding a variance limit on the insurer's risk exposure. This constraint is motivated by the insurer's desire to manage underwriting risk; at the same time, it poses considerable technical challenges for solving the problem. We have developed an approach to derive optimal contracts semi-analytically, in the form of coinsurance above a deductible when the variance constraint is active. The resulting policies automatically satisfy the incentive-compatible condition, thereby eliminating potential ex post moral hazard. We have also conducted comparative statics to examine the impacts of the insured's wealth and of the variance bound on insurance demand.

This work can be extended in a couple of directions. First, we have restricted the comparative statics analysis to actuarially fair insurance. Analyzing the general unfair case calls for a different approach than the one presented here. Second, a model incorporating probability distortion (weighting) is of significant interest, both theoretically and practically. This is because probability distortion, a phenomenon well documented in psychology and behavioral economics, is related to tail events, about which both insurers and insureds have great concerns.



Appendices

Appendix A Stochastic Orders

Since the notion of stochastic orders plays an important role in this paper, we present, in this appendix, some useful results in this regard.

A random variable YY is said to be greater than a random variable ZZ in the sense of stop-loss order, denoted as ZslYZ\leqslant_{sl}Y, if

𝔼[(Zd)+]𝔼[(Yd)+]d,{\mathbb{E}}[(Z-d)_{+}]\leqslant{\mathbb{E}}[(Y-d)_{+}]\;\;\forall d\in\mathbb{R},

provided that the expectations exist. It follows readily that YY is greater than ZZ in convex order (i.e., ZcxYZ\leqslant_{cx}Y), if 𝔼[Y]=𝔼[Z]{\mathbb{E}}[Y]={\mathbb{E}}[Z] and ZslYZ\leqslant_{sl}Y.

A useful way to verify the stop-loss order is the well–known Karlin–Novikoff criterion (Karlin and Novikoff 1963).

Lemma A.1.

Suppose 𝔼[Z]𝔼[Y]<{\mathbb{E}}[Z]\leqslant{\mathbb{E}}[Y]<\infty. If FZF_{Z} up-crosses FYF_{Y}, then ZslYZ\leqslant_{sl}Y.

If ZslYZ\leqslant_{sl}Y, then 𝔼[g(Z)]𝔼[g(Y)]{\mathbb{E}}[g(Z)]\leqslant{\mathbb{E}}[g(Y)] holds for all increasing convex functions gg, provided that the expectations exist. Based on the Karlin–Novikoff criterion, Gollier and Schlesinger (1996) obtain the following lemma.

Lemma A.2.

For any hh\in\mathfrak{C}, we have Xdcxh(X)X\wedge d\leqslant_{cx}h(X), where d[0,]d\in[0,\mathcal{M}] satisfies 𝔼[Xd]=𝔼[h(X)]{\mathbb{E}}[X\wedge d]={\mathbb{E}}[h(X)].
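Lemma A.2 can be verified numerically in a simple case. The sketch below assumes a discrete uniform loss and the proportional indemnity h(x) = x/2 (which satisfies the principle of indemnity); it first solves for the cap d that matches the means and then checks the stop-loss dominance that, together with equal means, characterizes the convex order. All numbers are illustrative.

```python
# Illustrative assumptions: discrete uniform loss on {1, ..., 10} and the
# proportional indemnity h(x) = x/2, which satisfies 0 <= h(x) <= x.
xs = list(range(1, 11))
p = 1.0 / len(xs)
h = lambda x: 0.5 * x

def E(f):                                    # expectation under the uniform law
    return sum(f(x) * p for x in xs)

mean_h = E(h)                                # 2.75

# Solve E[X ^ d] = E[h(X)] for the cap d by bisection.
lo, hi = 0.0, 10.0
for _ in range(80):
    d = 0.5 * (lo + hi)
    lo, hi = (d, hi) if E(lambda x: min(x, d)) < mean_h else (lo, d)

# Stop-loss dominance: E[(X ^ d - t)+] <= E[(h(X) - t)+] for every t, which
# together with the equal means gives X ^ d <=_cx h(X).
for t in (0.1 * k for k in range(101)):
    assert E(lambda x: max(min(x, d) - t, 0.0)) <= E(lambda x: max(h(x) - t, 0.0)) + 1e-9
print(f"d = {d:.4f}")
```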

The following result with respect to convex order is from Lemma 3 of Ohlin (1969).

Lemma A.3.

Let YY be a random variable and hi,i=1,2,h_{i},\;i=1,2, be two increasing functions with 𝔼[h1(Y)]=𝔼[h2(Y)]{\mathbb{E}}[h_{1}(Y)]={\mathbb{E}}[h_{2}(Y)]. If h1h_{1} up-crosses h2h_{2}, then h2(Y)cxh1(Y)h_{2}(Y)\leqslant_{cx}h_{1}(Y).
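Ohlin's lemma can likewise be illustrated numerically. Under an assumed discrete uniform law, a deductible-type transform up-crosses a proportional transform with the same mean, so the latter precedes the former in convex order; testing the convex function g(z) = z² then confirms the ordering of second moments. The setup is illustrative only.

```python
# Illustrative check of Lemma A.3 (Ohlin): Y uniform on {1, ..., 10}.
ys_vals = list(range(1, 11))
q = 1.0 / len(ys_vals)
d = 43.0 / 14.0                          # chosen so E[(Y - d)+] = E[Y/2] = 2.75
h1 = lambda y: max(y - d, 0.0)           # deductible-type transform
h2 = lambda y: 0.5 * y                   # proportional transform
E = lambda f: sum(f(y) * q for y in ys_vals)

assert abs(E(h1) - E(h2)) < 1e-12        # equal means, as the lemma requires
# h1 up-crosses h2: the sign of h1 - h2 switches from - to + exactly once
signs = [h1(y) - h2(y) >= 0 for y in ys_vals]
assert signs == sorted(signs)            # False, ..., False, True, ..., True
# Conclusion of the lemma: h2(Y) <=_cx h1(Y); test it on the convex
# function g(z) = z**2 (so var[h2(Y)] <= var[h1(Y)] as well)
assert E(lambda y: h2(y) ** 2) <= E(lambda y: h1(y) ** 2)
print(E(lambda y: h1(y) ** 2), E(lambda y: h2(y) ** 2))
```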

Appendix B Other Useful Lemmas

This appendix presents some other technical results that are useful in connection with this paper.

It is easy to verify that any sequence of indemnity functions in 𝒞\mathcal{IC} is uniformly bounded and equicontinuous over [0,][0,\mathcal{M}]. Hence, the Arzelà–Ascoli theorem implies

Lemma B.1.

The set 𝒞\mathcal{IC} is compact under the metric d(I1,I2)=maxt[0,]|I1(t)I2(t)|d(I_{1},I_{2})=\max_{t\in[0,\mathcal{M}]}|I_{1}(t)-I_{2}(t)|, I1,I2𝒞I_{1},I_{2}\in\mathcal{IC}.

For the following lemma, one can refer to Komiya (1988) for a proof.

Lemma B.2.

(Sion’s Minimax Theorem)  Let Y be a compact convex subset of a linear topological space and Z a convex subset of a linear topological space. If Γ\Gamma is a real-valued function on Y×ZY\times Z such that Γ(y,)\Gamma(y,\cdot) is continuous and concave on ZZ for any yYy\in Y and Γ(,z)\Gamma(\cdot,z) is continuous and convex on YY for any zZz\in Z, then

minyYmaxzZΓ(y,z)=maxzZminyYΓ(y,z).\min\limits_{{y\in Y}}\ \max\limits_{{z\in Z}}\Gamma(y,z)=\max\limits_{{z\in Z}}\ \min\limits_{{y\in Y}}\Gamma(y,z).

The following lemmas are needed in the comparative analysis.

Lemma B.3.

If a non-negative increasing function h1h_{1} up-crosses a non-negative increasing function h2h_{2} with 𝔼[(h1(X))2]=𝔼[(h2(X))2]{\mathbb{E}}[(h_{1}(X))^{2}]={\mathbb{E}}[(h_{2}(X))^{2}], then either h1(X)h_{1}(X) and h2(X)h_{2}(X) have the same distribution or 𝔼[h1(X)]<𝔼[h2(X)]{\mathbb{E}}[h_{1}(X)]<{\mathbb{E}}[h_{2}(X)].

Proof.

If 𝔼[h2(X)]𝔼[h1(X)]{\mathbb{E}}[h_{2}(X)]\leqslant{\mathbb{E}}[h_{1}(X)], then Lemma A.1 implies h2(X)slh1(X)h_{2}(X)\leqslant_{sl}h_{1}(X). Moreover, we have

𝔼[(hi(X))2]=20𝔼[(hi(X)t)+]dt.{\mathbb{E}}[(h_{i}(X))^{2}]=2\int_{0}^{\infty}{\mathbb{E}}[(h_{i}(X)-t)_{+}]{\mathrm{d}}t.

Since h2(X)slh1(X)h_{2}(X)\leqslant_{sl}h_{1}(X) gives 𝔼[(h2(X)t)+]𝔼[(h1(X)t)+]{\mathbb{E}}[(h_{2}(X)-t)_{+}]\leqslant{\mathbb{E}}[(h_{1}(X)-t)_{+}] for every t0t\geqslant 0, while the assumption 𝔼[(h1(X))2]=𝔼[(h2(X))2]{\mathbb{E}}[(h_{1}(X))^{2}]={\mathbb{E}}[(h_{2}(X))^{2}] forces the two integrals above to coincide, the continuity of the stop-loss transforms in tt yields 𝔼[(h1(X)t)+]=𝔼[(h2(X)t)+]{\mathbb{E}}[(h_{1}(X)-t)_{+}]={\mathbb{E}}[(h_{2}(X)-t)_{+}] for any t0t\geqslant 0. It then follows from the equation 𝔼[(hi(X)t)+]=t(1Fhi(X)(y))dy{\mathbb{E}}[(h_{i}(X)-t)_{+}]=\int_{t}^{\infty}(1-F_{h_{i}(X)}(y)){\mathrm{d}}y that h1(X)h_{1}(X) and h2(X)h_{2}(X) have the same distribution. ∎

Lemma B.4.

Under the assumptions of Theorem 4.1, it is impossible that eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) have the same distribution, and it is also impossible that either eI1e_{I_{1}^{*}} up-crosses eI2e_{I_{2}^{*}} or eI2e_{I_{2}^{*}} up-crosses eI1e_{I_{1}^{*}}, where eIie_{I_{i}^{*}} is given in (4.1).

Proof.

First of all, we show that eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) cannot have the same distribution. Define

ϕ(z):=U′′(w1z)U′′(w2z),  0z<w1<w2.\displaystyle\phi(z):=\frac{-U^{\prime\prime}(w_{1}-z)}{-U^{\prime\prime}(w_{2}-z)},\,\ 0\leqslant z<w_{1}<w_{2}. (B.1)

A direct calculation based on the assumption of strict DAP shows that

ϕ(z)=(𝒫(w1z)𝒫(w2z))ϕ(z)>0,\phi^{\prime}(z)=(\mathcal{P}(w_{1}-z)-\mathcal{P}(w_{2}-z))\phi(z)>0,

where 𝒫\mathcal{P} is defined in (2.2). As a result, ϕ\phi is a strictly increasing function.

We now prove the result by contradiction. Assume that eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) are equal in distribution. Noting that eIie_{I_{i}^{*}} is increasing and Lipschitz-continuous and that FX(x)F_{X}(x) is strictly increasing, we have

eI1(x)=eI2(x),x[0,],e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x),\,\forall x\in[0,\mathcal{M}],

which in turn implies eI1(x)=eI2(x)e_{I_{1}^{*}}^{\prime}(x)=e_{I_{2}^{*}}^{\prime}(x). It follows from (4.2) that eIi(x)=11+2βi/(U′′(wix+eIi(x)))e_{I_{i}^{*}}^{\prime}(x)=\frac{1}{1+2\beta_{i}^{*}/(-U^{\prime\prime}(w_{i}-x+e_{I_{i}^{*}}(x)))} for all x[0,]x\in[0,\mathcal{M}]; hence,

2β1U′′(w1x+eI1(x))=2β2U′′(w2x+eI2(x)).\frac{2\beta_{1}^{*}}{-U^{\prime\prime}(w_{1}-x+e_{I_{1}^{*}}(x))}=\frac{2\beta_{2}^{*}}{-U^{\prime\prime}(w_{2}-x+e_{I_{2}^{*}}(x))}.

Because eI1(x)=eI2(x)e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x) for all x[0,]x\in[0,\mathcal{M}], we obtain

ϕ(xeI1(x))=U′′(w1x+eI1(x))U′′(w2x+eI1(x))=β1β2,x[0,],\phi(x-e_{I_{1}^{*}}(x))=\frac{-U^{\prime\prime}(w_{1}-x+e_{I_{1}^{*}}(x))}{-U^{\prime\prime}(w_{2}-x+e_{I_{1}^{*}}(x))}=\frac{\beta_{1}^{*}}{\beta_{2}^{*}},\,\forall x\in[0,\mathcal{M}],

which contradicts the fact that xeI1(x)xI1(x)+𝔼[I1(X)]x-e_{I_{1}^{*}}(x)\equiv x-I_{1}^{*}(x)+{\mathbb{E}}[I_{1}^{*}(X)] is strictly increasing in x[0,]x\in[0,\mathcal{M}] and that ϕ\phi is a strictly increasing function.

Next we show that it is impossible that eI1e_{I_{1}^{*}} up-crosses eI2e_{I_{2}^{*}}. Again we prove the result by contradiction. Since eI1e_{I_{1}^{*}} is not always non-negative, we introduce the increasing function

f~i(x):=eIi(x)+m~,i=1,2,\tilde{f}_{i}^{*}(x):=e_{I_{i}^{*}}(x)+\widetilde{m},\;\;i=1,2,

where m~:=max{𝔼[I1(X)],𝔼[I2(X)]}\widetilde{m}:=\max\left\{{\mathbb{E}}[I_{1}^{*}(X)],{\mathbb{E}}[I_{2}^{*}(X)]\right\}. It then follows that f~1\tilde{f}_{1}^{*} up-crosses f~2\tilde{f}_{2}^{*}, and

f~i(x)0,𝔼[f~i(X)]=m~,𝔼[(f~i(X))2]=𝔼[(eIi(X))2]+m~2=ν+m~2,i=1,2,\tilde{f}_{i}^{*}(x)\geqslant 0,\quad{\mathbb{E}}[\tilde{f}_{i}^{*}(X)]=\widetilde{m},\quad{\mathbb{E}}[(\tilde{f}_{i}^{*}(X))^{2}]={\mathbb{E}}[(e_{I_{i}^{*}}(X))^{2}]+\widetilde{m}^{2}=\nu+\widetilde{m}^{2},\;\;i=1,2,

where the second equality follows from the fact that 𝔼[eIi(X)]=0{\mathbb{E}}[e_{I_{i}^{*}}(X)]=0. Since 𝔼[f~1(X)]=𝔼[f~2(X)]=m~{\mathbb{E}}[\tilde{f}_{1}^{*}(X)]={\mathbb{E}}[\tilde{f}_{2}^{*}(X)]=\widetilde{m}, the second alternative in Lemma B.3 is ruled out, and hence f~1(X)\tilde{f}_{1}^{*}(X) and f~2(X)\tilde{f}_{2}^{*}(X) have the same distribution; so do eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X). As shown above, eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) cannot have the same distribution, and therefore eI1e_{I_{1}^{*}} cannot up-cross eI2e_{I_{2}^{*}}. A similar analysis shows that it is impossible that eI2e_{I_{2}^{*}} up-crosses eI1e_{I_{1}^{*}}. The proof is thus complete. ∎

Appendix C Proofs

Proof of Lemma 3.2:

For any II\in\mathfrak{C} with 𝔼[I(X)]mL{\mathbb{E}}[I(X)]\leqslant m_{L}, it follows from Lemma A.2 that

XdIcxRI(X),X\wedge d_{I}\leqslant_{cx}R_{I}(X),

where dIdLd_{I}\geqslant d_{L} is determined by 𝔼[(XdI)+]=𝔼[I(X)]{\mathbb{E}}[(X-d_{I})_{+}]={\mathbb{E}}[I(X)] (or, equivalently, 𝔼[XdI]=𝔼[RI(X)]{\mathbb{E}}[X\wedge d_{I}]={\mathbb{E}}[R_{I}(X)]). Thus, we have 𝔼[U(WI(X))]𝔼[U(W(xdI)+(X))]{\mathbb{E}}[U(W_{I}(X))]\leqslant{\mathbb{E}}[U(W_{(x-d_{I})_{+}}(X))]. Furthermore, according to the proof of Theorem 4.2 in Chi (2019), 𝔼[U(W(xd)+(X))]{\mathbb{E}}[U(W_{(x-d)_{+}}(X))] is a decreasing function of dd over [d,)[d^{*},\mathcal{M}), where dd^{*} is defined in (3.2). Recalling that dLdd_{L}\geqslant d^{*}, we conclude that II is no better than (xdL)+(x-d_{L})_{+}.

Proof of Lemma 3.3:

We prove by contradiction. Assume 𝔼[I(X)]>mU{\mathbb{E}}[I(X)]>m_{U} for some indemnity function II\in\mathfrak{C} satisfying var[I(X)]νvar[I(X)]\leqslant\nu. Lemma A.2 implies that there exists K>KUK>K_{U} such that

XKcxI(X).X\wedge K\leqslant_{cx}I(X).

Noting that var[XK]var[X\wedge K] is strictly increasing and continuous in K[KU,]K\in[K_{U},\mathcal{M}], we obtain

ν=var[XKU]<var[XK]var[I(X)]ν,\nu=var[X\wedge K_{U}]<var[X\wedge K]\leqslant var[I(X)]\leqslant\nu,

leading to a contradiction.

For any $I\in\mathfrak{C}$ satisfying $var[I(X)]\leqslant\nu$ and $\mathbb{E}[I(X)]=m_{U}$, it follows from Lemma A.2 that

\[ X\wedge K_{U}\leqslant_{cx}I(X), \]

which in turn implies

\[ \nu=var[X\wedge K_{U}]\leqslant var[I(X)]\leqslant\nu. \]

Because

\[ var[Z]=2\int_{-\infty}^{\infty}\big\{\mathbb{E}[(Z-t)_{+}]-(\mathbb{E}[Z]-t)_{+}\big\}\,\mathrm{d}t \]

for any random variable $Z$ with a finite second moment, we deduce $\mathbb{E}[(X\wedge K_{U}-t)_{+}]=\mathbb{E}[(I(X)-t)_{+}]$ for almost every $t$. Therefore, $X\wedge K_{U}$ and $I(X)$ are equally distributed, which implies $\mathbb{P}(I(X)\leqslant K_{U})=1$. Moreover, it follows from $I\in\mathfrak{C}$ that $\mathbb{P}(I(X)\leqslant X\wedge K_{U})=1$. Since $\mathbb{E}[I(X)]=\mathbb{E}[X\wedge K_{U}]$, we obtain that $I(X)=X\wedge K_{U}$ almost surely. The proof is thus complete.
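The integral representation of the variance used in the last step of the proof can be checked numerically. The sketch below, with an exponential loss and parameters chosen only for illustration, compares the two sides via Monte Carlo and trapezoidal quadrature.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.exponential(scale=2.0, size=100_000)   # sample of Z

# Right-hand side: 2 * integral over t of E[(Z-t)_+] - (E[Z]-t)_+
ts = np.linspace(-5.0, 40.0, 800)
integrand = np.array(
    [np.mean(np.maximum(z - t, 0.0)) - max(z.mean() - t, 0.0) for t in ts]
)
rhs = 2.0 * float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(ts)))

lhs = z.var()
assert abs(lhs - rhs) < 0.1   # agreement up to quadrature/truncation error
```

The identity holds exactly for the empirical distribution of the sample, so the residual is only quadrature and tail-truncation error.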

Proof of Proposition 3.4:

Because $m_{L}=m_{U}$, it follows from Lemma 3.3 that $X\wedge K_{U}=(X-d_{L})_{+}$ almost surely. Since $d_{L}>0$, this forces $X$ to follow a two-point (Bernoulli-type) distribution with values $0$ and $d_{L}+K_{U}$. Furthermore, Lemmas 3.2 and 3.3 imply that any admissible insurance policy is no better than $(x-d_{L})_{+}$. Therefore, the optimal indemnity must equal $K_{U}$ at the point $d_{L}+K_{U}$.

Proof of Proposition 3.5:

Introducing two Lagrange multipliers $\lambda\in\mathbb{R}$ and $\beta\geqslant 0$, we consider the following maximization problem:

\[ \max_{I\in\mathfrak{C}}U_{\lambda,\beta}(I):=\mathbb{E}\Big[U(w_{0}-X+I(X)-(1+\rho)m)-\lambda(\mathbb{E}[I(X)]-m)-\beta(\mathbb{E}[I^{2}(X)]-\nu-m^{2})\Big]. \tag{C.1} \]

Fix $x\geqslant 0$. The above objective function motivates the introduction of the function

\[ \Psi(y):=U(w_{0}-x+y-(1+\rho)m)-\lambda y-\beta y^{2},\quad 0\leqslant y\leqslant x. \]

The assumption $U''<0$ implies that $\Psi$ is strictly concave with

\[ \Psi'(y)=U'(w_{0}-x+y-(1+\rho)m)-\lambda-2\beta y. \]

As a consequence, for each $x\geqslant 0$, $I_{\lambda,\beta}(x)$ defined in (3.4) is an optimal solution to

\[ \max_{0\leqslant y\leqslant x}\Psi(y). \]

This result, together with the fact that $I_{\lambda,\beta}\in\mathfrak{C}$, implies that $I_{\lambda,\beta}$ solves Problem (C.1).
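Because $\Psi$ is strictly concave, the maximizer over $[0,x]$ is just the unconstrained stationary point clipped to the feasible interval, which is how $I_{\lambda,\beta}$ can be evaluated pointwise. The sketch below illustrates this with an exponential utility $U(w)=-e^{-aw}$ and hypothetical parameter values (none taken from the paper), cross-checking the clipped root of $\Psi'$ against a brute-force grid search.

```python
import numpy as np

# Hypothetical parameters, for illustration only
a, w0, rho, m, lam, beta = 0.5, 10.0, 0.2, 1.0, 0.01, 0.3

def U_prime(w):                      # U(w) = -exp(-a*w), so U'(w) = a*exp(-a*w)
    return a * np.exp(-a * w)

def indemnity(x):
    """Maximize the strictly concave Psi(y) over [0, x]: locate the root of
    Psi'(y) by bisection (Psi' is strictly decreasing) and clip it to [0, x]."""
    lo, hi = -50.0, 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if U_prime(w0 - x + mid - (1 + rho) * m) - lam - 2 * beta * mid > 0:
            lo = mid
        else:
            hi = mid
    return min(max(0.5 * (lo + hi), 0.0), x)

# cross-check against brute-force maximization of Psi at one loss level
x = 4.0
ys = np.linspace(0.0, x, 100_001)
psi = -np.exp(-a * (w0 - x + ys - (1 + rho) * m)) - lam * ys - beta * ys ** 2
assert abs(indemnity(x) - ys[psi.argmax()]) < 1e-3
```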

Notably, if there exist $\lambda^{*}_{m}\in\mathbb{R}$ and $\beta^{*}_{m}>0$ such that

\[ \mathbb{E}[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=m\qquad\text{and}\qquad var[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=\nu, \tag{C.2} \]

then $I_{\lambda^{*}_{m},\beta^{*}_{m}}$ solves Problem (3.3). We prove this by contradiction. Indeed, if there exists $I_{*}\in\mathfrak{C}_{m}$ such that $\mathbb{E}[U(W_{I_{*}}(X))]>\mathbb{E}[U(W_{I_{\lambda^{*}_{m},\beta^{*}_{m}}}(X))]$, then

\[ U_{\lambda^{*}_{m},\beta^{*}_{m}}(I_{*})>U_{\lambda^{*}_{m},\beta^{*}_{m}}(I_{\lambda^{*}_{m},\beta^{*}_{m}}), \]

which contradicts the fact that $I_{\lambda^{*}_{m},\beta^{*}_{m}}$ solves Problem (C.1) with $\lambda=\lambda^{*}_{m}$ and $\beta=\beta^{*}_{m}$. Therefore, we only need to show the existence of $\lambda^{*}_{m}$ and $\beta^{*}_{m}$.

It follows from (3.5)-(3.8) that $I_{\lambda,\beta}$ satisfies the incentive-compatible condition, i.e., $I_{\lambda,\beta}\in\mathcal{IC}$. This observation motivates us to consider the auxiliary problem

\[ \max_{I\in\mathcal{IC}_{m}}\ \mathbb{E}[U(W_{I}(X))], \tag{C.3} \]

where

\[ \mathcal{IC}_{m}:=\left\{I\in\mathcal{IC}:var[I(X)]\leqslant\nu,\ \mathbb{E}[I(X)]=m\right\}. \]

Problem (C.3) differs from Problem (3.3) in that the feasible set is $\mathcal{IC}_{m}$ instead of $\mathfrak{C}_{m}$. In what follows, we show that $\mathcal{IC}_{m}\neq\emptyset$ and that there exists a unique optimal solution $I^{*}_{m}$ to Problem (C.3) satisfying $var[I^{*}_{m}(X)]=\nu$. Indeed, for any $m\in(m_{L},m_{U})$, let $\theta\in(0,1)$ be such that $m=\theta m_{U}+(1-\theta)m_{L}$, and denote

\[ I_{\theta}(x):=\theta(x\wedge K_{U})+(1-\theta)(x-d_{L})_{+}. \]

It follows that $\mathbb{E}[I_{\theta}(X)]=m$ and

\[ \sqrt{var[I_{\theta}(X)]}\leqslant\theta\sqrt{var[X\wedge K_{U}]}+(1-\theta)\sqrt{var[(X-d_{L})_{+}]}=\sqrt{\nu}. \]

As a consequence, $I_{\theta}\in\mathcal{IC}_{m}$, and hence $\mathcal{IC}_{m}$ is nonempty. Moreover, note that there must exist $d_{m}\in[0,d_{L})$ such that $\mathbb{E}[(X-d_{m})_{+}]=m$. For any $I\in\mathcal{IC}_{m}$, define

\[ \tilde{I}_{\alpha}(x):=\alpha I(x)+(1-\alpha)(x-d_{m})_{+},\quad\alpha\in[0,1]. \]

Then we have $\mathbb{E}[\tilde{I}_{\alpha}(X)]=m$ and

\[ var[\tilde{I}_{1}(X)]=var[I(X)]\leqslant\nu=var[(X-d_{L})_{+}]\leqslant var[(X-d_{m})_{+}]=var[\tilde{I}_{0}(X)], \]

where the last inequality follows from the fact that $var[(X-d)_{+}]$ is decreasing in $d$. Because $var[\tilde{I}_{\alpha}(X)]$ is continuous in $\alpha$, there must exist $\alpha^{*}\in[0,1]$ such that $var[\tilde{I}_{\alpha^{*}}(X)]=\nu$. Lemma A.2 yields that

\[ X\wedge d_{m}\leqslant_{cx}X-I(X), \]

leading to

\[ \mathbb{E}[U(W_{\tilde{I}_{\alpha^{*}}}(X))]\geqslant\alpha^{*}\mathbb{E}[U(W_{I}(X))]+(1-\alpha^{*})\mathbb{E}[U(W_{(x-d_{m})_{+}}(X))]\geqslant\mathbb{E}[U(W_{I}(X))]. \]

This means that $I$ is no better than $\tilde{I}_{\alpha^{*}}$. Together with the Arzelà-Ascoli theorem, the above analysis implies that there exists an optimal solution to Problem (C.3) that binds the variance constraint. Finally, an argument similar to the proof of Proposition 3.1 in Chi and Wei (2020) shows that the optimal solution to Problem (C.3) must be unique.
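The square-root variance bound on the mixture $I_{\theta}$ used above is just the $L^{2}$ triangle (Minkowski) inequality. A quick numerical sanity check, with an exponential loss and two indemnity shapes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=100_000)   # illustrative loss sample
theta = 0.4

a = np.minimum(x, 2.0)          # capped coverage x ^ K
b = np.maximum(x - 0.5, 0.0)    # stop-loss coverage (x - d)_+
mix = theta * a + (1 - theta) * b

# Minkowski: sd of a convex combination <= convex combination of sds
lhs = np.sqrt(mix.var())
rhs = theta * np.sqrt(a.var()) + (1 - theta) * np.sqrt(b.var())
assert lhs <= rhs + 1e-12
```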

Define $U^{*}(\lambda,\beta):=U_{\lambda,\beta}(I_{\lambda,\beta})$. For any $\alpha\in[0,1]$ and any pairs $(\lambda_{1},\beta_{1})$ and $(\lambda_{2},\beta_{2})$, we have

\[
\begin{aligned}
U^{*}\big(\alpha\lambda_{1}+(1-\alpha)\lambda_{2},\,\alpha\beta_{1}+(1-\alpha)\beta_{2}\big)
&=\max_{I\in\mathcal{IC}}U_{\alpha\lambda_{1}+(1-\alpha)\lambda_{2},\,\alpha\beta_{1}+(1-\alpha)\beta_{2}}(I)\\
&=\max_{I\in\mathcal{IC}}\big\{\alpha U_{\lambda_{1},\beta_{1}}(I)+(1-\alpha)U_{\lambda_{2},\beta_{2}}(I)\big\}\\
&\leqslant\max_{I\in\mathcal{IC}}\big\{\alpha U_{\lambda_{1},\beta_{1}}(I)\big\}+\max_{I\in\mathcal{IC}}\big\{(1-\alpha)U_{\lambda_{2},\beta_{2}}(I)\big\}\\
&=\alpha\max_{I\in\mathcal{IC}}\big\{U_{\lambda_{1},\beta_{1}}(I)\big\}+(1-\alpha)\max_{I\in\mathcal{IC}}\big\{U_{\lambda_{2},\beta_{2}}(I)\big\}\\
&=\alpha U^{*}(\lambda_{1},\beta_{1})+(1-\alpha)U^{*}(\lambda_{2},\beta_{2}),
\end{aligned}
\]

where the second equality is due to the fact that $U_{\lambda,\beta}(I)$ is linear in $(\lambda,\beta)$ for any given $I\in\mathcal{IC}$. Thus, $U^{*}(\lambda,\beta)$ is convex in $(\lambda,\beta)$.

Furthermore, denoting by $U^{**}$ the maximal EU value of the insured's final wealth in Problem (C.3), we have $U^{**}\leqslant U(w_{0}-(1+\rho)m)<\infty$. On the other hand, for any given $I\in\mathcal{IC}$ satisfying $\mathbb{E}[I(X)]\neq m$ or $var[I(X)]>\nu$, it is easy to show that $\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U_{\lambda,\beta}(I)=-\infty$. Noting that

\[ U_{0,0}(I)=\mathbb{E}[U(w_{0}-X+I(X)-(1+\rho)m)], \]

we have

\[ \max_{I\in\mathcal{IC}}\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U_{\lambda,\beta}(I)\leqslant U^{**}. \]

Now,

\[ \max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant\max_{I\in\mathcal{IC}_{m}}U_{\lambda,\beta}(I)\geqslant\max_{I\in\mathcal{IC}_{m}}\mathbb{E}[U(w_{0}-X+I(X)-(1+\rho)m)],\quad\forall\lambda\in\mathbb{R},\ \beta\geqslant 0, \]

leading to $U^{**}\leqslant\min_{\lambda\in\mathbb{R},\beta\geqslant 0}\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)$. Since $U_{\lambda,\beta}(I)$ is continuous and strictly concave in $I$, we obtain from Lemmas B.1 and B.2 that

\[ \min_{I\in\mathcal{IC}}\max_{\lambda\in\mathbb{R},\beta\geqslant 0}-U_{\lambda,\beta}(I)=\max_{\lambda\in\mathbb{R},\beta\geqslant 0}\min_{I\in\mathcal{IC}}-U_{\lambda,\beta}(I), \]

which implies

\[ \max_{I\in\mathcal{IC}}\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U_{\lambda,\beta}(I)=\min_{\lambda\in\mathbb{R},\beta\geqslant 0}\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)=U^{**}. \]

By denoting $\lambda_{U}:=(U^{**}+1)/m$, we have

\[ U^{*}(\lambda,\beta)=\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant U_{\lambda,\beta}(0)\geqslant\lambda_{U}m=U^{**}+1,\quad\forall\lambda\geqslant\lambda_{U},\ \beta\geqslant 0. \]

Furthermore, we define $\lambda_{L}:=-(U^{**}+1)/\left(\mathbb{E}[X\wedge K_{1}]-m\right)$, where $K_{1}$ is determined by $\mathbb{E}[(X\wedge K_{1})^{2}]=\nu+m^{2}<\mathbb{E}[(X\wedge K_{U})^{2}]$. Clearly, $K_{1}\in(0,K_{U})$. If $\mathbb{E}[X\wedge K_{1}]\leqslant m$, then $var[X\wedge K_{1}]\geqslant\nu=var[X\wedge K_{U}]$, which contradicts the definition of $K_{U}$. Thus, we must have $\mathbb{E}[X\wedge K_{1}]>m$, and hence

\[ U^{*}(\lambda,\beta)=\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant U_{\lambda,\beta}(x\wedge K_{1})\geqslant-\lambda_{L}\left(\mathbb{E}[X\wedge K_{1}]-m\right)\geqslant U^{**}+1,\quad\forall\lambda\leqslant\lambda_{L},\ \beta\geqslant 0. \]

Similarly, let $\beta_{U}:=\frac{U^{**}+1}{\nu-var[X\wedge K_{2}]}$, where $K_{2}$ is determined by $\mathbb{E}[X\wedge K_{2}]=m$. Here, we have $K_{2}<K_{U}$, which in turn implies $var[X\wedge K_{2}]<\nu$ by the definition of $K_{U}$. For any $\beta\geqslant\beta_{U}$ and $\lambda\in\mathbb{R}$, we obtain

\[ U^{*}(\lambda,\beta)=\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant U_{\lambda,\beta}(x\wedge K_{2})\geqslant\beta_{U}(\nu-var[X\wedge K_{2}])=U^{**}+1. \]

The above analysis indicates that

\[ U^{**}=\min_{\substack{\lambda_{L}\leqslant\lambda\leqslant\lambda_{U}\\ 0\leqslant\beta\leqslant\beta_{U}}}U^{*}(\lambda,\beta). \]

Thus, it follows from the convexity of $U^{*}(\lambda,\beta)$ and Weierstrass's theorem that there exist $\lambda^{*}_{m}\in[\lambda_{L},\lambda_{U}]$ and $\beta^{*}_{m}\in[0,\beta_{U}]$ such that

\[ U^{**}=U^{*}(\lambda^{*}_{m},\beta^{*}_{m})=\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U^{*}(\lambda,\beta). \]

Moreover, we have

\[ U^{*}(\lambda^{*}_{m},\beta^{*}_{m})=\max_{I\in\mathcal{IC}}U_{\lambda^{*}_{m},\beta^{*}_{m}}(I)\geqslant\max_{I\in\mathcal{IC}_{m},\,var[I(X)]=\nu}U_{\lambda^{*}_{m},\beta^{*}_{m}}(I)=U^{**}, \tag{C.4} \]

where the second equality is derived from the fact that the optimal solution to Problem (C.3) binds the variance constraint. In addition, thanks to (C.4), the unique optimal solution $I^{*}_{m}$ of Problem (C.3) must solve Problem (C.1) with $\lambda=\lambda^{*}_{m}$ and $\beta=\beta^{*}_{m}$. Note that $U_{\lambda^{*}_{m},\beta^{*}_{m}}(I)$ is strictly concave in $I$; therefore, $I^{*}_{m}(X)=I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)$ almost surely. As a result, $I_{\lambda^{*}_{m},\beta^{*}_{m}}$ satisfies (C.2) and must be a solution to Problem (C.3).

Finally, we show that $\beta^{*}_{m}>0$. If instead $\beta^{*}_{m}=0$, then $I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)=(x-d_{m})_{+}$, yielding the contradiction

\[ \nu=var[(X-d_{m})_{+}]>var[(X-d_{L})_{+}]=\nu. \]

The proof is thus complete.

Proof of Proposition 3.7:

Denote

\[ L_{\beta}(I):=\mathbb{E}\big[U\big(w_{0}-X+I(X)-(1+\rho)\mathbb{E}[I(X)]\big)\big]-\beta\big(var[I(X)]-\nu\big),\quad\beta\geqslant 0,\ I\in\mathcal{IC}, \]

which is linear in $\beta$ and concave in $I$, because $U''(\cdot)<0$ and $var[I(X)]$ is convex in $I$.

We denote by $U^{***}$ the maximum EU value of the insured's final wealth in Problem (3.11). Using an argument similar to that in the proof of Proposition 3.5, we have

\[ \max_{I\in\mathcal{IC}}\min_{\beta\geqslant 0}L_{\beta}(I)\leqslant U^{***}\leqslant\min_{\beta\geqslant 0}\max_{I\in\mathcal{IC}}L_{\beta}(I), \]

which, together with Lemma B.2, implies

\[ U^{***}=\min_{\beta\geqslant 0}\max_{I\in\mathcal{IC}}L_{\beta}(I). \]

Denoting $\tilde{\beta}:=\frac{U^{***}+1}{\nu}$, we have

\[ \max_{I\in\mathcal{IC}}L_{\beta}(I)\geqslant L_{\beta}(0)\geqslant\beta\nu\geqslant U^{***}+1,\quad\forall\beta\geqslant\tilde{\beta}. \]

Furthermore, since $\max_{I\in\mathcal{IC}}L_{\beta}(I)$ is convex in $\beta$, there must exist $\beta^{*}\in[0,\tilde{\beta}]$ such that

\[ U^{***}=\max_{I\in\mathcal{IC}}L_{\beta^{*}}(I). \]

If $\beta^{*}=0$, then

\[ U^{***}=\max_{I\in\mathcal{IC}}\mathbb{E}[U(W_{I}(X))], \]

in which case the stop-loss insurance $(x-d^{*})_{+}$ is optimal, where $d^{*}$ is defined in (3.2). This, however, contradicts Assumption 3.1. So we must have $\beta^{*}>0$.

According to Lemma 3.2, Lemma 3.3 and Proposition 3.5, $I^{*}$, which solves Problem (2.1) under Assumption 3.1, satisfies $var[I^{*}(X)]=\nu$ and therefore also solves the maximization problem

\[ \max_{I\in\mathcal{IC}}L_{\beta^{*}}(I). \]

Thus, for any $I\in\mathcal{IC}$, we have

\[ \lim_{\alpha\downarrow 0}\frac{L_{\beta^{*}}\left((1-\alpha)I^{*}(x)+\alpha I(x)\right)-L_{\beta^{*}}(I^{*}(x))}{\alpha}\leqslant 0, \]

leading to

\[
\begin{aligned}
0\geqslant{}&\mathbb{E}\Big[U'(W_{I^{*}}(X))\big(I(X)-I^{*}(X)-(1+\rho)\mathbb{E}[I(X)-I^{*}(X)]\big)\Big]-2\beta^{*}\big(cov(I^{*}(X),I(X))-var[I^{*}(X)]\big)\\
={}&\int_{0}^{\infty}\Big(\mathbb{E}\big[U'(W_{I^{*}}(X))(\mathbb{I}_{\{X>t\}}-(1+\rho)\mathbb{P}\{X>t\})\big]-2\beta^{*}\big(\mathbb{E}[I^{*}(X)\mathbb{I}_{\{X>t\}}]-\mathbb{E}[I^{*}(X)]\mathbb{P}\{X>t\}\big)\Big)\big(I'(t)-{I^{*}}'(t)\big)\,\mathrm{d}t\\
={}&\int_{0}^{\mathcal{M}}\Big(\mathbb{E}\big[U'(W_{I^{*}}(X))-2\beta^{*}I^{*}(X)\,\big|\,X>t\big]-(1+\rho)\mathbb{E}\big[U'(W_{I^{*}}(X))\big]+2\beta^{*}\mathbb{E}[I^{*}(X)]\Big)\mathbb{P}\{X>t\}\big(I'(t)-{I^{*}}'(t)\big)\,\mathrm{d}t\\
={}&\int_{0}^{\mathcal{M}}\Phi_{I^{*}}(t)\,\mathbb{P}\{X>t\}\big(I'(t)-{I^{*}}'(t)\big)\,\mathrm{d}t,
\end{aligned}
\]

where $\Phi_{I^{*}}$ is defined in (3.14) and $\mathbb{I}_{A}$ is the indicator function of an event $A$. The arbitrariness of $I\in\mathcal{IC}$ and the fact that $\mathbb{P}\{X>t\}>0$ for any $t<\mathcal{M}$ yield that ${I^{*}}'$ must be of the form (3.13). The proof is complete.

Proof of Theorem 3.8:

Let $I^{*}$ be optimal for Problem (2.1). Then it follows from Proposition 3.7 that

\[ \Phi_{I^{*}}(x)=\mathbb{E}\big[\psi(X)\,|\,X>x\big], \tag{C.5} \]

where

\[ \psi(x):=U'(W_{I^{*}}(x))-2\beta^{*}I^{*}(x)-\Big((1+\rho)\mathbb{E}[U'(W_{I^{*}}(X))]-2\beta^{*}\mathbb{E}[I^{*}(X)]\Big) \tag{C.6} \]

for some $\beta^{*}>0$.

Since $F_{X}$ is assumed to be strictly increasing on $(0,\mathcal{M})$, it follows from Proposition 3.4 that $m_{L}<m_{U}$. Recall that, under Assumption 3.1, solving Problem (2.1) reduces to solving Problem (3.10). In what follows, we solve Problem (3.10) with the help of Proposition 3.7, carrying out the analysis in three cases:

  1. Case (A)

    If $I^{*}(x)=(x-d_{L})_{+}$, then $\psi$ is strictly increasing on $[0,d_{L}]$ and strictly decreasing on $[d_{L},\mathcal{M})$. If there existed $\tilde{x}\in[d_{L},\mathcal{M})$ such that $\psi(\tilde{x})<0$, then $\psi(x)<0$ and thus $\Phi_{I^{*}}(x)<0$ for all $x\in[\tilde{x},\mathcal{M})$, which contradicts Proposition 3.7. Therefore, $\psi(x)\geqslant 0$ for all $x\in[d_{L},\mathcal{M})$ and $\psi(d_{L})>0$. Because $\psi(x)$ is continuous in $x$ and $F_{X}$ is strictly increasing on $(0,\mathcal{M})$, there must exist $\epsilon>0$ such that $\Phi_{I^{*}}(x)>0$ for all $x\in[d_{L}-\epsilon,d_{L})$, again contradicting Proposition 3.7. So $(x-d_{L})_{+}$ cannot be an optimal solution to Problem (3.10).

  2. Case (B)

    If $I^{*}(x)=x\wedge K_{U}$, then $\psi$ is strictly decreasing on $[0,K_{U}]$ and strictly increasing on $[K_{U},\mathcal{M})$. Using an argument similar to that for Case (A), we deduce $\psi(x)\leqslant 0$ for all $x\in[K_{U},\mathcal{M})$ and $\psi(K_{U})<0$. Since $\psi$ is continuous, there exists $\epsilon>0$ such that $\Phi_{I^{*}}(x)<0$ for all $x\in[K_{U}-\epsilon,K_{U})$, contradicting Proposition 3.7. As a result, $x\wedge K_{U}$ cannot be optimal for Problem (3.10) either.

  3. Case (C)

    Hence, the optimal solution must be of the form $I^{*}(x)=I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)$ for some $m\in(m_{L},m_{U})$. Noting that $I_{\lambda^{*}_{m},\beta^{*}_{m}}'(x)\in(0,1)$ for sufficiently large $x$ due to $\beta^{*}_{m}>0$ and (3.8), we deduce from Proposition 3.7 that $\Phi_{I^{*}}(x)=0$ for sufficiently large $x$. Together with (3.7), (C.6) and (C.5), Proposition 3.7 further implies that

    \[ \beta^{*}_{m}=\beta^{*}\qquad\text{and}\qquad\lambda^{*}_{m}=(1+\rho)\mathbb{E}[U'(W_{I^{*}}(X))]-2\beta^{*}\mathbb{E}[I^{*}(X)]. \tag{C.7} \]

    Next, we consider two subcases, depending on the comparison between $\lambda^{*}_{m}$ and $U'(w_{0}-(1+\rho)m)$.

    1. (C.1)

      If $\lambda^{*}_{m}<U'(w_{0}-(1+\rho)m)$, then $I^{*}(x)=x$ for all $0\leqslant x\leqslant\hat{x}$ and

      \[ U'(w_{0}-x+I^{*}(x)-(1+\rho)m)-2\beta^{*}I^{*}(x)-\lambda^{*}_{m}\begin{cases}=0,&x\geqslant\hat{x},\\ >0,&x<\hat{x},\end{cases} \]

      where $\hat{x}:=\frac{U'(w_{0}-(1+\rho)m)-\lambda^{*}_{m}}{2\beta^{*}}>0$. Therefore, it follows that

      \[
      \begin{aligned}
      0&<\mathbb{E}[U'(w_{0}-X+I^{*}(X)-(1+\rho)m)-2\beta^{*}I^{*}(X)-\lambda^{*}_{m}]\\
      &=\mathbb{E}[U'(W_{I^{*}}(X))]-2\beta^{*}\mathbb{E}[I^{*}(X)]-(1+\rho)\mathbb{E}[U'(W_{I^{*}}(X))]+2\beta^{*}\mathbb{E}[I^{*}(X)]\\
      &=-\rho\,\mathbb{E}[U'(W_{I^{*}}(X))],
      \end{aligned}
      \]

      leading to a contradiction.

    2. (C.2)

      Consequently, we must have $\lambda^{*}_{m}\geqslant U'(w_{0}-(1+\rho)m)$, in which case $I^{*}$ is coinsurance above a deductible (i.e., of the form (3.5)). Similarly to Subcase (C.1), we can show that

      \[ U'(w_{0}-x+I^{*}(x)-(1+\rho)m)-2\beta^{*}I^{*}(x)-\lambda^{*}_{m}\begin{cases}=0,&x\geqslant\tilde{d},\\ <0,&x<\tilde{d},\end{cases} \]

      where $\tilde{d}:=w_{0}-(1+\rho)m-(U')^{-1}(\lambda^{*}_{m})\geqslant 0$. This, together with (C.7), yields

      \[ \lambda^{*}_{m}=U'(w_{0}-\tilde{d}-(1+\rho)m) \tag{C.8} \]

      and

      \[ -\rho\,\mathbb{E}[U'(W_{I^{*}}(X))]=\mathbb{E}\Big[\big(U'(w_{0}-X+I^{*}(X)-(1+\rho)m)-2\beta^{*}I^{*}(X)-\lambda^{*}_{m}\big)\mathbb{I}_{\{X<\tilde{d}\}}\Big]. \tag{C.9} \]

      Therefore, $\rho=0$ if and only if $\tilde{d}=0$. Moreover, if $\rho=0$, the above analysis indicates that $I^{*}$ solves equation (3.15). Otherwise, if $\rho>0$, then it follows from (C.7) and (C.9) that

      \[ \mathbb{E}[U'(W_{I^{*}}(X))]=\frac{\mathbb{E}[U'(w_{0}-X-(1+\rho)m)\mathbb{I}_{\{X\leqslant\tilde{d}\}}]+2\beta^{*}mF_{X}(\tilde{d})}{1-(1+\rho)\mathbb{P}\{X>\tilde{d}\}}, \]

      which in turn implies $\tilde{d}>VaR_{\frac{1}{1+\rho}}(X)$. Plugging the above equation and (C.8) into (C.7) yields

      \[ 2\beta^{*}=\frac{1}{m\rho}\left(U'(w_{0}-\tilde{d}-(1+\rho)m)-(1+\rho)\mathbb{E}[U'(w_{0}-X\wedge\tilde{d}-(1+\rho)m)]\right). \]

      As a result, the optimal solution $I^{*}$ must be given by (3.17). The proof is complete.


Proof of Corollary 3.9:

We prove the case $\rho>0$; the proof for $\rho=0$ is similar and indeed simpler. It follows from (3.5) and (3.8) that $I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)=0$ for $x\leqslant\tilde{d}$ and that $I'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)=f'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)$ increases in $x$ for $x>\tilde{d}$, where we use the fact that $x-I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)$ increases in $x$ and that $\beta^{*}_{m}>0$. Moreover, for $x>\tilde{d}$, taking the derivative of $\frac{I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x}$ with respect to $x$ yields

\[ \Big(\frac{I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x}\Big)'=\Big(\frac{f_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x}\Big)'=\frac{f'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)\,x-f_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x^{2}}\geqslant\frac{\int_{\tilde{d}}^{x}\big(f'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)-f'_{\lambda^{*}_{m},\beta^{*}_{m}}(y)\big)\,\mathrm{d}y}{x^{2}}\geqslant 0. \]

This completes the proof.
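The conclusion of Corollary 3.9, that the ceded proportion $I(x)/x$ is nondecreasing whenever the indemnity vanishes up to a deductible and has increasing slope thereafter, can be illustrated numerically. The marginal coverage $1-e^{-(x-\tilde{d})}$ below is a hypothetical stand-in for $f'_{\lambda^{*}_{m},\beta^{*}_{m}}$, chosen only because it increases from $0$ toward $1$:

```python
import numpy as np

d = 1.0                                   # hypothetical deductible
xs = np.linspace(1e-6, 10.0, 5000)

# I(x) = integral from d to x of (1 - exp(-(t - d))) dt for x > d, else 0:
# zero below the deductible, then convex with slope increasing toward 1
I = np.where(xs <= d, 0.0, (xs - d) - (1.0 - np.exp(-(xs - d))))

ratio = I / xs
assert np.all(np.diff(ratio) >= -1e-12)   # I(x)/x never decreases
```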

Proof of Theorem 4.1:

We first prove the result assuming that there exists $x_{0}\in(0,\mathcal{M}]$ such that

\[ y_{0}:=e_{I_{1}^{*}}(x_{0})=e_{I_{2}^{*}}(x_{0})\quad\text{and}\quad\phi(x_{0}-y_{0})>\frac{\beta_{1}^{*}}{\beta_{2}^{*}}, \tag{C.10} \]

where $\phi$ is defined in (B.1). First, we show that $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$ in a neighborhood of $x_{0}$. To this end, note that

\[ e_{I_{1}^{*}}'(x_{0})-e_{I_{2}^{*}}'(x_{0})=\frac{1}{1+\frac{2\beta_{1}^{*}}{-U''(w_{1}-x_{0}+y_{0})}}-\frac{1}{1+\frac{2\beta_{2}^{*}}{-U''(w_{2}-x_{0}+y_{0})}}>0, \]

which in turn implies that there exists an $\epsilon>0$ such that

\[ \begin{cases}e_{I_{1}^{*}}(x)<e_{I_{2}^{*}}(x),&x\in(x_{0}-\epsilon,x_{0}),\\ e_{I_{1}^{*}}(x)>e_{I_{2}^{*}}(x),&x\in(x_{0},x_{0}+\epsilon).\end{cases} \]

Next, we show that there exists no $y>x_{0}$ such that $e_{I_{1}^{*}}(y)=e_{I_{2}^{*}}(y)$. If such a $y$ existed, then the increasing property of $\phi(z)$ in $z$, together with the strictly increasing property of $x-e_{I_{i}^{*}}(x)$ in $x$, would imply

\[ \phi(y-e_{I_{1}^{*}}(y))=\phi(y-e_{I_{2}^{*}}(y))>\phi(x_{0}-e_{I_{2}^{*}}(x_{0}))>\frac{\beta_{1}^{*}}{\beta_{2}^{*}}, \]

which would in turn yield that $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$ in a neighborhood of $y$. This, however, contradicts the fact that $e_{I_{1}^{*}}(x)>e_{I_{2}^{*}}(x)$ for $x\in(x_{0},x_{0}+\epsilon)$.

Now define

\[ x_{1}:=\sup\left\{x\in[0,x_{0}):e_{I_{2}^{*}}(x)<e_{I_{1}^{*}}(x)\right\}. \]

If $x_{1}=0$, then $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$, which contradicts Lemma B.4. Thus, we must have $0<x_{1}<x_{0}$, because $e_{I_{1}^{*}}(x)<e_{I_{2}^{*}}(x)$ for all $x\in(x_{0}-\epsilon,x_{0})$. Moreover, it follows readily that

\[ \phi\left(x_{1}-e_{I_{1}^{*}}(x_{1})\right)\leqslant\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

In the following, we show that there exists no point $x_{2}\in(0,x_{1})$ such that $e_{I_{1}^{*}}(x_{2})=e_{I_{2}^{*}}(x_{2})$. Indeed, if such an $x_{2}$ existed, then, noting that $\phi$ is strictly increasing, we would have

\[ \phi\left(x_{2}-e_{I_{1}^{*}}(x_{2})\right)<\frac{\beta_{1}^{*}}{\beta_{2}^{*}}, \]

leading to $e_{I_{1}^{*}}'(x_{2})-e_{I_{2}^{*}}'(x_{2})<0$. In other words, $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{2}$, which contradicts the fact that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{1}$. We can therefore conclude that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ twice when (C.10) is satisfied.

Let us now consider the case in which (C.10) is not satisfied. We study two cases:

  1. Case (A)

    If there exists no $x\in(0,\mathcal{M}]$ such that $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$, then it is easy to show from $\mathbb{E}[e_{I_{i}^{*}}(X)]=0$ that $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ have the same distribution. This contradicts Lemma B.4.

  2. Case (B)

    Otherwise, any $x\in(0,\mathcal{M}]$ satisfying $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$ must have

    \[ \phi(x-e_{I_{i}^{*}}(x))\leqslant\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

    If the above inequality is always strict, then the previous analysis shows that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$, which contradicts Lemma B.4. Hence, there must exist $x_{3}\in(0,\mathcal{M}]$ such that

    \[ y_{3}:=e_{I_{1}^{*}}(x_{3})=e_{I_{2}^{*}}(x_{3})\quad\text{and}\quad\phi(x_{3}-y_{3})=\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

    In this case, we further divide our analysis into three subcases.

    1. (B.1)

      If $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{3}$, then the above analysis implies that no up-crossing occurs before $x_{3}$. A similar argument then indicates that $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ would have the same distribution, which is impossible.

    2. (B.2)

      Otherwise, if $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$ at $x_{3}$, then the previous analysis shows that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ twice.

    3. (B.3)

      Finally, if no up-crossing happens at $x_{3}$, then we can simply neglect the single point $x_{3}$ in the analysis. If there further exists $x_{4}\in(0,x_{3})$ satisfying $e_{I_{1}^{*}}(x_{4})=e_{I_{2}^{*}}(x_{4})$, then we have

      \[ \phi(x_{4}-e_{I_{i}^{*}}(x_{4}))<\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

      In this case, the previous analysis indicates that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{4}$, which contradicts Lemma B.4. Otherwise, if there exists no $x\in(0,x_{3})$ such that $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$, then it follows from $\mathbb{E}[e_{I_{i}^{*}}(X)]=0$ that $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ have the same distribution. This again contradicts Lemma B.4.

In summary, we have shown that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ twice. Because $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ have the same first two moments, we can easily see that the insurer's profit with the wealthier insured, $-e_{I_{2}^{*}}(X)$, carries less downside risk than that with the less wealthy insured, $-e_{I_{1}^{*}}(X)$, when the insurance pricing is actuarially fair. The proof is complete.

Proof of Corollary 4.2:

(i) It follows from the proof of Theorem 4.1 that $e_{I_{2}^{*}}(0)<e_{I_{1}^{*}}(0)$, which is equivalent to $\mathbb{E}[I^{*}_{1}(X)]<\mathbb{E}[I^{*}_{2}(X)]$. Suppose that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at points $x_{0}$ and $x_{1}$ with $0<x_{0}<x_{1}$. Then $e_{I_{1}^{*}}(x_{j})=e_{I_{2}^{*}}(x_{j})$ for $j=0,1$, which, together with (4.2), implies

\[
\begin{aligned}
&U'(w_{1}-x_{0}+e_{I_{1}^{*}}(x_{0}))-U'(w_{2}-x_{0}+e_{I_{1}^{*}}(x_{0}))-2(\beta_{1}^{*}-\beta_{2}^{*})e_{I_{1}^{*}}(x_{0})\\
&\quad=U'(w_{1}-x_{1}+e_{I_{1}^{*}}(x_{1}))-U'(w_{2}-x_{1}+e_{I_{1}^{*}}(x_{1}))-2(\beta_{1}^{*}-\beta_{2}^{*})e_{I_{1}^{*}}(x_{1}).
\end{aligned}
\tag{C.11}
\]

Denote

\[ L(y):=U'(w_{1}-y)-U'(w_{2}-y),\quad y\in[0,w_{1}). \]

Then

\[ L'(y)=-U''(w_{1}-y)+U''(w_{2}-y)>0 \]

due to $w_{1}<w_{2}$ and $U'''>0$. Recalling that $x-e_{I_{i}^{*}}(x)$ and $e_{I_{i}^{*}}(x)$ are strictly increasing in $x$, we deduce from (C.11) that $\beta_{2}^{*}<\beta_{1}^{*}$.

(ii) Let us denote $\widetilde{w}_{i}:=w_{i}-\mathbb{E}[I^{*}_{i}(X)]$, $i=1,2$. The following analysis depends on the comparison between $\widetilde{w}_{1}$ and $\widetilde{w}_{2}$.

  1. Case (A)

    If $\widetilde{w}_{1}=\widetilde{w}_{2}$ and there exists $z\in[0,\mathcal{M}]$ such that $I^{*}_{1}(z)=I^{*}_{2}(z)$, then it follows from (3.15) that $2(\beta^{*}_{1}-\beta^{*}_{2})I^{*}_{i}(z)=0$, $i=1,2$. Recalling that $\beta_{2}^{*}<\beta_{1}^{*}$, we have $I^{*}_{i}(z)=0$, which implies $z=0$. Because $\mathbb{E}[I^{*}_{1}(X)]<\mathbb{E}[I^{*}_{2}(X)]$, we conclude that $I^{*}_{1}(x)<I^{*}_{2}(x)$ for all $x\in(0,\mathcal{M}]$.

  2. Case (B)

    Otherwise, if w~1w~2\widetilde{w}_{1}\neq\widetilde{w}_{2}, we can show that either I1(x)<I2(x)I^{*}_{1}(x)<I^{*}_{2}(x) x>0\forall x>0 or I1I^{*}_{1} up-crosses I2I^{*}_{2}. Indeed, if there exists z(0,]z\in(0,\mathcal{M}] such that I1(z)=I2(z)I^{*}_{1}(z)=I^{*}_{2}(z), then we have

    U(w~1z+I1(z))U(w~1)(U(w~2z+I2(z))U(w~2))\displaystyle U^{\prime}(\widetilde{w}_{1}-z+I^{*}_{1}(z))-U^{\prime}(\widetilde{w}_{1})-(U^{\prime}(\widetilde{w}_{2}-z+I^{*}_{2}(z))-U^{\prime}(\widetilde{w}_{2}))
    =2(β1β2)I2(z)>0.\displaystyle=2(\beta_{1}^{*}-\beta_{2}^{*})I^{*}_{2}(z)>0.

    Noting that U(w~y)U(w~)U^{\prime}(\widetilde{w}-y)-U^{\prime}(\widetilde{w}) is strictly decreasing in w~\widetilde{w} for any y>0y>0 because of U′′′>0U^{\prime\prime\prime}>0, we must have w~1<w~2.\widetilde{w}_{1}<\widetilde{w}_{2}. Furthermore, we have

    Ii(z)=11+2βiU′′(w~iz+Ii(z))=11+U(w~iz+Ii(z))U(w~i)U′′(w~iz+Ii(z))×Ii(z),i=1,2.{I_{i}^{*}}^{\prime}(z)=\frac{1}{1+\frac{2\beta_{i}^{*}}{-U^{\prime\prime}(\widetilde{w}_{i}-z+I_{i}^{*}(z))}}=\frac{1}{1+\frac{U^{\prime}(\widetilde{w}_{i}-z+I^{*}_{i}(z))-U^{\prime}(\widetilde{w}_{i})}{-U^{\prime\prime}(\widetilde{w}_{i}-z+I_{i}^{*}(z))\times I_{i}^{*}(z)}},\;\;i=1,2.

    Denoting

    H(w):=\frac{U^{\prime}(w-y)-U^{\prime}(w)}{-U^{\prime\prime}(w-y)},\quad y\in[0,w),

    we obtain

    H^{\prime}(w)=\frac{U^{\prime}(w-y)-U^{\prime}(w)}{-U^{\prime\prime}(w-y)}\Big[\mathcal{P}_{U}(w-y)+\frac{U^{\prime\prime}(w-y)-U^{\prime\prime}(w)}{U^{\prime}(w-y)-U^{\prime}(w)}\Big]=H(w)\Big[\mathcal{P}_{U}(w-y)-\mathcal{P}_{U}(w-\theta y)\Big]>0,

    where the second equality is due to the mean-value theorem with $\theta\in(0,1)$ and the last inequality is due to the assumption of strict DAP. The strictly increasing property of $H(w)$, together with the fact that $\widetilde{w}_{1}<\widetilde{w}_{2}$, yields ${I_{2}^{*}}^{\prime}(z)<{I_{1}^{*}}^{\prime}(z)$, which, in turn, implies that $I^{*}_{1}$ up-crosses $I^{*}_{2}$ at the point $z$.

    If instead $I^{*}_{1}(x)\neq I^{*}_{2}(x)$ for all $x\in(0,\mathcal{M})$, then $I^{*}_{1}-I^{*}_{2}$ keeps a constant sign there, and it follows from ${\mathbb{E}}[I^{*}_{1}(X)]<{\mathbb{E}}[I^{*}_{2}(X)]$ that $I^{*}_{1}(x)<I^{*}_{2}(x)$ for all $x>0$. The proof is complete.
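The crux of the argument above is that $H(w)$ is strictly increasing under strict DAP. As an illustrative numerical check (not part of the proof, with arbitrarily chosen parameters), one can evaluate $H$ for a CRRA utility $U(w)=w^{1-\gamma}/(1-\gamma)$, whose absolute prudence $\mathcal{P}_{U}(w)=(\gamma+1)/w$ is strictly decreasing, so $H$ should indeed be strictly increasing in $w$ for each fixed $y$:

```python
# Numerical sanity check (illustrative only): H(w) = (U'(w-y) - U'(w)) / (-U''(w-y))
# should be strictly increasing in w when U exhibits strictly decreasing
# absolute prudence.  CRRA utility has P_U(w) = (gamma+1)/w, strictly decreasing.

gamma = 2.0          # CRRA risk-aversion parameter (arbitrary choice)
y = 1.0              # fixed gap y in H(w); must satisfy 0 <= y < w

def U1(w):           # U'(w)  = w**(-gamma)
    return w ** (-gamma)

def U2(w):           # U''(w) = -gamma * w**(-gamma - 1)
    return -gamma * w ** (-gamma - 1)

def H(w):
    return (U1(w - y) - U1(w)) / (-U2(w - y))

ws = [2.0 + 0.5 * k for k in range(20)]
vals = [H(w) for w in ws]
assert all(a < b for a, b in zip(vals, vals[1:])), "H is not increasing"
print("H values are strictly increasing:", [round(v, 4) for v in vals[:5]])
```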

Proof of Theorem 4.3:

If there were no point $x\in(0,\mathcal{M})$ with $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$, then $e_{I_{1}^{*}}-e_{I_{2}^{*}}$ would keep a constant sign on $(0,\mathcal{M})$; combined with ${\mathbb{E}}[e_{I_{1}^{*}}(X)]={\mathbb{E}}[e_{I_{2}^{*}}(X)]=0$, this would force $e_{I_{1}^{*}}(X)=e_{I_{2}^{*}}(X)$ almost surely, hence equality in distribution. This contradicts the fact that $var[e_{I_{1}^{*}}(X)]=\nu_{1}<\nu_{2}=var[e_{I_{2}^{*}}(X)]$. Therefore, we must have $e_{I_{1}^{*}}(z)=e_{I_{2}^{*}}(z)$ for some $z\in(0,\mathcal{M})$; and thus (4.4) implies

{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]-{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]=2(\beta^{*}_{1}-\beta^{*}_{2})e_{I_{1}^{*}}(z).

We divide the following proof into two cases by comparing ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]$ with ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]$.

  1. Case (A)

    If ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]\neq{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]$, then $\beta^{*}_{1}\neq\beta^{*}_{2}$. By (4.4), the quantity $2(\beta^{*}_{1}-\beta^{*}_{2})e_{I_{1}^{*}}(z)$ takes the same value at every crossing point; since $e_{I_{1}^{*}}$ is a strictly increasing function, we have $\mathcal{X}:=\{x\in[0,\mathcal{M}]:e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)\}=\{z\}$. Recalling that $var[e_{I_{1}^{*}}(X)]<var[e_{I_{2}^{*}}(X)]$, we conclude from Lemma A.3 that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$.

  2. Case (B)

    If ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]={\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]$, we have

    U^{\prime}(w_{0}-x+e_{I_{1}^{*}}(x))-2\beta_{1}^{*}e_{I_{1}^{*}}(x)=U^{\prime}(w_{0}-x+e_{I_{2}^{*}}(x))-2\beta_{2}^{*}e_{I_{2}^{*}}(x). \qquad (C.12)

    In this case, we further divide our analysis into two subcases based on the comparison between $\beta^{*}_{1}$ and $\beta^{*}_{2}$.

    1. (B.1)

      If $\beta^{*}_{1}\neq\beta^{*}_{2}$, then it follows from (C.12) that $e_{I_{1}^{*}}(x)=0$ for all $x\in\mathcal{X}$. Similar to Case (A), we can show that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$.

    2. (B.2)

      If $\beta^{*}_{1}=\beta^{*}_{2}$, then (C.12) can be rewritten as

      U^{\prime}(w_{0}-x+e_{I_{1}^{*}}(x))+2\beta_{1}^{*}(x-e_{I_{1}^{*}}(x))=U^{\prime}(w_{0}-x+e_{I_{2}^{*}}(x))+2\beta_{1}^{*}(x-e_{I_{2}^{*}}(x)).

      Note that $U^{\prime}(w_{0}-y)+2\beta_{1}^{*}y$ is strictly increasing in $y$. Hence, the above equation yields $x-e_{I_{1}^{*}}(x)=x-e_{I_{2}^{*}}(x)$ for all $x\in[0,\mathcal{M}]$, which contradicts the fact that $var[e_{I_{1}^{*}}(X)]<var[e_{I_{2}^{*}}(X)]$.

In summary, we conclude that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$.
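The up-crossing can be visualized with an illustrative numerical sketch (the parameters below are hypothetical and are not derived from the variance bounds): integrating the slope equation $e_{I_{i}^{*}}^{\prime}(x)=1/\big(1+2\beta_{i}^{*}/(-U^{\prime\prime}(w_{0}-x+e_{I_{i}^{*}}(x)))\big)$ for log utility with $\beta_{2}^{*}<\beta_{1}^{*}$ and starting values $e_{I_{1}^{*}}(0)>e_{I_{2}^{*}}(0)$, the difference $e_{I_{2}^{*}}-e_{I_{1}^{*}}$ changes sign exactly once, from negative to positive:

```python
# Illustrative sketch with hypothetical parameters: Euler-integrate
#   e_i'(x) = 1 / (1 + 2*beta_i / (-U''(w0 - x + e_i(x)))),
# with U = log, so -U''(w) = 1/w**2.  With beta2 < beta1 and
# e1(0) > e2(0), e2 should up-cross e1 exactly once.

w0 = 10.0
beta1, beta2 = 0.05, 0.01      # hypothetical multipliers, beta2 < beta1
e1, e2 = -0.5, -1.0            # hypothetical starting values e_i(0) = -E[I_i*]
dx, n_steps = 1e-3, 5000       # Euler grid on [0, 5]

def slope(x, e, beta):
    w = w0 - x + e                            # retained wealth; positive on the grid
    return 1.0 / (1.0 + 2.0 * beta * w * w)   # since -U''(w) = 1/w**2

sign_changes = 0
prev_sign = -1                  # e2 - e1 starts negative
for k in range(n_steps):
    x = k * dx
    e1 += slope(x, e1, beta1) * dx
    e2 += slope(x, e2, beta2) * dx
    s = 1 if e2 > e1 else -1
    if s != prev_sign:
        sign_changes += 1
        prev_sign = s

print("sign changes of e2 - e1:", sign_changes)
```

A single sign change is expected: at any intersection the two retained wealths coincide, so the smaller multiplier $\beta_{2}^{*}$ forces the steeper slope, ruling out a second (downward) crossing.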

Proof of Corollary 4.4:

(i) From the proof of Theorem 4.3, we know that

e_{I_{2}^{*}}(x)-e_{I_{1}^{*}}(x)\left\{\begin{array}{ll}<0,&x<z;\\ >0,&x\in(z,\mathcal{M}]\end{array}\right. \qquad (C.13)

for some z(0,)z\in(0,\mathcal{M}). Therefore, we have

{\mathbb{E}}[I^{*}_{1}(X)]=-e_{I_{1}^{*}}(0)<-e_{I_{2}^{*}}(0)={\mathbb{E}}[I^{*}_{2}(X)].

Furthermore, $e_{I_{1}^{*}}^{\prime}(z)-e_{I_{2}^{*}}^{\prime}(z)\leqslant 0$. Recalling that

e_{I_{1}^{*}}^{\prime}(x)=\frac{1}{1+\frac{2\beta_{1}^{*}}{-U^{\prime\prime}(w_{0}-x+e_{I_{1}^{*}}(x))}},

we have $\beta_{2}^{*}\leqslant\beta_{1}^{*}$. Since the case $\beta_{1}^{*}=\beta_{2}^{*}$ was ruled out in the proof of Theorem 4.3, we obtain $\beta_{2}^{*}<\beta_{1}^{*}$.

(ii) On the one hand, in view of (C.13), it follows from ${\mathbb{E}}[I^{*}_{1}(X)]<{\mathbb{E}}[I^{*}_{2}(X)]$ that $I_{2}^{*}(x)>I_{1}^{*}(x)$ for all $x\geqslant z$. On the other hand, for any $x\in[0,z)$, noting that $e_{I_{2}^{*}}(x)<e_{I_{1}^{*}}(x)$, we deduce from $U^{\prime\prime\prime}\geqslant 0$ that

-U^{\prime\prime}(w_{0}-x+e_{I_{1}^{*}}(x))\leqslant-U^{\prime\prime}(w_{0}-x+e_{I_{2}^{*}}(x)).

Because

\beta_{2}^{*}<\beta^{*}_{1}\quad\text{and}\quad e_{I_{i}^{*}}^{\prime}(x)=\frac{1}{1+\frac{2\beta_{i}^{*}}{-U^{\prime\prime}(w_{0}-x+e_{I_{i}^{*}}(x))}},\quad i=1,2,

we have $e_{I_{1}^{*}}^{\prime}(x)<e_{I_{2}^{*}}^{\prime}(x)$, which is equivalent to ${I_{1}^{*}}^{\prime}(x)<{I_{2}^{*}}^{\prime}(x)$ for all $x\in[0,z)$. Furthermore, as $I_{1}^{*}(0)=I_{2}^{*}(0)=0$, it must hold that $I_{1}^{*}(x)<I_{2}^{*}(x)$ for all $x>0$. The proof is complete.
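Part (ii) can also be illustrated numerically (an illustrative sketch only, with hypothetical parameters; the feedback through ${\mathbb{E}}[I^{*}_{i}(X)]$ is deliberately ignored by holding the net wealth fixed): integrating the slope equation $I^{\prime}(x)=1/\big(1+2\beta/(-U^{\prime\prime}(\widetilde{w}-x+I(x)))\big)$ from $I(0)=0$ for two multipliers $\beta_{1}>\beta_{2}$ under log utility, the indemnity with the smaller multiplier is pointwise larger, as the corollary asserts:

```python
# Illustrative sketch (not the paper's algorithm): Euler-integrate
#   I'(x) = 1 / (1 + 2*beta / (-U''(w_tilde - x + I(x)))),  I(0) = 0,
# for two hypothetical multipliers beta1 > beta2, holding the net wealth
# w_tilde fixed for simplicity.  With U = log we have -U''(w) = 1/w**2 and
# U''' > 0; a smaller beta should produce a pointwise larger indemnity.

w_tilde = 10.0                    # assumed net wealth (hypothetical)
beta1, beta2 = 0.05, 0.01         # hypothetical multipliers with beta2 < beta1
dx, n_steps = 1e-3, 5000          # Euler grid on [0, 5]

def slope(x, I, beta):
    neg_U2 = 1.0 / (w_tilde - x + I) ** 2   # -U''(w) = 1/w**2 for U = log
    return 1.0 / (1.0 + 2.0 * beta / neg_U2)

I1 = I2 = 0.0
for k in range(n_steps):
    x = k * dx
    I1 += slope(x, I1, beta1) * dx
    I2 += slope(x, I2, beta2) * dx
    assert I1 <= I2               # smaller beta -> pointwise larger indemnity

print(f"I1(5) = {I1:.4f}, I2(5) = {I2:.4f}")
```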

References

  • Armantier et al. (2018) Armantier, O., Foncel, J. and Treich, N. (2018) Insurance and portfolio decisions: A wealth effect puzzle. Working paper.
  • Arrow (1963) Arrow, K.J. (1963) Uncertainty and the welfare economics of medical care. American Economic Review, 53(5), 941-973.
  • Arrow (1965) Arrow, K.J. (1965) Aspects of the Theory of Risk-bearing. Helsinki: Yrjo Jahnssonin Saatio.
  • Arrow (1971) Arrow, K.J. (1971) Essays in the Theory of Risk Bearing. Chicago.
  • Bernard et al. (2015) Bernard, C., He, X.D., Yan, J.A. and Zhou, X.Y. (2015) Optimal insurance design under rank dependent utility. Mathematical Finance, 25(1), 154-186.
  • Borch (1960) Borch, K. (1960) An attempt to determine the optimum amount of stop loss reinsurance. In: Transactions of the 16th International Congress of Actuaries, Vol. I, 597-610. Brussels, Belgium: Georges Thone.
  • Carlier and Dana (2003) Carlier, G. and Dana, R.-A. (2003) Pareto efficient insurance contracts when the insurer’s cost function is discontinuous. Economic Theory, 21(4), 871-893.
  • Carlier and Dana (2005) Carlier, G. and Dana, R.-A. (2005) Rearrangement inequalities in non-convex insurance models. Journal of Mathematical Economics, 41, 483-503.
  • Chi (2012) Chi, Y. (2012) Reinsurance arrangements minimizing the risk-adjusted value of an insurer’s liability. ASTIN Bulletin: The Journal of the IAA, 42(2), 529-557.
  • Chi (2019) Chi, Y. (2019) On the optimality of a straight deductible under belief heterogeneity. ASTIN Bulletin: The Journal of the IAA, 49(1), 242-263.
  • Chi and Wei (2020) Chi, Y. and Wei, W. (2020) Optimal insurance with background risk: An analysis of general dependence structures. Finance and Stochastics, Forthcoming.
  • Cummins and Mahul (2004) Cummins, J.D. and Mahul, O. (2004) The demand for insurance with an upper limit on coverage. The Journal of Risk and Insurance, 71(2), 253-264.
  • Dionne and St-Michel (1991) Dionne, G. and St-Michel, P. (1991) Workers’ compensation and moral hazard. The Review of Economics and Statistics, 73(2), 236-244.
  • Doherty et al. (2015) Doherty, N.A., Laux, C. and Muermann, A. (2015) Insuring nonverifiable losses. Review of Finance, 19(1), 283-316.
  • Doherty and Schlesinger (1990) Doherty, N.A. and Schlesinger, H. (1990) Rational insurance purchasing: Consideration of contract non-performance. The Quarterly Journal of Economics, 105(1), 243-253.
  • Eeckhoudt and Kimball (1992) Eeckhoudt, L. and Kimball, M. (1992) Background risk, prudence, and the demand for insurance. Contributions to Insurance Economics, edited by G. Dionne, 239-254. New York: Springer.
  • Eeckhoudt et al. (2003) Eeckhoudt, L., Mahul, O. and Moran, J. (2003) Fixed‐reimbursement insurance: Basic properties and comparative statics. Journal of Risk and Insurance, 70(2), 207-218.
  • Ehrlich and Becker (1972) Ehrlich, I. and Becker, G.S. (1972) Market insurance, self-insurance, and self-protection. Journal of Political Economy, 80(4), 623-648.
  • Gollier (1996) Gollier, C. (1996) Optimum insurance of approximate losses. The Journal of Risk and Insurance, 63(3), 369-380.
  • Gollier (2001) Gollier, C. (2001) The Economics of Risk and Time. London: The MIT Press.
  • Gollier (2013) Gollier, C. (2013) The economics of optimal insurance design. Handbook of Insurance, edited by G. Dionne. New York: Springer.
  • Gollier and Schlesinger (1996) Gollier, C. and Schlesinger, H. (1996). Arrow’s theorem on the optimality of deductibles: a stochastic dominance approach. Economic Theory, 7(2), 359-363.
  • Gotoh and Konno (2000) Gotoh, J.Y. and Konno, H. (2000) Third degree stochastic dominance and mean-risk analysis. Management Science, 46(2), 289-301.
  • Huang and Tzeng (2006) Huang, R.J. and Tzeng, L.Y. (2006) The design of an optimal insurance contract for irreplaceable commodities. The Geneva Risk and Insurance Review, 31(1), 11-21.
  • Hofmann (2015) Hofmann, D.M. (2015) Insurance - a global view. Second edition, Zurich Insurance Company Ltd.
  • Hölmstrom (1979) Hölmstrom, B. (1979) Moral hazard and observability. The Bell Journal of Economics, 10(1), 74-91.
  • Huberman et al. (1983) Huberman, G., Mayers, D. and Smith, C.W. (1983) Optimal insurance policy indemnity schedules. The Bell Journal of Economics, 14(2), 415-426.
  • Kaluszka (2001) Kaluszka, M. (2001) Optimal reinsurance under mean-variance premium principles. Insurance: Mathematics and Economics, 28(1), 61-67.
  • Karlin and Novikoff (1963) Karlin, S. and Novikoff, A. (1963) Generalized convex inequalities. Pacific Journal of Mathematics, 13(4), 1251-1279.
  • Kaye (2005) Kaye, P. (2005) Risk measurement in insurance: A guide to risk measurement, capital allocation and related decision support issues. Discussion paper program, Casualty Actuarial Society.
  • Kimball (1990) Kimball, M.S. (1990) Precautionary saving in the small and in the large. Econometrica, 58(1), 53-73.
  • Komiya (1988) Komiya, H. (1988) Elementary proof for Sion's minimax theorem. Kodai Mathematical Journal, 11(1), 5-7.
  • Markowitz (1952) Markowitz, H. (1952) Portfolio selection. Journal of Finance, 7, 77-91.
  • Menezes et al. (1980) Menezes, C., Geiss, C. and Tressler, J. (1980) Increasing downside risk. The American Economic Review, 70(5), 921-932.
  • Millo (2016) Millo, G. (2016) The income elasticity of nonlife insurance: A reassessment. Journal of Risk and Insurance, 83(2), 335-362.
  • Mossin (1968) Mossin, J. (1968) Aspects of rational insurance purchasing. Journal of Political Economy, 76(4), 553-568.
  • Noussair et al. (2014) Noussair, C.N., Trautmann, S.T. and Van de Kuilen, G. (2014) Higher order risk attitudes, demographics, and financial decisions. The Review of Economic Studies, 81(1), 325-355.
  • Ohlin (1969) Ohlin, J. (1969) On a class of measures of dispersion with application to optimal reinsurance. ASTIN Bulletin: The Journal of the IAA, 5(2), 249-266.
  • Picard (2000) Picard, P. (2000) On the design of optimal insurance policies under manipulation of audit cost. International Economic Review, 41(4), 1049-1071.
  • Pratt (1964) Pratt, J.W. (1964) Risk aversion in the small and in the large. Econometrica, 32, 122-136.
  • Raviv (1979) Raviv, A. (1979) The design of an optimal insurance policy. American Economic Review, 69(1), 84-96.
  • Rothschild and Stiglitz (1976) Rothschild, M. and Stiglitz, J. (1976) Equilibrium in competitive insurance markets: An essay on the economics of imperfect information. The Quarterly Journal of Economics, 90(4), 629-649.
  • Schlesinger (1981) Schlesinger, H. (1981) The optimal level of deductibility in insurance contracts. Journal of Risk and Insurance, 48(3), 465-481.
  • Teh (2017) Teh, T.L. (2017) Insurance design in the presence of safety nets. Journal of Public Economics, 149, 47-58.
  • Vajda (1962) Vajda, S. (1962) Minimum variance reinsurance. ASTIN Bulletin: The Journal of the IAA, 2, 257-260.
  • Viscusi (1979) Viscusi, W.K. (1979) Insurance and individual incentives in adaptive contexts. Econometrica, 47(5), 1195-1207.
  • Whitmore (1970) Whitmore, G.A. (1970) Third-degree stochastic dominance. The American Economic Review, 60(3), 457-459.
  • Xu et al. (2019) Xu, Z.Q., Zhou, X.Y. and Zhuang, S.C. (2019) Optimal insurance with rank-dependent utility and incentive compatibility. Mathematical Finance, 29(2), 659-692.
  • Zhou and Wu (2008) Zhou, C. and Wu, C. (2008) Optimal insurance under the insurer’s risk constraint. Insurance: Mathematics and Economics, 42(3), 992-999.
  • Zhou et al. (2010) Zhou, C., Wu, W. and Wu, C. (2010) Optimal insurance in the presence of insurer’s loss limit. Insurance: Mathematics and Economics, 46(2), 300-307.