
Variance Contracts

Yichun Chi China Institute for Actuarial Science, Central University of Finance and Economics, Beijing 102206, China. Email: yichun@cufe.edu.cn.    Xun Yu Zhou Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA. Email: xz2574@columbia.edu.    Sheng Chao Zhuang Department of Finance, University of Nebraska-Lincoln, NE, USA. Email: szhuang3@unl.edu.
Abstract

We study the design of an optimal insurance contract in which the insured maximizes her expected utility and the insurer limits the variance of his risk exposure while maintaining the principle of indemnity and charging the premium according to the expected value principle. We derive the optimal policy semi-analytically, which is coinsurance above a deductible when the variance bound is binding. This policy automatically satisfies the incentive-compatible condition, which is crucial to rule out ex post moral hazard. We also find that the deductible is absent if and only if the contract pricing is actuarially fair. Focusing on the actuarially fair case, we carry out comparative statics on the effects of the insured’s initial wealth and the variance bound on insurance demand. Our results indicate that the expected coverage is always larger for a wealthier insured, implying that the underlying insurance is a normal good, which supports certain recent empirical findings. Moreover, as the variance constraint tightens, the insured who is prudent cedes less losses, while the insurer is exposed to less tail risk.

Key-words: Insurance design; expected value principle; variance; incentive compatibility; comparative statics.

1 Introduction

Insurance is an efficient mechanism to facilitate risk reallocation between two parties. Borch (1960) was the first to study the insurance contract design problem and to prove that given a fixed premium, a stop-loss (or deductible) insurance policy (i.e., full coverage above a deductible) achieves the smallest variance of the insured's share of payment. [Footnote 1: Borch (1960) presents the problem in a reinsurance setting, in which the ceding insurer corresponds to the insured in an insurance setting.] Arrow (1963) assumes that the premium is calculated by the expected value principle (i.e., the insurance cost is proportional to the expected indemnity) and imposes the principle of indemnity (i.e., the insurer's reimbursement is non-negative and smaller than the loss). Under these specifics, Arrow (1963) shows that the stop-loss insurance is Pareto optimal between a risk-neutral insurer and a risk-averse insured. This is a foundational result that has earned the name of Arrow's theorem of the deductible in the literature. Mossin (1968) further proves that the deductible is strictly positive if and only if the insurance price is actuarially unfair (i.e., the safety loading is strictly positive).

In Arrow (1963), the insurer is assumed to be risk-neutral. This is based on the assumption that the insurer has a sufficiently large number of independent and homogeneous insureds, such that his risk, by the law of large numbers, is sufficiently diversified to be nearly zero. This kind of theoretically ideal situation hardly occurs in practice, even if the insurer does indeed have a huge number of clients. Moreover, it does not apply to tailor-made contracts for insuring one-off events (e.g., the shipment of a highly valuable painting). Theorem 2 in Arrow (1971) stipulates that, when the insurer is risk-averse and insurance cost is absent, an optimal contract must involve coinsurance. Raviv (1979) extends this result to include nonlinear insurance costs and shows that an optimal policy involves both a deductible and coinsurance. Much of the recent research in this area has focused on the insurer's tail risk exposure. Cummins and Mahul (2004) and Zhou et al. (2010) extend Arrow's model by introducing an exogenous upper bound on indemnity and thereby limiting the insurer's liability with respect to catastrophic losses. From a regulatory perspective, Zhou and Wu (2008) propose a model in which the insurer's expected loss above a prescribed level is controlled, and they conclude that an optimal policy is generally piecewise linear. Doherty et al. (2015) investigate the case in which losses are nonverifiable and deduce that a contract with a deductible and an endogenous upper limit is optimal.

As important as tail risks are for both parties in insurance contracts, in practice insurers are also concerned with other parts of the loss distribution. Kaye (2005), in a Casualty Actuarial Society report, writes

“Different stakeholders have different levels of interest in different parts of the distribution - the perspective of the decision-maker is important. Regulators and rating agencies will be focused on the extreme downside where the very existence of the company is in doubt. On the other hand, management and investors will have a greater interest in more near-term scenarios towards the middle of the distribution and will focus on the likelihood of making a profit as well as a loss” (p. 4).

Assuming the insurer to be risk-averse with a concave utility function indeed takes the whole risk distribution into consideration, as studied in Arrow (1971) and Raviv (1979). However, there are notable drawbacks to the utility function approach. The notion of utility is opaque for many non-specialists, and the benefit–risk tradeoff is only implicit in the utility function. Moreover, one can rarely obtain optimal policies analytically under a general utility, which hinders post-optimality analyses such as comparative statics. For instance, Raviv (1979) derives a differential equation satisfied by the optimal indemnity, which takes a rather complex form depending on the utility function used.

By contrast, variance, as a measure of risk originally put forth in Markowitz's pioneering work (Markowitz, 1952), is also related to the whole distribution, yet it is more intuitive and transparent. Borch (1960) designs a contract that aims to minimize the variance of the insured's liability. Kaluszka (2001) extends Borch (1960)'s work by incorporating a variance-related premium principle and shows that the optimal contract to minimize the variance of the insured's payment can be stop-loss, quota share (i.e., the insurer covers a constant proportion of the loss) or a combination of the two. Vajda (1962) studies the problem from the insurer's perspective, and shows that a quota share policy minimizes the variance of indemnity in an actuarially fair contract. However, his result depends critically on limiting the admissible contracts to be such that the ratio between indemnity and loss increases as the loss increases, a feature that enables the derivation of a solution through rather simple calculus. [Footnote 2: Vajda (1962) claims that this feature "agrees with the spirit of (re)insurance, at least in most cases" (p. 259). However, for a larger loss, it is indeed in the spirit of insurance that the insurer should pay more, but it is not clear why he should be responsible for a higher proportion. Interestingly, our results will show that the optimal policies of our model possess this property if the insured is prudent; see Corollary 3.9.]

In this paper, we revisit the work of Arrow (1963) by imposing a variance constraint on the insurer's risk exposure. Unlike Vajda (1962), we consider the general actuarially unfair case and remove the restriction that the proportion of the insurer's payment increases with the size of the loss. The presence of the variance constraint causes substantial technical challenges in solving the problem. In the literature, there are generally two approaches to studying variants of Arrow's model: sample-wise optimization and stochastic orders. However, the former fails to work for our problem due to the nonlinearity of the variance constraint, and the latter is not readily applicable either because the presence of the variance constraint invalidates the claim that any admissible contract is dominated by a stop-loss one. The first contribution of this paper is methodological: we develop a new approach by combining the techniques of stochastic orders, calculus of variations and Lagrangian duality to derive optimal insurance policies. The solutions are semi-analytical in the sense that they can be computed by solving some algebraic equations (as opposed to differential equations in Raviv 1979).

Because the expected value premium principle ensures the expected profit of the insurer, our model is essentially a mean–variance model à la Markowitz for the insurer. Our second contribution is actuarial: we show that the optimal contract is coinsurance above a deductible when the variance constraint is binding. Moreover, the deductible disappears if and only if the insurance price is actuarially fair, consistent with Mossin's Theorem (Mossin, 1968). These results are qualitatively similar to those of Raviv (1979), who uses a concave utility function for the insurer. A natural question is why one would bother to study the mean–variance version of a problem that would generate contracts with similar characteristics to its expected utility counterpart. This question can be answered in the same way as in the field of financial portfolio selection, where there is an enormously large body of study on the Markowitz mean–variance model along with its popularity in practice, despite the existence of the equally well-studied expected utility maximization models. In other words, expected utility and mean–variance are two different frameworks, and, as argued earlier, the latter reflects a more transparent and explicit return–risk tradeoff, which usually leads to explicit solutions.

Our optimal policies involve coinsurance, which is widely utilized in the insurance industry. As pointed out by Raviv (1979), risk aversion on the part of the insurer could be a cause for coinsurance, but other attributes such as the nonlinearity of the insurance cost function could also lead to coinsurance. Another explanation for coinsurance is to mitigate the moral hazard risk; see Holmström (1979) and Dionne and St-Michel (1991). From the insured's perspective, Doherty and Schlesinger (1990) argue that default risk of the insurer can motivate the insured to choose coinsurance. Picard (2000) also shows that coinsurance is optimal in order to reduce the risk premium paid to the auditor. In this paper, we prove that optimal policies can turn from full insurance to coinsurance as the variance bound tightens, thereby providing a novel yet simple reason for the prevalent feature of coinsurance in insurance theory and practice: a variance bound on the insurer's risk exposure.

Intriguingly, our optimal insurance policies automatically satisfy the so-called incentive-compatible condition that both the insured and the insurer pay more for a larger loss (or, equivalently, the marginal indemnity is between 0 and 1). [Footnote 3: The incentive-compatible condition is termed the no-sabotage condition in Carlier and Dana (2003).] In Arrow (1963)'s setting, the optimal contract – the stop-loss one – turns out to be incentive-compatible; however, this is generally untrue. Gollier (1996) considers an insured facing an additional background risk that is not insurable. Under the expected value principle, he discovers that the optimal insurance, which relies heavily on the dependence between the background risk and the loss, may render the marginal indemnity strictly larger than 1. Bernard et al. (2015) generalize the insured's risk preference from expected utility to rank-dependent utility involving probability distortion (weighting), and also find that the optimal indemnity may decrease when the loss increases. In both of these papers, the derived optimal contracts would incentivize the insured to misreport the actual losses, leading to ex post moral hazard. Equally absurd would be the case in which the insurer pays less for a larger loss. To address this issue, Huberman et al. (1983) propose the incentive-compatible condition as a hard constraint on admissible insurance policies, in addition to the principle of indemnity. Xu et al. (2019) add this constraint to the model of Bernard et al. (2015), painstakingly developing a completely different approach in order to overcome the difficulty arising out of this additional constraint and deriving qualitatively very different contracts. On the other hand, Raviv (1979) discovers that his optimal solution is incentive-compatible, assuming that the loss has a strictly positive probability density function.
Carlier and Dana (2005) use a Hardy–Littlewood rearrangement argument to prove that any optimal contract is dominated by an incentive-compatible contract, establishing the optimality of the latter. However, their approach relies heavily on the assumption that the loss is non-atomic. Both of these studies rule out the important and practical case in which the loss is atomic at 0. By contrast, in the presence of the variance constraint, we show that the optimal policy is naturally incentive-compatible even without the corresponding hard constraint or the assumption of an atom-less loss.

Our final contribution is a comparative statics analysis examining the respective effects of the insured's wealth and the variance constraint on insurance demand under actuarially fair pricing. Our results indicate that the presence of a variance bound fundamentally changes the insurance strategy – it makes the insured's wealth relevant and it changes the way in which the two parties share the risk. In particular, the expected coverage is always larger for a wealthier insured who has strictly decreasing absolute prudence (DAP), rendering the insurance product a normal good. This finding provides some theoretical foundation for the empirical observations of Millo (2016) and Armantier et al. (2018). Moreover, we show that the insurer has less downside risk when contracting with a wealthier insured with strictly DAP. [Footnote 4: Menezes et al. (1980) introduce the notion of downside risk to compare two risks with the same mean and variance. The formal definition is given in Section 2.2.] This result reconciles with the well-documented phenomenon that more economically advanced regions or countries have higher insurance densities and penetrations. On the other hand, we establish that the variance bound significantly changes a prudent insured's risk transfer decision – she would consistently transfer more losses as the variance bound loosens. A corollary of this result is, rather surprisingly, that the insurer can reduce the tail risk by simply tightening the variance constraint. This suggests that our variance contracts do, after all, address the issue of tail exposure.

The rest of the paper proceeds as follows. In Section 2 we formulate the problem and present some preliminaries about risk preferences. In Section 3 we develop the solution approach and present the optimal insurance contracts. In Section 4 we conduct a comparative analysis by examining the effects of the insured’s initial wealth and the variance constraint on insurance demand. Section 5 concludes the paper. Some auxiliary results and all proofs are relegated to the appendices.

2 Problem Formulation and Preliminaries

2.1 Variance contracts formulation

An insured endowed with an initial wealth $w_0$ faces an insurable loss $X$, which is a non-negative, essentially bounded random variable defined on a probability space $(\Omega,\mathscr{F},\mathbb{P})$ with the cumulative distribution function (c.d.f.) $F_X(x):=\mathbb{P}\{X\leqslant x\}$ and the essential supremum $\mathcal{M}<\infty$. An insurance contract design problem is to partition $X$ into two parts, $I(X)$ and $X-I(X)$, where $I(X)$ (the indemnity) is the portion of the loss that is ceded to the insurer ("he") and $R_I(X):=X-I(X)$ (the retention) is the portion borne by the insured ("she"). $I$ and $R_I$ are also called the insured's ceded and retained loss functions, respectively. It is natural to require a contract to satisfy the principle of indemnity, namely that the indemnity be non-negative and no greater than the loss. Thus, the feasible set of indemnity functions is

$\mathfrak{C}:=\left\{I:0\leqslant I(x)\leqslant x,\,\forall x\in[0,\mathcal{M}]\right\}.$

As the insurer covers part of the loss for the insured, he is compensated by collecting the premium from her. Following many studies in the literature, we assume that the insurer calculates the premium using the expected value principle. Specifically, the premium on making a non-negative random payment $Y$ is charged as

$\pi(Y)=(1+\rho)\,\mathbb{E}[Y],$

where $\rho\geqslant 0$ is the so-called safety loading coefficient. His risk exposure under a contract $I$ for a loss $X$ is hence

$e_I(X)=I(X)-\pi(I(X)).$

The insurer may evaluate this risk using different measures for different purposes, as Kaye (2005) notes. In this paper, we assume that the insurer has sufficient regulatory capital and therefore focuses on the volatility of the underwriting risk. Specifically, he uses the variance to measure the risk and requires

$var[e_I(X)]\equiv var[I(X)]\leqslant\nu$

for some prescribed $\nu>0$.

On the other hand, denote by $W_I(X)$ the insured's final wealth under contract $I$ upon its expiration, namely

$W_I(X)=w_0-X+I(X)-\pi(I(X)).$

The insured's risk preference is characterized by a von Neumann–Morgenstern utility function $U$ satisfying $U'>0$ and $U''<0$.

Our optimal contracting problem is, therefore,

$\max_{I\in\mathfrak{C}}\ \mathbb{E}[U(W_I(X))]\quad\text{subject to}\quad var[e_I(X)]\leqslant\nu.$   (2.1)

Note that this model reduces to Arrow (1963)'s model

$\max_{I\in\mathfrak{C}}\ \mathbb{E}[U(W_I(X))]$

by setting the upper bound $\nu$ to be $\mathbb{E}[X^2]$. This is because $var[e_I(X)]=var[I(X)]\leqslant\mathbb{E}[I(X)^2]\leqslant\mathbb{E}[X^2]$ for all $I\in\mathfrak{C}$, since $0\leqslant I(X)\leqslant X$.

In Problem (2.1), the insured's benefit–risk consideration is captured by the utility function $U$, whereas the insurer's return–risk tradeoff is reflected by the "mean" (the expected value principle) and the "variance" (the variance bound). One may interpret the problem as one faced by an insurer who likes to design a contract with the best interest of a representative insured in mind, so as to remain marketable and competitive, while maintaining the desired profitability and variance control in the mean–variance sense. [Footnote 5: Representative insureds in different wealth classes or different regions may have different "typical" levels of initial wealth. Moreover, when the economy grows, a representative insured's initial wealth may change substantially. As shown in Subsection 4.1, the change in the insured's initial wealth may affect her demand for insurance.] Problem (2.1) can also model a tailor-made contract design for insuring a one-off event from an insured's perspective. The insured aims to maximize her expected utility while accommodating the insurer's participation constraint reflected by the mean and variance specifications.
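The ingredients of Problem (2.1) are straightforward to evaluate numerically for any candidate contract. The sketch below uses entirely hypothetical inputs (a simulated lognormal loss, a CARA utility, and illustrative values of $w_0$, $\rho$ and $\nu$) to compute the premium under the expected value principle, the insured's expected utility, and the insurer's variance of risk exposure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: simulated loss X (lognormal, capped at M = 30),
# initial wealth w0, safety loading rho, and variance bound nu.
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=100_000), 30.0)
w0, rho, nu = 20.0, 0.2, 4.0

def U(x):
    # CARA utility, chosen only for illustration; the paper allows any U' > 0, U'' < 0.
    return -np.exp(-0.1 * x)

def evaluate(I):
    """Return (expected utility, var[e_I(X)]) for an indemnity function I."""
    ind = I(X)
    premium = (1.0 + rho) * ind.mean()      # expected value principle pi(I(X))
    wealth = w0 - X + ind - premium         # insured's final wealth W_I(X)
    return U(wealth).mean(), ind.var()      # note var[e_I(X)] = var[I(X)]

# A deductible contract I(x) = (x - d)_+ with an arbitrary d = 5 is
# admissible for Problem (2.1) iff its variance does not exceed nu.
eu_d, var_d = evaluate(lambda x: np.maximum(x - 5.0, 0.0))
```

Any indemnity in $\mathfrak{C}$ can be passed to `evaluate`; for instance, no-insurance (`lambda x: np.zeros_like(x)`) always satisfies the variance constraint, while full insurance (`lambda x: x`) leaves the insured with a deterministic wealth.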

2.2 Absolute risk aversion and prudence

The Arrow–Pratt measure of absolute risk aversion (Pratt, 1964; Arrow, 1965), defined as

$\mathcal{A}(x):=-\frac{U''(x)}{U'(x)},$

captures the dependence of the level of risk aversion on the agent's wealth $x$. If $\mathcal{A}(x)$ is decreasing [Footnote 6: Throughout the paper, the terms "increasing" and "decreasing" mean "non-decreasing" and "non-increasing," respectively.] in $x$, then the insured's risk preference is said to exhibit decreasing absolute risk aversion (DARA). The effect of an insured's initial wealth on the insurance demand under Arrow (1963)'s model has been widely studied in the literature. It is found that a wealthier DARA insured purchases a deductible insurance with a higher deductible. For a survey on how insureds' wealth impacts insurance, see e.g., Gollier (2001, 2013).

While risk aversion ($U''<0$) captures an insured's propensity for avoiding risk, prudence (i.e., $U'''>0$) reflects her tendency to take precautions against future risk. Many commonly used utility functions, including those with hyperbolic absolute risk aversion (HARA) and mixed risk aversion, are prudent. [Footnote 7: A utility function is called HARA if the reciprocal of the Arrow–Pratt measure of absolute risk aversion is a linear function, i.e., $\mathcal{A}_U(x)=-\frac{U''(x)}{U'(x)}=\frac{1}{px+q}$ for some $p\geqslant 0$ and $q$. It includes exponential, logarithmic and power utility functions as special cases. For further discussion of HARA, see Gollier (2001). A utility function is said to be of mixed risk aversion if $(-1)^n U^{(n)}(x)\leqslant 0$ for all $x$ and $n=1,2,3,\cdots$, where $U^{(n)}$ denotes the $n$th derivative of $U$.] Based on an experiment with a large number of subjects, Noussair et al. (2014) observe that the majority of individuals' decisions are consistent with prudence. Eeckhoudt and Kimball (1992) and Gollier (1996) take into account the insured's prudence in designing optimal insurance policies. The degree of absolute prudence is defined as

$\mathcal{P}(x):=-\frac{U'''(x)}{U''(x)}$   (2.2)

for a three-times differentiable utility function $U$. If $\mathcal{P}(x)$ is strictly decreasing in $x$, then the insured is said to exhibit strictly decreasing absolute prudence (DAP). Kimball (1990) shows that DAP characterizes the notion that wealthier people are less sensitive to future risks. Moreover, DAP implies DARA, as noted in Proposition 21 of Gollier (2001).

A term related to prudence is third-degree stochastic dominance (TSD), which was introduced by Whitmore (1970). A non-negative random variable $Z_1$ is said to dominate another non-negative random variable $Z_2$ in TSD if

$\mathbb{E}[Z_1]\geqslant\mathbb{E}[Z_2]\quad\text{and}\quad\int_0^x\int_0^y\big(F_{Z_2}(z)-F_{Z_1}(z)\big)\,\mathrm{d}z\,\mathrm{d}y\geqslant 0\ \text{ for all }x\geqslant 0.$

Equivalently, $Z_1$ dominates $Z_2$ in TSD if and only if $\mathbb{E}[u(Z_1)]\geqslant\mathbb{E}[u(Z_2)]$ for all functions $u$ satisfying $u'>0$, $u''<0$ and $u'''>0$. TSD has been widely employed for decision making in finance and insurance. For instance, Gotoh and Konno (2000) use it to study mean–variance optimal portfolio problems. If $Z_1$ dominates $Z_2$ in TSD and they have the same mean and variance, then $Z_1$ is said to have less downside risk than $Z_2$. In fact, the latter is equivalent to $\mathbb{E}[u(Z_1)]\geqslant\mathbb{E}[u(Z_2)]$ for any function $u$ with $u'''>0$; see Menezes et al. (1980).
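The TSD condition above can be checked numerically from samples by approximating the double integral of $F_{Z_2}-F_{Z_1}$ on a grid. A minimal sketch (the function name and the test distributions are our own illustrative choices, not from the paper):

```python
import numpy as np

def dominates_tsd(z1, z2, grid_n=4000):
    """Numerically test whether sample z1 dominates sample z2 in TSD by
    checking E[Z1] >= E[Z2] and the double integral of F_{Z2} - F_{Z1}
    on a grid (empirical-c.d.f. approximation)."""
    hi = max(z1.max(), z2.max())
    x = np.linspace(0.0, hi, grid_n)
    dx = x[1] - x[0]
    F1 = np.searchsorted(np.sort(z1), x, side="right") / z1.size
    F2 = np.searchsorted(np.sort(z2), x, side="right") / z2.size
    inner = np.cumsum(F2 - F1) * dx     # int_0^y (F2 - F1) dz
    outer = np.cumsum(inner) * dx       # int_0^x int_0^y (F2 - F1) dz dy
    return z1.mean() >= z2.mean() - 1e-9 and outer.min() >= -1e-4

# Example: a degenerate Z1 = 2 versus a uniform Z2 on [0, 4] -- same mean,
# Z2 riskier, so Z1 should dominate Z2 in TSD (Z1 has less downside risk).
z1 = np.full(10_000, 2.0)
z2 = np.linspace(0.0, 4.0, 10_000)
```

The tolerances (`1e-9`, `-1e-4`) absorb discretization error in the Riemann sums; tighter grids allow tighter tolerances.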

3 Optimal Contracts

In this section, we present our approach to solving Problem (2.1).

First, consider Problem (2.1) without the variance constraint:

$\max_{I\in\mathfrak{C}}\ \mathbb{E}[U(W_I(X))].$   (3.1)

This is the classical Arrow (1963)'s model, for which the optimal contract is a deductible one of the form $(x-d^*)_+$ for some non-negative deductible $d^*$, where $(x)_+:=\max\{x,0\}$. This contract automatically satisfies the incentive-compatible condition. Moreover, Chi (2019) (see Theorem 4.2 therein) was the first to derive an analytical form of the optimal deductible level $d^*$. More precisely, define

$VaR_{\frac{1}{1+\rho}}(X):=\inf\left\{x\in[0,\mathcal{M}]:F_X(x)\geqslant\frac{\rho}{1+\rho}\right\}$

and

$\varphi(d):=\frac{\mathbb{E}[U'(W_{(x-d)_+}(X))]}{U'(w_0-d-\pi((X-d)_+))},\quad 0\leqslant d<\mathcal{M},$

where $\inf\emptyset:=\mathcal{M}$ by convention. Then the optimal $d^*$ is

$d^*=\sup\left\{VaR_{\frac{1}{1+\rho}}(X)\leqslant d<\mathcal{M}:\,\varphi(d)\geqslant\frac{1}{1+\rho}\right\}\vee VaR_{\frac{1}{1+\rho}}(X),$   (3.2)

where $\sup\emptyset:=0$ and $x\vee y:=\max\{x,y\}$. [Footnote 8: The number $d^*$ can be computed easily numerically, because $\varphi(d)$ is decreasing over $[VaR_{\frac{1}{1+\rho}}(X),\mathcal{M})$; see Chi (2019).] This leads immediately to the following proposition.

Proposition 3.1.

If $\nu\geqslant var[(X-d^*)_+]$, then $I(x)=(x-d^*)_+$ is the optimal solution to Problem (2.1).

Intuitively, if the variance bound $\nu$ is set sufficiently high, then the variance constraint in Problem (2.1) is redundant and the problem reduces to the classical Arrow (1963)'s problem. Proposition 3.1 tells exactly and explicitly what the bound should be for the variance constraint to be binding.

Therefore, it suffices to solve Problem (2.1) for the case in which $\nu<var[(X-d^*)_+]$, which we now set as an assumption.

Assumption 3.1.

The variance bound $\nu$ satisfies $\nu<var[(X-d^*)_+]$.
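The deductible $d^*$ of (3.2) can be computed by simple bisection, since $\varphi(d)$ is decreasing over $[VaR_{\frac{1}{1+\rho}}(X),\mathcal{M})$. A sketch under assumed parameters (simulated lognormal loss, CARA utility; all values illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 30.0
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=200_000), M)  # hypothetical loss
w0, rho, a = 20.0, 0.2, 0.1      # assumed wealth, loading, CARA coefficient

def U_prime(z):
    return a * np.exp(-a * z)    # U'(z) for U(z) = -exp(-a z)

def phi(d):
    # phi(d) from the text, with W_{(x-d)_+}(X) = w0 - min(X, d) - pi((X-d)_+).
    prem = (1.0 + rho) * np.maximum(X - d, 0.0).mean()
    return U_prime(w0 - np.minimum(X, d) - prem).mean() / U_prime(w0 - d - prem)

var_level = np.quantile(X, rho / (1.0 + rho))    # approximates VaR_{1/(1+rho)}(X)

# Bisect for the last d in [VaR, M) with phi(d) >= 1/(1+rho), as in (3.2).
lo, hi = var_level, M
if phi(lo) < 1.0 / (1.0 + rho):
    d_star = var_level           # sup over the empty set is 0; take the max with VaR
else:
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) >= 1.0 / (1.0 + rho) else (lo, mid)
    d_star = lo
```

With these parameters the threshold crossing is interior, and one can then check Assumption 3.1 by comparing $var[(X-d^*)_+]$ against the chosen $\nu$.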

The main thrust of our solution method is to first restrict the analysis to a fixed level of expected indemnity and then optimize over that level. To this end, we need to first identify the range in which the optimal expected indemnity can possibly lie. Noting that $var[(X-d)_+]$ is strictly decreasing and continuous in $d$ over $[\mathrm{ess\,inf}\,X,\mathcal{M})$, we define

$d_L:=\inf\{d\geqslant d^*:var[(X-d)_+]\leqslant\nu\}\quad\text{and}\quad m_L:=\mathbb{E}[(X-d_L)_+].$

Intuitively, the insurer would demand a deductible higher than Arrow's level $d^*$ due to the additional risk control reflected by the variance constraint, and $d_L$ is the smallest deductible that makes this constraint binding.
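Because $var[(X-d)_+]$ is strictly decreasing and continuous in $d$, $d_L$ and $m_L$ can be found by bisection. A sketch with a simulated loss and assumed values of $\nu$ and $d^*$ (both illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 30.0
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=200_000), M)  # hypothetical loss
nu, d_star = 2.0, 1.5        # assumed variance bound and Arrow deductible

def tail_var(d):
    return np.maximum(X - d, 0.0).var()      # var[(X - d)_+]

# Bisection: maintain tail_var(lo) > nu (Assumption 3.1 at lo = d*) and
# tail_var(hi) <= nu (trivially at hi = M), shrinking to the crossing point.
lo, hi = d_star, M
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if tail_var(mid) > nu else (lo, mid)
d_L = hi
m_L = np.maximum(X - d_L, 0.0).mean()
```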

Lemma 3.2.

Under Assumption 3.1, for Problem (2.1), any admissible insurance policy $I$ with $\mathbb{E}[I(X)]\leqslant m_L$ is no better than the deductible contract $I_L$, where $I_L(x)=(x-d_L)_+$.

Therefore, we can rule out any contract whose expected indemnity is strictly smaller than $m_L$; in other words, $m_L$ is a lower bound of the optimal expected indemnity. In particular, no-insurance (i.e., $I^*(x)\equiv 0$) is never optimal under Assumption 3.1.

Next, we are to derive an upper bound of the optimal expected indemnity. Consider a loss-capped contract $X\wedge k$, where $x\wedge y:=\min\{x,y\}$ and $k\geqslant 0$, which pays the actual loss up to the cap $k$. [Footnote 9: A loss-capped contract is also called "full insurance up to a (policy) limit" or "full insurance with a cap."] Define

$K_U:=\inf\{k\geqslant 0:var[X\wedge k]\geqslant\nu\}\quad\text{and}\quad m_U:=\mathbb{E}[X\wedge K_U].$

In the above, $K_U$ is well-defined because $X\wedge k-\mathbb{E}[X\wedge k]$ is increasing in $k$ in the sense of convex order, according to Lemma A.2 in Chi (2012). [Footnote 10: A random variable $Y$ is said to be greater than a random variable $Z$ in the sense of convex order, denoted as $Z\leqslant_{cx}Y$, if $\mathbb{E}[Y]=\mathbb{E}[Z]$ and $\mathbb{E}[(Z-d)_+]\leqslant\mathbb{E}[(Y-d)_+]$ for all $d\in\mathbb{R}$, provided that the expectations exist. Obviously, $Z\leqslant_{cx}Y$ implies $var[Z]\leqslant var[Y]$.] Clearly, both $K_U$ and $m_U$ depend on the variance bound $\nu$. Since $var[X]\geqslant var[(X-d^*)_+]>\nu$, we have

$K_U<\mathcal{M},\quad m_U<\mathbb{E}[X]\quad\text{and}\quad var[X\wedge K_U]=\nu.$
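$K_U$ and $m_U$ can be computed the same way as $d_L$, bisecting on the increasing map $k\mapsto var[X\wedge k]$ (again with a simulated loss and an assumed $\nu$, both illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 30.0
X = np.minimum(rng.lognormal(mean=1.0, sigma=0.8, size=200_000), M)  # hypothetical loss
nu = 2.0                                                             # assumed bound

def cap_var(k):
    return np.minimum(X, k).var()    # var[X ∧ k]

# var[X ∧ k] increases in k from 0 to var[X] (which exceeds nu here),
# so the smallest k with var[X ∧ k] >= nu is found by bisection.
lo, hi = 0.0, M
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if cap_var(mid) < nu else (lo, mid)
K_U = hi
m_U = np.minimum(X, K_U).mean()
```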
Lemma 3.3.

For any $I\in\mathfrak{C}$ with $var[I(X)]\leqslant\nu$, we must have $\mathbb{E}[I(X)]\leqslant m_U$. Moreover, if $I\in\mathfrak{C}$ satisfies $var[I(X)]\leqslant\nu$ and $\mathbb{E}[I(X)]=m_U$, then $I(X)=X\wedge K_U$ almost surely.

This lemma stipulates that $m_U$ is an upper bound of the optimal expected indemnity. Moreover, any admissible contract achieving this upper bound is equivalent to the loss-capped contract $X\wedge K_U$. An immediate corollary of the lemma is $m_L\leqslant m_U$, noting that $var[(X-d_L)_+]=\nu$.

The following result identifies the case $m_L=m_U$ as a trivial one.

Proposition 3.4.

If $m_L=m_U$, then the loss $X$ must follow a Bernoulli distribution with values 0 and $d_L+K_U$. Moreover, under Assumption 3.1, the optimal contract of Problem (2.1) is

$I^*(0)=0\quad\text{and}\quad I^*(d_L+K_U)=K_U.$

In what follows, we consider the general and interesting case in which $m_L<m_U$. For $m\in(m_L,m_U)$, define

$\mathfrak{C}_m:=\left\{I\in\mathfrak{C}:var[I(X)]\leqslant\nu,\,\mathbb{E}[I(X)]=m\right\}.$

We now focus on the following optimization problem

$\max_{I\in\mathfrak{C}_m}\ \mathbb{E}[U(W_I(X))],$   (3.3)

which is a "cross section" of the original problem (2.1) in which the expected indemnity is fixed at $m$.

For $\lambda\in\mathbb{R}$ and $\beta\geqslant 0$, denote

$I_{\lambda,\beta}(x):=\sup\big\{y\in[0,x]:U'(w_0-x+y-(1+\rho)m)-\lambda-2\beta y\geqslant 0\big\},\quad x\in[0,\mathcal{M}].$   (3.4)

Actually, $I_{\lambda,\beta}$ is a contract that coinsures above a deductible or coinsures following full insurance, depending on the relative values of $\lambda$ and $U'(w_0-(1+\rho)m)$. To see this, when $\lambda\geqslant U'(w_0-(1+\rho)m)$, we have

$I_{\lambda,\beta}(x)=\begin{cases}0, & 0\leqslant x\leqslant w_0-(1+\rho)m-(U')^{-1}(\lambda),\\ f_{\lambda,\beta}(x), & w_0-(1+\rho)m-(U')^{-1}(\lambda)<x\leqslant\mathcal{M},\end{cases}$   (3.5)

and when $\lambda<U'(w_0-(1+\rho)m)$, we have

$I_{\lambda,\beta}(x)=\begin{cases}x, & 0\leqslant x\leqslant\frac{U'(w_0-(1+\rho)m)-\lambda}{2\beta},\\ f_{\lambda,\beta}(x), & \frac{U'(w_0-(1+\rho)m)-\lambda}{2\beta}<x\leqslant\mathcal{M},\end{cases}$   (3.6)

where $f_{\lambda,\beta}(x)$ satisfies the following equation in $y$: [Footnote 11: It can be shown easily that this equation has a unique solution.]

$U'(w_0-x+y-(1+\rho)m)-\lambda-2\beta y=0.$   (3.7)

Moreover, it is easy to see that $0\leqslant f_{\lambda,\beta}(x)\leqslant x$ either when $\lambda\geqslant U'(w_0-(1+\rho)m)$ and $w_0-(1+\rho)m-(U')^{-1}(\lambda)<x\leqslant\mathcal{M}$, or when $\lambda<U'(w_0-(1+\rho)m)$ and $\frac{U'(w_0-(1+\rho)m)-\lambda}{2\beta}<x\leqslant\mathcal{M}$. Furthermore,

$f'_{\lambda,\beta}(x)=\frac{-U''(w_0-x+f_{\lambda,\beta}(x)-(1+\rho)m)}{2\beta-U''(w_0-x+f_{\lambda,\beta}(x)-(1+\rho)m)}\in(0,1].$   (3.8)
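Equation (3.7) is a one-dimensional root-finding problem, since its left-hand side is strictly decreasing in $y$. A sketch with an assumed CARA utility and purely illustrative multipliers $\lambda$ and $\beta$, which also lets one verify the slope formula (3.8) numerically:

```python
import numpy as np

w0, rho, m, a = 20.0, 0.2, 2.0, 0.1    # assumed parameters; CARA utility
lam, beta = 0.005, 0.01                # assumed multipliers lambda and beta

def U_prime(z):
    return a * np.exp(-a * z)          # U'(z) for U(z) = -exp(-a z)

def U_dprime(z):
    return -a * a * np.exp(-a * z)     # U''(z)

def g(y, x):
    # Left-hand side of (3.7); strictly decreasing in y, hence a unique root.
    return U_prime(w0 - x + y - (1.0 + rho) * m) - lam - 2.0 * beta * y

def f(x):
    """Solve (3.7) for y = f_{lambda,beta}(x) by bisection on [0, x]."""
    lo, hi = 0.0, x
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid, x) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Here lambda < U'(w0 - (1+rho)m), so by (3.6) full insurance applies up to
# x0 = (U'(w0 - (1+rho)m) - lambda) / (2 beta), and f kicks in beyond x0.
x0 = (U_prime(w0 - (1.0 + rho) * m) - lam) / (2.0 * beta)
y5 = f(5.0)
```

Comparing a central difference of `f` with the closed-form slope in (3.8) confirms that the marginal indemnity lies in $(0,1]$, i.e., the contract is incentive-compatible.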

The following result indicates that there exists an optimal solution to Problem (3.3) that is of the form $I_{\lambda,\beta}(x)$ and binds both the mean and variance constraints.

Proposition 3.5.

Suppose Assumption 3.1 holds and mL<mUm_{L}<m_{U}. Then there exist λm\lambda^{*}_{m}\in{\mathbb{R}} and βm>0\beta^{*}_{m}>0 such that Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}} satisfies

𝔼[Iλm,βm(X)]=mandvar[Iλm,βm(X)]=ν,{\mathbb{E}}[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=m\qquad\text{and}\qquad var[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=\nu, (3.9)

and is an optimal solution to Problem (3.3).

Combining Lemma 3.2, Lemma 3.3 and Proposition 3.5 yields that we can always find an optimal contract in one of the following three types: a deductible one of the form IL(x)=(xdL)+I_{L}(x)=(x-d_{L})_{+}, a loss-capped one of the form IU(x)=xKUI_{U}(x)=x\wedge K_{U} and a general one of the form Iλm,βm(x)I_{\lambda^{*}_{m},\beta^{*}_{m}}(x). In other words, the optimal solutions of the following maximization problem

maxI{IL,IU,Iλm,βmform(mL,mU)}𝔼[U(WI(X))],\max_{I\in\big{\{}I_{L},\ I_{U},\ I_{\lambda^{*}_{m},\beta^{*}_{m}}\,\text{for}\,m\in(m_{L},m_{U})\big{\}}}{\mathbb{E}}[U(W_{I}(X))], (3.10)

where mL<mUm_{L}<m_{U}, also solve Problem (2.1).

Note that ILI_{L}, IUI_{U} and Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}} all satisfy the incentive-compatible condition (see (3.8)); hence, so does at least one of the optimal contracts II^{*} of (2.1). That is, I(0)=0I^{*}(0)=0 and 0I(x)10\leqslant{I^{*}}^{\prime}(x)\leqslant 1 almost everywhere.121212As will be evident in the sequel, the values of II^{\prime} on a set with zero Lebesgue measure have no impact on II. Therefore, we will often omit the phrase “almost everywhere” in statements regarding the marginal indemnity function I{I}^{\prime} throughout this paper. Therefore, it suffices to solve the following maximization problem

maxI𝒞𝔼[U(WI(X))]subject tovar[eI(X)]ν,\begin{array}[]{ll}\underset{I\in\mathcal{IC}}{\text{max}}&\ \ \ {\mathbb{E}}[U(W_{I}(X))]\\ \mbox{subject to}&\ \ \ var\left[e_{I}(X)\right]\leqslant\nu,\end{array} (3.11)

where

𝒞:={I:I(0)=0, 0I(x)1,x[0,]},\mathcal{IC}:=\left\{I:I(0)=0,\,0\leqslant I^{\prime}(x)\leqslant 1,\,\forall x\in[0,\mathcal{M}]\right\}\subsetneq\mathfrak{C}, (3.12)

to obtain an optimal contract for Problem (2.1).

Notice that 𝒞\mathcal{IC} is a convex set on which 𝔼[U(WI(X))]{\mathbb{E}}[U(W_{I}(X))] is strictly concave. Using the convexity of the variance and applying arguments similar to those in the proof of Proposition 3.1 in Chi and Wei (2020), we obtain the following proposition:

Proposition 3.6.
  • (i)

    There exist optimal solutions to Problem (3.11).

  • (ii)

    Assume either ρ>0\rho>0 or {X<ϵ}>0\mathbb{P}\left\{X<\epsilon\right\}>0 for all ϵ>0\epsilon>0. Then there exists a unique solution to Problem (3.11) in the sense that I1(X)=I2(X)I_{1}(X)=I_{2}(X) almost surely for any two solutions I1I_{1} and I2I_{2}.

Note that the assumptions in Proposition 3.6-(ii) are satisfied in most situations of practical interest, because either the insurer naturally sets a positive safety loading, or no loss (or an arbitrarily small one) occurs with positive probability, or both. On the other hand, since any optimal solution to Problem (3.11) also solves Problem (2.1), Proposition 3.6-(i) establishes the existence of optimal solutions to the latter.131313It is difficult to prove the existence of solutions to Problem (2.1) directly because, under the principle of indemnity alone, its feasible set is not compact. Moreover, the argument proving Proposition 3.1 in Chi and Wei (2020) can be used to show that Proposition 3.6-(ii) holds true for Problem (2.1) as well. Finally, now that we have the existence and uniqueness of the optimal solutions for both Problems (3.11) and (2.1), we conclude that these two problems are indeed equivalent under the assumptions of Proposition 3.6-(ii).

While the analysis of Problem (2.1) is simplified to Problem (3.10), it remains challenging to solve this problem because λm\lambda^{*}_{m} and βm\beta^{*}_{m} are implicit functions of mm. Before attacking this problem, we introduce a useful result that provides a general qualitative structure for the optimal indemnity function in Problem (3.11) or, equivalently, Problem (2.1).

Proposition 3.7.

Under Assumption 3.1, if II^{*} is a solution to Problem (3.11), then there exists β>0\beta^{*}>0 such that

I(x)={1,ΦI(x)>0,cI(x),ΦI(x)=0,0,ΦI(x)<0,{I^{*}}^{\prime}(x)=\left\{\begin{array}[]{ll}1,&\Phi_{I^{*}}(x)>0,\\ c_{I^{*}}(x),&\Phi_{I^{*}}(x)=0,\\ 0,&\Phi_{I^{*}}(x)<0,\end{array}\right. (3.13)

for some function cIc_{I^{*}} taking values in [0,1][0,1], where

ΦI(x):=𝔼[U(WI(X))2βI(X)|X>x]((1+ρ)𝔼[U(WI(X))]2β𝔼[I(X)]),x[0,)\Phi_{I}(x):={\mathbb{E}}\big{[}U^{\prime}(W_{I}(X))-2\beta^{*}I(X)|X>x\big{]}-\big{(}(1+\rho){\mathbb{E}}\big{[}U^{\prime}(W_{I}(X))\big{]}-2\beta^{*}{\mathbb{E}}[I(X)]\big{)},\;x\in[0,\mathcal{M}) (3.14)

for I𝒞I\in\mathcal{IC}.

Note that (3.13) does not entail an explicit expression of I{I^{*}}^{\prime} because its right hand side also depends on II^{*} as well as on an unknown parameter β\beta^{*}. While deriving the optimal solution II^{*} directly from (3.13) seems challenging, the equation reveals the important property that I{I^{*}}^{\prime} must take a value of either 0 or 1, except at point(s) xx where ΦI(x)=0\Phi_{I^{*}}(x)=0.141414From the control theory perspective, (3.13) corresponds to an optimal control problem in which I{I}^{\prime} is taken as the control variable. Moreover, the optimal control turns out to be of the so-called “bang-bang” type, whose values depend on the sign of the discriminant function ΦI\Phi_{I}. This type of optimal control problem arises when the Hamiltonian depends linearly on control and the control is constrained between an upper bound and a lower bound. It is usually hard to solve for optimal control when the discriminant function is complex, which is the case here. This property will in turn help us to decide whether the optimal contract is of the form (xdL)+,xKU(x-d_{L})_{+},\ x\wedge K_{U}, or Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}}.

The following theorem presents a complete solution to Problem (2.1).

Theorem 3.8.

Suppose Assumption 3.1 holds and the c.d.f. FXF_{X} is strictly increasing on (0,)(0,\mathcal{M}). We have the following conclusions:

  • (i)

    If ρ=0\rho=0, then the optimal indemnity function is II^{*}, where I(x)I^{*}(x) solves the following equation in yy for all x(0,]x\in(0,\mathcal{M}]:

    U(w0x+ym)2βyU(w0m)=0,y(0,x),U^{\prime}(w_{0}-x+y-m^{*})-2\beta^{*}y-U^{\prime}(w_{0}-m^{*})=0,\,\;y\in(0,x), (3.15)

    with the parameters m(mL,mU)m^{*}\in(m_{L},m_{U}) and β>0\beta^{*}>0 determined by

    𝔼[I(X)]=mandvar[I(X)]=ν.{\mathbb{E}}[I^{*}(X)]=m^{*}\quad\text{and}\quad var[I^{*}(X)]=\nu. (3.16)
  • (ii)

    If ρ>0\rho>0, then the optimal indemnity function is

    I(x)={0,0xd~,f(x),d~<x,I^{*}(x)=\left\{\begin{array}[]{ll}0,&0\leqslant x\leqslant\tilde{d},\\ f^{*}(x),&\tilde{d}<x\leqslant\mathcal{M},\end{array}\right. (3.17)

    where f(x)f^{*}(x) satisfies f(d~)=0f^{*}(\tilde{d})=0 and solves the following equation in yy:

    U(w0(1+ρ)mx+y)U(w0(1+ρ)md~)\displaystyle U^{\prime}(w_{0}-(1+\rho)m^{*}-x+y)-U^{\prime}(w_{0}-(1+\rho)m^{*}-\tilde{d}) (3.18)
    =ymρ(U(w0(1+ρ)md~)(1+ρ)𝔼[U(w0(1+ρ)mXd~)]),y(0,x),\displaystyle\quad=\frac{y}{m^{*}\rho}\left(U^{\prime}(w_{0}-(1+\rho)m^{*}-\tilde{d})-(1+\rho){\mathbb{E}}[U^{\prime}(w_{0}-(1+\rho)m^{*}-X\wedge\tilde{d})]\right),\,\;y\in(0,x),

    and d~(VaR11+ρ(X),)\tilde{d}\in(VaR_{\frac{1}{1+\rho}}(X),\mathcal{M}) and m(mL,mU)m^{*}\in(m_{L},m_{U}) are determined by (3.16).

Theorem 3.8 provides a complete solution to Problem (2.1). It indicates that the optimal contract can be neither a pure deductible contract of the form (xdL)+(x-d_{L})_{+} nor a pure loss-capped one of the form xKUx\wedge K_{U}; it can only be of the form Iλm,βmI_{\lambda^{*}_{m},\beta^{*}_{m}} in (3.5) (rather than (3.6)). The optimal policies can be computed by solving a system of three algebraic equations, so the result is semi-analytic.
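To make Theorem 3.8-(i) concrete, the system (3.15)–(3.16) can be solved by nested bisection. The sketch below is purely illustrative: it assumes an exponential utility with marginal utility U'(c) = exp(-a*c), fair pricing (rho = 0), a discrete uniform loss on {1, ..., 10}, and a binding variance bound nu = 4; none of these numbers come from the paper.

```python
import math

# Illustrative assumptions: exponential utility with U'(c) = exp(-a*c),
# fair pricing (rho = 0), discrete uniform loss X on {1, ..., 10}.
a, w0, nu = 0.1, 20.0, 4.0
xs = list(range(1, 11))
p = 1.0 / len(xs)
EX = sum(x * p for x in xs)                      # E[X] = 5.5; var[X] = 8.25 > nu

def indemnity(x, m, beta, tol=1e-11):
    # Solve (3.15) for y: the left-hand side is strictly decreasing in y,
    # positive at y = 0 and negative at y = x, so bisection applies.
    rhs0 = math.exp(-a * (w0 - m))               # U'(w0 - m)
    g = lambda y: math.exp(-a * (w0 - x + y - m)) - 2 * beta * y - rhs0
    lo, hi = 0.0, x
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def solve_m(beta, tol=1e-9):
    # First equation of (3.16): bisect on m, since E[I(X)] - m changes
    # sign between m = 0 and m = E[X].
    lo, hi = 0.0, EX
    while hi - lo > tol:
        m = 0.5 * (lo + hi)
        excess = sum(indemnity(x, m, beta) * p for x in xs) - m
        lo, hi = (m, hi) if excess > 0 else (lo, m)
    return 0.5 * (lo + hi)

def indemnity_variance(beta):
    m = solve_m(beta)
    return sum((indemnity(x, m, beta) - m) ** 2 * p for x in xs), m

# Second equation of (3.16): bisect on beta; the variance of I*(X)
# decreases from var[X] (beta -> 0) toward 0 (beta -> infinity).
lo, hi = 1e-6, 5.0
for _ in range(50):
    beta = 0.5 * (lo + hi)
    v, m = indemnity_variance(beta)
    lo, hi = (lo, beta) if v < nu else (beta, hi)
beta_star = 0.5 * (lo + hi)
v_star, m_star = indemnity_variance(beta_star)
I_star = {x: indemnity(x, m_star, beta_star) for x in xs}

# Genuine coinsurance without a deductible, as the theorem asserts for rho = 0
assert all(0.0 < I_star[x] < x for x in xs)
assert abs(v_star - nu) < 1e-6
print(f"m* = {m_star:.4f}, beta* = {beta_star:.5f}, var[I*(X)] = {v_star:.4f}")
```

On this toy example one can also observe numerically that the ratio of indemnity to loss increases with the loss, the feature stated in Corollary 3.9 (an exponential utility is prudent).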

Actuarially, Theorem 3.8 reveals how the variance bound impacts the contract. When the bound ν\nu is sufficiently low so that it is binding (hence the model does not degenerate into the classical model of Arrow (1963)), the optimal policy is always genuine coinsurance if there is no safety loading. Here, by "genuine" we mean the strict inequalities 0<I(x)<x0<I^{*}(x)<x for all x(0,]x\in(0,\mathcal{M}], namely both the insurer and the insured pay positive portions of the loss incurred. When the safety loading coefficient is positive, the optimal contract demands genuine coinsurance above a positive deductible. Thus, the variance bound replaces the full-coverage portion of Arrow's contract with coinsurance. Our contracts are qualitatively similar to those of Raviv (1979), in which a utility function takes the place of the variance bound; however, ours are quantitatively different from Raviv (1979)'s.

On the other hand, the deductible d~\tilde{d} is positive if and only if the safety loading coefficient is positive. So the existence of the deductible is completely determined by the loading coefficient in the insurance premium. This result is consistent with Mossin’s Theorem (Mossin 1968).

Corollary 3.9.

Under the assumptions of Theorem 3.8, if the insured is prudent, then the ratio of the optimal indemnity to the loss increases as the loss increases.

So, with a prudent insured, the insurer pays more not only absolutely but also relatively as the loss increases. Vajda (1962) restricts his study of a variance contracting problem to policies with this feature of the insurer covering proportionally more for larger losses. Corollary 3.9 uncovers this feature ex post in our optimal policies, provided that the insured is prudent.

4 Comparative Statics

Thanks to the semi-analytic results derived in the previous section, we are able to analyze the impacts of the insured’s initial wealth and the variance bound on the insurance demand.

We make the following assumptions for our comparative statics analysis:

Assumption 4.1.
  • (i)

    FXF_{X} is strictly increasing on (0,)(0,\mathcal{M}).

  • (ii)

    The insurance is fairly priced, i.e., ρ=0\rho=0.

Assumption 4.1-(i) is standard in the literature and accommodates most distributions used by actuaries, such as the exponential, lognormal, gamma, and Pareto distributions. Assumption 4.1-(ii) is not necessarily plausible in practice, but it is meaningful in theory, as it describes a state of competitive equilibrium in which insurers break even and insurance policies are actuarially fair for representative insureds (see e.g., Rothschild and Stiglitz, 1976; Viscusi, 1979). It is important to carry out comparative statics analyses in such a "fair" state in order to rule out any impact emanating from an unfair price. Such an assumption is indeed often imposed when conducting comparative statics in the insurance economics literature. For example, the comparative statics results of Ehrlich and Becker (1972) and Viscusi (1979) deal exclusively with actuarially fair situations. Many recent studies, such as Eeckhoudt et al. (2003), Huang and Tzeng (2006) and Teh (2017), also impose this assumption for their comparative statics analyses.

Finally, we will assume ν<var[X]\nu<var[X] throughout this section, as otherwise the variance constraint is redundant and the optimal solution is trivially full insurance.

4.1 Impact of the insured’s initial wealth

In this subsection we examine the impact of the insured’s initial wealth on insurance demand. We first recall the notion of one function up-crossing another. A function g1g_{1} is said to up-cross a function g2g_{2}, both defined on {\mathbb{R}}, if there exists z0z_{0}\in{\mathbb{R}} such that

{g1(x)g2(x),x<z0,g1(x)g2(x),xz0.\left\{\begin{array}[]{ll}g_{1}(x)\leqslant g_{2}(x),&\ x<z_{0},\\ g_{1}(x)\geqslant g_{2}(x),&\ x\geqslant z_{0}.\end{array}\right.

Moreover, g1g_{1} is said to up-cross g2g_{2} twice if there exist z0<z1z_{0}<z_{1} such that

{g1(x)g2(x),x<z0,g1(x)g2(x),z0x<z1,g1(x)g2(x),xz1.\left\{\begin{array}[]{ll}g_{1}(x)\leqslant g_{2}(x),&\ x<z_{0},\\ g_{1}(x)\geqslant g_{2}(x),&\ z_{0}\leqslant x<z_{1},\\ g_{1}(x)\leqslant g_{2}(x),&\ x\geqslant z_{1}.\end{array}\right.
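On a discretized domain, these crossing patterns can be checked mechanically. The helper below (a hypothetical utility, not part of the paper) condenses the sign pattern of g1 - g2 along a grid: the pattern [-1, 1] corresponds to a single up-crossing and [-1, 1, -1] to the two-crossing pattern just defined.

```python
def sign_pattern(g1, g2, grid):
    """Condensed sign pattern of g1 - g2 along the grid (ties skipped).

    [-1, 1] means g1 up-crosses g2 once; [-1, 1, -1] matches the
    'up-crosses twice' pattern defined above."""
    signs = []
    for x in grid:
        d = g1(x) - g2(x)
        if d != 0:
            s = 1 if d > 0 else -1
            if not signs or signs[-1] != s:
                signs.append(s)
    return signs

grid = [i / 10 for i in range(31)]                  # 0.0, 0.1, ..., 3.0
# x - 1 up-crosses the zero function once, at z0 = 1
assert sign_pattern(lambda x: x - 1, lambda x: 0.0, grid) == [-1, 1]
# -(x - 1)(x - 2) up-crosses the zero function twice, at z0 = 1 and z1 = 2
assert sign_pattern(lambda x: -(x - 1) * (x - 2), lambda x: 0.0, grid) == [-1, 1, -1]
```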

Consider two initial wealth levels w1<w2w_{1}<w_{2} and denote the corresponding optimal contracts by I1I_{1}^{*} and I2I_{2}^{*} and the associated parameters by β1\beta^{*}_{1} and β2\beta_{2}^{*}, respectively, which are determined by Theorem 3.8. Recall that ρ=0\rho=0; so the insurer’s risk exposure functions are

eIi(x)=Ii(x)𝔼[Ii(X)]=Ii(x)mi,i=1,2,e_{I_{i}^{*}}(x)=I_{i}^{*}(x)-{\mathbb{E}}[I_{i}^{*}(X)]=I_{i}^{*}(x)-m_{i}^{*},\;\;i=1,2, (4.1)

where mi:=𝔼[Ii(X)]m_{i}^{*}:={\mathbb{E}}[I_{i}^{*}(X)]. Taking expectations on (3.15) yields

U(wimi)=𝔼[U(wiX+eIi(X))]2βimi,U^{\prime}(w_{i}-m_{i}^{*})={\mathbb{E}}[U^{\prime}(w_{i}-X+e_{I_{i}^{*}}(X))]-2\beta_{i}^{*}m_{i}^{*},

which in turn implies, for i=1,2i=1,2,

U(wix+eIi(x))2βieIi(x)𝔼[U(wiX+eIi(X))]=0,\displaystyle U^{\prime}(w_{i}-x+e_{I_{i}^{*}}(x))-2\beta_{i}^{*}e_{I_{i}^{*}}(x)-{\mathbb{E}}[U^{\prime}(w_{i}-X+e_{I_{i}^{*}}(X))]=0, (4.2)
𝔼[eIi(X)]=0and𝔼[(eIi(X))2]=var[Ii(X)]=ν.\displaystyle{\mathbb{E}}[e_{I_{i}^{*}}(X)]=0\quad\text{and}\ \ \ {\mathbb{E}}[(e_{I_{i}^{*}}(X))^{2}]=var[I_{i}^{*}(X)]=\nu. (4.3)

Note that the insurer's profit with the contract IiI_{i}^{*} is 𝔼[Ii(X)]Ii(X)eIi(X){\mathbb{E}}[I_{i}^{*}(X)]-I_{i}^{*}(X)\equiv-e_{I_{i}^{*}}(X), i=1,2i=1,2. The following theorem establishes the impact of the initial wealth on the insurance contract.

Theorem 4.1.

In addition to Assumption 4.1, we assume that ν<var[X]\nu<var[X] and the insured's utility function UU exhibits strict DAP. Then, the insurer's risk exposure function with the larger initial wealth, eI2(x)e_{I_{2}^{*}}(x), up-crosses the risk exposure function with the smaller initial wealth, eI1(x)e_{I_{1}^{*}}(x), twice. Moreover, the insurer's profit, eI2(X)-e_{I_{2}^{*}}(X), has less downside risk when contracting with the wealthier insured.

Refer to caption
Figure 4.1: Comparison of two insurer’s risk exposure functions with w1<w2w_{1}<w_{2}.

Figure 4.1 illustrates graphically the first part of Theorem 4.1. The actuarial implication is that when the insured becomes wealthier, the insurer's risk exposure is lower for large or small losses and is higher for moderate losses. This can be explained intuitively as follows. Even if the insurance pricing is actuarially fair, the insureds are unable to transfer all the risk to the insurer due to the variance bound. However, the wealthier insured is more tolerant of large losses due to the DAP; hence, the insurer's risk exposure is lower for large losses when contracting with the wealthier insured. Due to the requirement that the insurer's expected risk exposure be always zero, the insurer's risk exposure with the wealthier insured must be higher for moderate losses. Now, should the insurer's risk exposure with the wealthier insured also be higher for small losses, then overall it would be strictly less spread out than that with the less wealthy insured, implying a strictly smaller variance for the former; this would contradict the fact that both risk exposures have the same variance ν\nu. Hence, the insurer's risk exposure must be lower for small losses with the wealthier insured.

The second part of the theorem, on the other hand, suggests that a variance minding insurer prefers to provide insurance to a wealthier insured due to the smaller downside risk. Such a finding may shed light on why insurers underwrite relatively more business in developed countries or, in a same country, engage more business when the economy improves.151515For example, Hofmann (2015), an industry report from the insurance company Zurich, shows that both insurance densities (premiums per capita) and insurance penetrations (premiums as a percent of GDP) of advanced economies are much higher than those of emerging economies. This report also demonstrates that insurance markets in both advanced and emerging economies experience rapid growth when the economies grow.

Corollary 4.2.

Under the assumptions of Theorem 4.1, we have the following conclusions:

  • (i)

    𝔼[I1(X)]<𝔼[I2(X)]{\mathbb{E}}[I_{1}^{*}(X)]<{\mathbb{E}}[I_{2}^{*}(X)] and β2<β1\beta_{2}^{*}<\beta_{1}^{*}.

  • (ii)

    Either I1(x)<I2(x)I^{*}_{1}(x)<I^{*}_{2}(x) x>0\forall x>0, or I1I^{*}_{1} up-crosses I2I^{*}_{2}.

Refer to caption
Figure 4.2: Comparison of two optimal indemnity functions with w1<w2w_{1}<w_{2}.

In part (ii) of this corollary, while the case I1(x)<I2(x)I^{*}_{1}(x)<I^{*}_{2}(x) x>0\forall x>0 is a special case of I1I^{*}_{1} up-crossing I2I^{*}_{2}, we state it separately to highlight its possibility. In Arrow's (1963) classical model, full insurance is optimal when insurance pricing is actuarially fair, a conclusion that is independent of the insured's wealth. Zhou et al. (2010) show that this conclusion remains intact when an exogenous upper limit is imposed on the insurer's risk exposure. Our result shows that adding a variance bound fundamentally changes the insurance demand – it makes the insured's wealth level relevant, and it changes the way in which the two parties share the risk. Specifically, Corollary 4.2 suggests that a wealthier insured with DAP would either demand more coverage across the board or retain more of the larger losses and cede more of the smaller ones (see Figure 4.2). Either way, the expected coverage is always larger for the wealthier insured. Recall that insurance is called a normal (inferior) good if wealthier people purchase more (less) insurance coverage; see Mossin (1968), Schlesinger (1981) and Gollier (2001). Millo (2016) argues that nonlife insurance is a normal good by empirically testing whether income elasticity is significantly greater than one. Armantier et al. (2018) use micro-level survey data on households' insurance coverage to conclude that insurance is a normal good, thereby providing a better understanding of the relationship between insurance demand and economic development. These studies, however, are purely empirical. 
To the best of our knowledge, ours is the first theoretical result regarding insurance as a normal good under the insurer’s variance constraint, confirming these empirical findings.161616Mossin (1968), Schlesinger (1981) and Gollier (2001) show that a wealthier insured with a DARA preference will cede less risk under unfair insurance pricing; hence, insurance is an inferior good in the corresponding economy. Their results degenerate into full insurance when the pricing is fair, and thus insurance demand is independent of the insured’s wealth. Consequently, our results do not contradict theirs.

4.2 Impact of the variance bound

In this subsection we keep the insured’s initial wealth unchanged and analyze the impact of the variance bound on her demand for insurance. Consider two variance bounds with 0<ν1<ν2<var[X]0<\nu_{1}<\nu_{2}<var[X] and denote the corresponding optimal indemnity functions by I1I_{1}^{*} and I2I_{2}^{*} and the parameters by β1\beta^{*}_{1} and β2\beta_{2}^{*}, respectively. Thus, the insurer’s risk exposures, eIi(x)=Ii(x)𝔼[Ii(X)]e_{I_{i}^{*}}(x)=I_{i}^{*}(x)-{\mathbb{E}}[I_{i}^{*}(X)], i=1,2,i=1,2, satisfy

U(w0x+eIi(x))2βieIi(x)𝔼[U(w0X+eIi(X))]=0\displaystyle U^{\prime}(w_{0}-x+e_{I_{i}^{*}}(x))-2\beta_{i}^{*}e_{I_{i}^{*}}(x)-{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{i}^{*}}(X))]=0 (4.4)

and

𝔼[eIi(X)]=0and𝔼[(eIi(X))2]=var[eIi(X)]=νi.\displaystyle{\mathbb{E}}[e_{I_{i}^{*}}(X)]=0\quad\text{and}\ \ \ {\mathbb{E}}[(e_{I_{i}^{*}}(X))^{2}]=var[e_{I_{i}^{*}}(X)]=\nu_{i}. (4.5)

The following theorem illustrates how the insurer’s risk exposure responds to the change in the variance bound.

Theorem 4.3.

Under Assumption 4.1, the insurer’s risk exposure function with the larger variance bound, eI2e_{I_{2}^{*}}, up-crosses that with the smaller variance bound, eI1e_{I_{1}^{*}}.

Under fair insurance pricing, this theorem indicates that, as the variance bound decreases, the insurer is exposed to less risk for a larger XX and to more risk for a smaller XX. This result has a rather significant implication in terms of the insurer's tail risk management. A variance constraint by its very definition does not control the tail risk directly. However, Theorem 4.3 suggests that the insurer can reduce the risk exposure for larger losses simply by tightening the variance constraint.171717It follows from Lemma A.3 that the insurer with a more relaxed variance constraint suffers more underwriting risk in the sense of convex order, i.e., eI1(X)cxeI2(X)e_{I^{*}_{1}}(X)\leqslant_{cx}e_{I^{*}_{2}}(X). This further justifies our formulation of the variance contracting model.

Corollary 4.4.

Under the assumption of Theorem 4.3, for any 0<ν1<ν2<var[X]0<\nu_{1}<\nu_{2}<var[X], we have the following conclusions:

  • (i)

    𝔼[I1(X)]<𝔼[I2(X)]{\mathbb{E}}[I^{*}_{1}(X)]<{\mathbb{E}}[I^{*}_{2}(X)] and β2<β1\beta_{2}^{*}<\beta_{1}^{*};

  • (ii)

    If the insured’s utility function satisfies U′′′0U^{\prime\prime\prime}\geqslant 0, then I1(x)<I2(x)I_{1}^{*}(x)<I_{2}^{*}(x) x>0.\forall x>0.

Refer to caption
Figure 4.3: Comparison of two optimal indemnity functions with ν1<ν2\nu_{1}<\nu_{2}.

Corollary 4.4-(i) can be easily interpreted: an insurer with a tighter variance bound offers less expected coverage. As a complement to Theorem 4.3, Corollary 4.4-(ii) establishes a direct characterization of the insured’s optimal risk transfer with regard to the change in the variance bound: A prudent insured consistently cedes more losses when the variance bound increases (see Figure 4.3). In other words, if the insurance contract is priced fairly, the insured will transfer as much risk to the insurer as the latter’s risk tolerance allows.

5 Concluding Remarks

In this paper, we have revisited the classical model of Arrow (1963) by adding a variance limit on the insurer's risk exposure. This constraint is motivated by the insurer's desire to manage underwriting risk; at the same time, it poses considerable technical challenges for solving the problem. We have developed an approach to derive optimal contracts semi-analytically, in the form of coinsurance above a deductible when the variance constraint is active. The resulting policies automatically satisfy the incentive-compatible condition, thereby eliminating potential ex post moral hazard. We have also conducted comparative statics to examine the impacts of the insured's wealth and of the variance bound on insurance demand.

This work can be extended in a couple of directions. First, we have restricted the comparative statics analysis to actuarially fair insurance. Analyzing the general unfair case calls for a different approach than the one presented here. Second, a model incorporating probability distortion (weighting) is of significant interest, both theoretically and practically. This is because probability distortion, a phenomenon well documented in psychology and behavioral economics, is related to tail events, about which both insurers and insureds have great concerns.



Appendices

Appendix A Stochastic Orders

Since the notion of stochastic orders plays an important role in this paper, we present, in this appendix, some useful results in this regard.

A random variable YY is said to be greater than a random variable ZZ in the sense of stop-loss order, denoted as ZslYZ\leqslant_{sl}Y, if

𝔼[(Zd)+]𝔼[(Yd)+]d,{\mathbb{E}}[(Z-d)_{+}]\leqslant{\mathbb{E}}[(Y-d)_{+}]\;\;\forall d\in\mathbb{R},

provided that the expectations exist. It follows readily that YY is greater than ZZ in convex order (i.e., ZcxYZ\leqslant_{cx}Y), if 𝔼[Y]=𝔼[Z]{\mathbb{E}}[Y]={\mathbb{E}}[Z] and ZslYZ\leqslant_{sl}Y.

A useful way to verify the stop-loss order is the well–known Karlin–Novikoff criterion (Karlin and Novikoff 1963).

Lemma A.1.

Suppose 𝔼[Z]𝔼[Y]<{\mathbb{E}}[Z]\leqslant{\mathbb{E}}[Y]<\infty. If FZF_{Z} up-crosses FYF_{Y}, then ZslYZ\leqslant_{sl}Y.

If ZslYZ\leqslant_{sl}Y, then 𝔼[g(Z)]𝔼[g(Y)]{\mathbb{E}}[g(Z)]\leqslant{\mathbb{E}}[g(Y)] holds for all increasing convex functions gg, provided that the expectations exist. Based on the Karlin–Novikoff criterion, Gollier and Schlesinger (1996) obtain the following lemma.

Lemma A.2.

For any hh\in\mathfrak{C}, we have Xdcxh(X)X\wedge d\leqslant_{cx}h(X), where d[0,]d\in[0,\mathcal{M}] satisfies 𝔼[Xd]=𝔼[h(X)]{\mathbb{E}}[X\wedge d]={\mathbb{E}}[h(X)].
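Lemma A.2 can be verified numerically in a simple case. The sketch below assumes a discrete uniform loss and the proportional indemnity h(x) = x/2 (which satisfies the principle of indemnity); it first solves for the cap d that matches the means and then checks the stop-loss dominance that, together with equal means, characterizes the convex order. All numbers are illustrative.

```python
# Illustrative assumptions: discrete uniform loss on {1, ..., 10} and the
# proportional indemnity h(x) = x/2, which satisfies 0 <= h(x) <= x.
xs = list(range(1, 11))
p = 1.0 / len(xs)
h = lambda x: 0.5 * x

def E(f):                                    # expectation under the uniform law
    return sum(f(x) * p for x in xs)

mean_h = E(h)                                # 2.75

# Solve E[X ^ d] = E[h(X)] for the cap d by bisection.
lo, hi = 0.0, 10.0
for _ in range(80):
    d = 0.5 * (lo + hi)
    lo, hi = (d, hi) if E(lambda x: min(x, d)) < mean_h else (lo, d)

# Stop-loss dominance: E[(X ^ d - t)+] <= E[(h(X) - t)+] for every t, which
# together with the equal means gives X ^ d <=_cx h(X).
for t in (0.1 * k for k in range(101)):
    assert E(lambda x: max(min(x, d) - t, 0.0)) <= E(lambda x: max(h(x) - t, 0.0)) + 1e-9
print(f"d = {d:.4f}")
```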

The following result with respect to convex order is from Lemma 3 of Ohlin (1969).

Lemma A.3.

Let YY be a random variable and hi,i=1,2,h_{i},\;i=1,2, be two increasing functions with 𝔼[h1(Y)]=𝔼[h2(Y)]{\mathbb{E}}[h_{1}(Y)]={\mathbb{E}}[h_{2}(Y)]. If h1h_{1} up-crosses h2h_{2}, then h2(Y)cxh1(Y)h_{2}(Y)\leqslant_{cx}h_{1}(Y).
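Ohlin's lemma can likewise be illustrated numerically. Under an assumed discrete uniform law, a deductible-type transform up-crosses a proportional transform with the same mean, so the latter precedes the former in convex order; testing the convex function g(z) = z² then confirms the ordering of second moments. The setup is illustrative only.

```python
# Illustrative check of Lemma A.3 (Ohlin): Y uniform on {1, ..., 10}.
ys_vals = list(range(1, 11))
q = 1.0 / len(ys_vals)
d = 43.0 / 14.0                          # chosen so E[(Y - d)+] = E[Y/2] = 2.75
h1 = lambda y: max(y - d, 0.0)           # deductible-type transform
h2 = lambda y: 0.5 * y                   # proportional transform
E = lambda f: sum(f(y) * q for y in ys_vals)

assert abs(E(h1) - E(h2)) < 1e-12        # equal means, as the lemma requires
# h1 up-crosses h2: the sign of h1 - h2 switches from - to + exactly once
signs = [h1(y) - h2(y) >= 0 for y in ys_vals]
assert signs == sorted(signs)            # False, ..., False, True, ..., True
# Conclusion of the lemma: h2(Y) <=_cx h1(Y); test it on the convex
# function g(z) = z**2 (so var[h2(Y)] <= var[h1(Y)] as well)
assert E(lambda y: h2(y) ** 2) <= E(lambda y: h1(y) ** 2)
print(E(lambda y: h1(y) ** 2), E(lambda y: h2(y) ** 2))
```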

Appendix B Other Useful Lemmas

This appendix presents some other technical results that are useful in connection with this paper.

It is easy to verify that any sequence of indemnity functions in 𝒞\mathcal{IC} is uniformly bounded and equicontinuous over [0,][0,\mathcal{M}]. Hence, the Arzelà–Ascoli theorem implies

Lemma B.1.

The set 𝒞\mathcal{IC} is compact under the metric d(I1,I2)=maxt[0,]|I1(t)I2(t)|d(I_{1},I_{2})=\max_{t\in[0,\mathcal{M}]}|I_{1}(t)-I_{2}(t)|, I1,I2𝒞I_{1},I_{2}\in\mathcal{IC}.

For the following lemma, one can refer to Komiya (1988) for a proof.

Lemma B.2.

(Sion’s Minimax Theorem)  Let Y be a compact convex subset of a linear topological space and Z a convex subset of a linear topological space. If Γ\Gamma is a real-valued function on Y×ZY\times Z such that Γ(y,)\Gamma(y,\cdot) is continuous and concave on ZZ for any yYy\in Y and Γ(,z)\Gamma(\cdot,z) is continuous and convex on YY for any zZz\in Z, then

minyYmaxzZΓ(y,z)=maxzZminyYΓ(y,z).\min\limits_{{y\in Y}}\ \max\limits_{{z\in Z}}\Gamma(y,z)=\max\limits_{{z\in Z}}\ \min\limits_{{y\in Y}}\Gamma(y,z).

The following lemmas are needed in the comparative analysis.

Lemma B.3.

If a non-negative increasing function h1h_{1} up-crosses a non-negative increasing function h2h_{2} with 𝔼[(h1(X))2]=𝔼[(h2(X))2]{\mathbb{E}}[(h_{1}(X))^{2}]={\mathbb{E}}[(h_{2}(X))^{2}], then either h1(X)h_{1}(X) and h2(X)h_{2}(X) have the same distribution or 𝔼[h1(X)]<𝔼[h2(X)]{\mathbb{E}}[h_{1}(X)]<{\mathbb{E}}[h_{2}(X)].

Proof.

If 𝔼[h2(X)]𝔼[h1(X)]{\mathbb{E}}[h_{2}(X)]\leqslant{\mathbb{E}}[h_{1}(X)], then Lemma A.1 implies h2(X)slh1(X)h_{2}(X)\leqslant_{sl}h_{1}(X). Moreover, we have

𝔼[(hi(X))2]=20𝔼[(hi(X)t)+]dt.{\mathbb{E}}[(h_{i}(X))^{2}]=2\int_{0}^{\infty}{\mathbb{E}}[(h_{i}(X)-t)_{+}]{\mathrm{d}}t.

Since h2(X)slh1(X)h_{2}(X)\leqslant_{sl}h_{1}(X) gives 𝔼[(h2(X)t)+]𝔼[(h1(X)t)+]{\mathbb{E}}[(h_{2}(X)-t)_{+}]\leqslant{\mathbb{E}}[(h_{1}(X)-t)_{+}] for every t0t\geqslant 0, while the assumption 𝔼[(h1(X))2]=𝔼[(h2(X))2]{\mathbb{E}}[(h_{1}(X))^{2}]={\mathbb{E}}[(h_{2}(X))^{2}] forces the two integrals above to coincide, the continuity of the stop-loss transforms in tt yields 𝔼[(h1(X)t)+]=𝔼[(h2(X)t)+]{\mathbb{E}}[(h_{1}(X)-t)_{+}]={\mathbb{E}}[(h_{2}(X)-t)_{+}] for any t0t\geqslant 0. It then follows from the equation 𝔼[(hi(X)t)+]=t(1Fhi(X)(y))dy{\mathbb{E}}[(h_{i}(X)-t)_{+}]=\int_{t}^{\infty}(1-F_{h_{i}(X)}(y)){\mathrm{d}}y that h1(X)h_{1}(X) and h2(X)h_{2}(X) have the same distribution. ∎

Lemma B.4.

Under the assumptions of Theorem 4.1, it is impossible that eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) have the same distribution, and it is also impossible that either eI1e_{I_{1}^{*}} up-crosses eI2e_{I_{2}^{*}} or eI2e_{I_{2}^{*}} up-crosses eI1e_{I_{1}^{*}}, where eIie_{I_{i}^{*}} is given in (4.1).

Proof.

First of all, we show that eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) cannot have the same distribution. Define

ϕ(z):=U′′(w1z)U′′(w2z),  0z<w1<w2.\displaystyle\phi(z):=\frac{-U^{\prime\prime}(w_{1}-z)}{-U^{\prime\prime}(w_{2}-z)},\,\ 0\leqslant z<w_{1}<w_{2}. (B.1)

A direct calculation based on the assumption of strict DAP shows that

ϕ(z)=(𝒫(w1z)𝒫(w2z))ϕ(z)>0,\phi^{\prime}(z)=(\mathcal{P}(w_{1}-z)-\mathcal{P}(w_{2}-z))\phi(z)>0,

where 𝒫\mathcal{P} is defined in (2.2). As a result, ϕ\phi is a strictly increasing function.

We now prove the result by contradiction. Assume that eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) are equal in distribution. Noting that eIie_{I_{i}^{*}} is increasing and Lipschitz-continuous and that FX(x)F_{X}(x) is strictly increasing, we have

eI1(x)=eI2(x),x[0,],e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x),\,\forall x\in[0,\mathcal{M}],

which in turn implies eI1(x)=eI2(x)e_{I_{1}^{*}}^{\prime}(x)=e_{I_{2}^{*}}^{\prime}(x). It follows from (4.2) that eIi(x)=11+2βi/(U′′(wix+eIi(x)))e_{I_{i}^{*}}^{\prime}(x)=\frac{1}{1+2\beta_{i}^{*}/(-U^{\prime\prime}(w_{i}-x+e_{I_{i}^{*}}(x)))} for all x[0,]x\in[0,\mathcal{M}]; hence,

2β1U′′(w1x+eI1(x))=2β2U′′(w2x+eI2(x)).\frac{2\beta_{1}^{*}}{-U^{\prime\prime}(w_{1}-x+e_{I_{1}^{*}}(x))}=\frac{2\beta_{2}^{*}}{-U^{\prime\prime}(w_{2}-x+e_{I_{2}^{*}}(x))}.

Because eI1(x)=eI2(x)e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x) for all x[0,]x\in[0,\mathcal{M}], we obtain

ϕ(xeI1(x))=U′′(w1x+eI1(x))U′′(w2x+eI1(x))=β1β2,x[0,],\phi(x-e_{I_{1}^{*}}(x))=\frac{-U^{\prime\prime}(w_{1}-x+e_{I_{1}^{*}}(x))}{-U^{\prime\prime}(w_{2}-x+e_{I_{1}^{*}}(x))}=\frac{\beta_{1}^{*}}{\beta_{2}^{*}},\,\forall x\in[0,\mathcal{M}],

which contradicts the fact that xeI1(x)xI1(x)+𝔼[I1(X)]x-e_{I_{1}^{*}}(x)\equiv x-I_{1}^{*}(x)+{\mathbb{E}}[I_{1}^{*}(X)] is strictly increasing in x[0,]x\in[0,\mathcal{M}] and that ϕ\phi is a strictly increasing function.

Next we show that it is impossible that eI1e_{I_{1}^{*}} up-crosses eI2e_{I_{2}^{*}}. Again we prove the result by contradiction. Since eI1e_{I_{1}^{*}} is not always non-negative, we introduce the increasing function

f~i(x):=eIi(x)+m~,i=1,2,\tilde{f}_{i}^{*}(x):=e_{I_{i}^{*}}(x)+\widetilde{m},\;\;i=1,2,

where m~:=max{𝔼[I1(X)],𝔼[I2(X)]}\widetilde{m}:=\max\left\{{\mathbb{E}}[I_{1}^{*}(X)],{\mathbb{E}}[I_{2}^{*}(X)]\right\}. It then follows that f~1\tilde{f}_{1}^{*} up-crosses f~2\tilde{f}_{2}^{*}, and

f~i(x)0,𝔼[f~i(X)]=m~,𝔼[(f~i(X))2]=𝔼[(eIi(X))2]+m~2=ν+m~2,i=1,2,\tilde{f}_{i}^{*}(x)\geqslant 0,\quad{\mathbb{E}}[\tilde{f}_{i}^{*}(X)]=\widetilde{m},\quad{\mathbb{E}}[(\tilde{f}_{i}^{*}(X))^{2}]={\mathbb{E}}[(e_{I_{i}^{*}}(X))^{2}]+\widetilde{m}^{2}=\nu+\widetilde{m}^{2},\;\;i=1,2,

where the second equality follows from the fact that 𝔼[eIi(X)]=0{\mathbb{E}}[e_{I_{i}^{*}}(X)]=0. Since 𝔼[f~1(X)]=𝔼[f~2(X)]=m~{\mathbb{E}}[\tilde{f}_{1}^{*}(X)]={\mathbb{E}}[\tilde{f}_{2}^{*}(X)]=\widetilde{m}, the second alternative in Lemma B.3 is ruled out, and hence f~1(X)\tilde{f}_{1}^{*}(X) and f~2(X)\tilde{f}_{2}^{*}(X) have the same distribution; so do eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X). As shown above, eI1(X)e_{I_{1}^{*}}(X) and eI2(X)e_{I_{2}^{*}}(X) cannot have the same distribution, and therefore eI1e_{I_{1}^{*}} cannot up-cross eI2e_{I_{2}^{*}}. A similar analysis shows that it is impossible that eI2e_{I_{2}^{*}} up-crosses eI1e_{I_{1}^{*}}. The proof is thus complete. ∎

Appendix C Proofs

Proof of Lemma 3.2:

For any II\in\mathfrak{C} with 𝔼[I(X)]mL{\mathbb{E}}[I(X)]\leqslant m_{L}, it follows from Lemma A.2 that

XdIcxRI(X),X\wedge d_{I}\leqslant_{cx}R_{I}(X),

where dIdLd_{I}\geqslant d_{L} is determined by 𝔼[(XdI)+]=𝔼[I(X)]{\mathbb{E}}[(X-d_{I})_{+}]={\mathbb{E}}[I(X)] (or, equivalently, 𝔼[XdI]=𝔼[RI(X)]{\mathbb{E}}[X\wedge d_{I}]={\mathbb{E}}[R_{I}(X)]). Thus, we have 𝔼[U(WI(X))]𝔼[U(W(xdI)+(X))]{\mathbb{E}}[U(W_{I}(X))]\leqslant{\mathbb{E}}[U(W_{(x-d_{I})_{+}}(X))]. Furthermore, according to the proof of Theorem 4.2 in Chi (2019), 𝔼[U(W(xd)+(X))]{\mathbb{E}}[U(W_{(x-d)_{+}}(X))] is a decreasing function of dd over [d,)[d^{*},\mathcal{M}), where dd^{*} is defined in (3.2). Recalling that dLdd_{L}\geqslant d^{*}, we conclude that II is no better than (xdL)+(x-d_{L})_{+}.

Proof of Lemma 3.3:

We prove by contradiction. Assume 𝔼[I(X)]>mU{\mathbb{E}}[I(X)]>m_{U} for some indemnity function II\in\mathfrak{C} satisfying var[I(X)]νvar[I(X)]\leqslant\nu. Lemma A.2 implies that there exists K>KUK>K_{U} such that

XKcxI(X).X\wedge K\leqslant_{cx}I(X).

Noting that var[XK]var[X\wedge K] is strictly increasing and continuous in K[KU,]K\in[K_{U},\mathcal{M}], we obtain

ν=var[XKU]<var[XK]var[I(X)]ν,\nu=var[X\wedge K_{U}]<var[X\wedge K]\leqslant var[I(X)]\leqslant\nu,

leading to a contradiction.

For any $I\in\mathfrak{C}$ satisfying $var[I(X)]\leqslant\nu$ and $\mathbb{E}[I(X)]=m_{U}$, it follows from Lemma A.2 that

\[ X\wedge K_{U}\leqslant_{cx}I(X), \]

which in turn implies

\[ \nu=var[X\wedge K_{U}]\leqslant var[I(X)]\leqslant\nu. \]

Because

\[ var[Z]=2\int_{-\infty}^{\infty}\big\{\mathbb{E}[(Z-t)_{+}]-(\mathbb{E}[Z]-t)_{+}\big\}\,\mathrm{d}t \]

for any random variable $Z$ with a finite second moment, we deduce $\mathbb{E}[(X\wedge K_{U}-t)_{+}]=\mathbb{E}[(I(X)-t)_{+}]$ for almost every $t$. Therefore, $X\wedge K_{U}$ and $I(X)$ are equally distributed, which implies $\mathbb{P}(I(X)\leqslant K_{U})=1$. Moreover, it follows from $I\in\mathfrak{C}$ that $\mathbb{P}(I(X)\leqslant X\wedge K_{U})=1$. Since $\mathbb{E}[I(X)]=\mathbb{E}[X\wedge K_{U}]$, we obtain that $I(X)=X\wedge K_{U}$ almost surely. The proof is thus complete.
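The integral representation of the variance used in the last step of the proof can be checked numerically. The sketch below, with an exponential loss and parameters chosen only for illustration, compares the two sides via Monte Carlo and trapezoidal quadrature.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.exponential(scale=2.0, size=100_000)   # sample of Z

# Right-hand side: 2 * integral over t of E[(Z-t)_+] - (E[Z]-t)_+
ts = np.linspace(-5.0, 40.0, 800)
integrand = np.array(
    [np.mean(np.maximum(z - t, 0.0)) - max(z.mean() - t, 0.0) for t in ts]
)
rhs = 2.0 * float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(ts)))

lhs = z.var()
assert abs(lhs - rhs) < 0.1   # agreement up to quadrature/truncation error
```

The identity holds exactly for the empirical distribution of the sample, so the residual is only quadrature and tail-truncation error.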

Proof of Proposition 3.4:

Because $m_{L}=m_{U}$, it follows from Lemma 3.3 that $X\wedge K_{U}=(X-d_{L})_{+}$ almost surely. Since $d_{L}>0$, this forces $X$ to follow a two-point (Bernoulli-type) distribution with values $0$ and $d_{L}+K_{U}$. Furthermore, Lemmas 3.2 and 3.3 imply that any admissible insurance policy is no better than $(x-d_{L})_{+}$. Therefore, the optimal indemnity must equal $K_{U}$ at the point $d_{L}+K_{U}$.

Proof of Proposition 3.5:

Introducing two Lagrange multipliers $\lambda\in\mathbb{R}$ and $\beta\geqslant 0$, we consider the following maximization problem:

\[ \max_{I\in\mathfrak{C}}U_{\lambda,\beta}(I):=\mathbb{E}\Big[U(w_{0}-X+I(X)-(1+\rho)m)-\lambda(\mathbb{E}[I(X)]-m)-\beta(\mathbb{E}[I^{2}(X)]-\nu-m^{2})\Big]. \tag{C.1} \]

Fix $x\geqslant 0$. The above objective function motivates the introduction of the function

\[ \Psi(y):=U(w_{0}-x+y-(1+\rho)m)-\lambda y-\beta y^{2},\quad 0\leqslant y\leqslant x. \]

The assumption $U''<0$ implies that $\Psi$ is strictly concave with

\[ \Psi'(y)=U'(w_{0}-x+y-(1+\rho)m)-\lambda-2\beta y. \]

As a consequence, for each $x\geqslant 0$, $I_{\lambda,\beta}(x)$ defined in (3.4) is an optimal solution to

\[ \max_{0\leqslant y\leqslant x}\Psi(y). \]

This result, together with the fact that $I_{\lambda,\beta}\in\mathfrak{C}$, implies that $I_{\lambda,\beta}$ solves Problem (C.1).
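Because $\Psi$ is strictly concave, the maximizer over $[0,x]$ is just the unconstrained stationary point clipped to the feasible interval, which is how $I_{\lambda,\beta}$ can be evaluated pointwise. The sketch below illustrates this with an exponential utility $U(w)=-e^{-aw}$ and hypothetical parameter values (none taken from the paper), cross-checking the clipped root of $\Psi'$ against a brute-force grid search.

```python
import numpy as np

# Hypothetical parameters, for illustration only
a, w0, rho, m, lam, beta = 0.5, 10.0, 0.2, 1.0, 0.01, 0.3

def U_prime(w):                      # U(w) = -exp(-a*w), so U'(w) = a*exp(-a*w)
    return a * np.exp(-a * w)

def indemnity(x):
    """Maximize the strictly concave Psi(y) over [0, x]: locate the root of
    Psi'(y) by bisection (Psi' is strictly decreasing) and clip it to [0, x]."""
    lo, hi = -50.0, 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if U_prime(w0 - x + mid - (1 + rho) * m) - lam - 2 * beta * mid > 0:
            lo = mid
        else:
            hi = mid
    return min(max(0.5 * (lo + hi), 0.0), x)

# cross-check against brute-force maximization of Psi at one loss level
x = 4.0
ys = np.linspace(0.0, x, 100_001)
psi = -np.exp(-a * (w0 - x + ys - (1 + rho) * m)) - lam * ys - beta * ys ** 2
assert abs(indemnity(x) - ys[psi.argmax()]) < 1e-3
```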

Notably, if there exist $\lambda^{*}_{m}\in\mathbb{R}$ and $\beta^{*}_{m}>0$ such that

\[ \mathbb{E}[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=m\qquad\text{and}\qquad var[I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)]=\nu, \tag{C.2} \]

then $I_{\lambda^{*}_{m},\beta^{*}_{m}}$ solves Problem (3.3). We prove this by contradiction. Indeed, if there exists $I_{*}\in\mathfrak{C}_{m}$ such that $\mathbb{E}[U(W_{I_{*}}(X))]>\mathbb{E}[U(W_{I_{\lambda^{*}_{m},\beta^{*}_{m}}}(X))]$, then

\[ U_{\lambda^{*}_{m},\beta^{*}_{m}}(I_{*})>U_{\lambda^{*}_{m},\beta^{*}_{m}}(I_{\lambda^{*}_{m},\beta^{*}_{m}}), \]

which contradicts the fact that $I_{\lambda^{*}_{m},\beta^{*}_{m}}$ solves Problem (C.1) with $\lambda=\lambda^{*}_{m}$ and $\beta=\beta^{*}_{m}$. Therefore, we only need to show the existence of $\lambda^{*}_{m}$ and $\beta^{*}_{m}$.

It follows from (3.5)-(3.8) that $I_{\lambda,\beta}$ satisfies the incentive-compatible condition, i.e., $I_{\lambda,\beta}\in\mathcal{IC}$. This observation motivates us to consider the auxiliary problem

\[ \max_{I\in\mathcal{IC}_{m}}\ \mathbb{E}[U(W_{I}(X))], \tag{C.3} \]

where

\[ \mathcal{IC}_{m}:=\left\{I\in\mathcal{IC}:var[I(X)]\leqslant\nu,\ \mathbb{E}[I(X)]=m\right\}. \]

Problem (C.3) differs from Problem (3.3) in that the feasible set is $\mathcal{IC}_{m}$ instead of $\mathfrak{C}_{m}$. In what follows, we show that $\mathcal{IC}_{m}\neq\emptyset$ and that there exists a unique optimal solution $I^{*}_{m}$ to Problem (C.3) satisfying $var[I^{*}_{m}(X)]=\nu$. Indeed, for any $m\in(m_{L},m_{U})$, let $\theta\in(0,1)$ be such that $m=\theta m_{U}+(1-\theta)m_{L}$, and denote

\[ I_{\theta}(x):=\theta(x\wedge K_{U})+(1-\theta)(x-d_{L})_{+}. \]

It follows that $\mathbb{E}[I_{\theta}(X)]=m$ and

\[ \sqrt{var[I_{\theta}(X)]}\leqslant\theta\sqrt{var[X\wedge K_{U}]}+(1-\theta)\sqrt{var[(X-d_{L})_{+}]}=\sqrt{\nu}. \]

As a consequence, $I_{\theta}\in\mathcal{IC}_{m}$, and hence $\mathcal{IC}_{m}$ is nonempty. Moreover, note that there must exist $d_{m}\in[0,d_{L})$ such that $\mathbb{E}[(X-d_{m})_{+}]=m$. For any $I\in\mathcal{IC}_{m}$, define

\[ \tilde{I}_{\alpha}(x):=\alpha I(x)+(1-\alpha)(x-d_{m})_{+},\quad\alpha\in[0,1]. \]

Then we have $\mathbb{E}[\tilde{I}_{\alpha}(X)]=m$ and

\[ var[\tilde{I}_{1}(X)]=var[I(X)]\leqslant\nu=var[(X-d_{L})_{+}]\leqslant var[(X-d_{m})_{+}]=var[\tilde{I}_{0}(X)], \]

where the last inequality follows from the fact that $var[(X-d)_{+}]$ is decreasing in $d$. Because $var[\tilde{I}_{\alpha}(X)]$ is continuous in $\alpha$, there must exist $\alpha^{*}\in[0,1]$ such that $var[\tilde{I}_{\alpha^{*}}(X)]=\nu$. Lemma A.2 yields that

\[ X\wedge d_{m}\leqslant_{cx}X-I(X), \]

leading to

\[ \mathbb{E}[U(W_{\tilde{I}_{\alpha^{*}}}(X))]\geqslant\alpha^{*}\mathbb{E}[U(W_{I}(X))]+(1-\alpha^{*})\mathbb{E}[U(W_{(x-d_{m})_{+}}(X))]\geqslant\mathbb{E}[U(W_{I}(X))]. \]

This means that $I$ is no better than $\tilde{I}_{\alpha^{*}}$. Together with the Arzelà-Ascoli theorem, the above analysis implies that there exists an optimal solution to Problem (C.3) that binds the variance constraint. Finally, an argument similar to the proof of Proposition 3.1 in Chi and Wei (2020) shows that the optimal solution to Problem (C.3) must be unique.
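The square-root variance bound on the mixture $I_{\theta}$ used above is just the $L^{2}$ triangle (Minkowski) inequality. A quick numerical sanity check, with an exponential loss and two indemnity shapes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=100_000)   # illustrative loss sample
theta = 0.4

a = np.minimum(x, 2.0)          # capped coverage x ^ K
b = np.maximum(x - 0.5, 0.0)    # stop-loss coverage (x - d)_+
mix = theta * a + (1 - theta) * b

# Minkowski: sd of a convex combination <= convex combination of sds
lhs = np.sqrt(mix.var())
rhs = theta * np.sqrt(a.var()) + (1 - theta) * np.sqrt(b.var())
assert lhs <= rhs + 1e-12
```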

Define $U^{*}(\lambda,\beta):=U_{\lambda,\beta}(I_{\lambda,\beta})$. For any $\alpha\in[0,1]$ and any pairs $(\lambda_{1},\beta_{1})$ and $(\lambda_{2},\beta_{2})$, we have

\[
\begin{aligned}
U^{*}\big(\alpha\lambda_{1}+(1-\alpha)\lambda_{2},\,\alpha\beta_{1}+(1-\alpha)\beta_{2}\big)
&=\max_{I\in\mathcal{IC}}U_{\alpha\lambda_{1}+(1-\alpha)\lambda_{2},\,\alpha\beta_{1}+(1-\alpha)\beta_{2}}(I)\\
&=\max_{I\in\mathcal{IC}}\big\{\alpha U_{\lambda_{1},\beta_{1}}(I)+(1-\alpha)U_{\lambda_{2},\beta_{2}}(I)\big\}\\
&\leqslant\max_{I\in\mathcal{IC}}\big\{\alpha U_{\lambda_{1},\beta_{1}}(I)\big\}+\max_{I\in\mathcal{IC}}\big\{(1-\alpha)U_{\lambda_{2},\beta_{2}}(I)\big\}\\
&=\alpha\max_{I\in\mathcal{IC}}\big\{U_{\lambda_{1},\beta_{1}}(I)\big\}+(1-\alpha)\max_{I\in\mathcal{IC}}\big\{U_{\lambda_{2},\beta_{2}}(I)\big\}\\
&=\alpha U^{*}(\lambda_{1},\beta_{1})+(1-\alpha)U^{*}(\lambda_{2},\beta_{2}),
\end{aligned}
\]

where the second equality is due to the fact that $U_{\lambda,\beta}(I)$ is linear in $(\lambda,\beta)$ for any given $I\in\mathcal{IC}$. Thus, $U^{*}(\lambda,\beta)$ is convex in $(\lambda,\beta)$.

Furthermore, denoting by $U^{**}$ the maximal EU value of the insured's final wealth in Problem (C.3), we have $U^{**}\leqslant U(w_{0}-(1+\rho)m)<\infty$. On the other hand, for any given $I\in\mathcal{IC}$ satisfying $\mathbb{E}[I(X)]\neq m$ or $var[I(X)]>\nu$, it is easy to show that $\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U_{\lambda,\beta}(I)=-\infty$. Noting that

\[ U_{0,0}(I)=\mathbb{E}[U(w_{0}-X+I(X)-(1+\rho)m)], \]

we have

\[ \max_{I\in\mathcal{IC}}\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U_{\lambda,\beta}(I)\leqslant U^{**}. \]

Now,

\[ \max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant\max_{I\in\mathcal{IC}_{m}}U_{\lambda,\beta}(I)\geqslant\max_{I\in\mathcal{IC}_{m}}\mathbb{E}[U(w_{0}-X+I(X)-(1+\rho)m)],\quad\forall\lambda\in\mathbb{R},\ \beta\geqslant 0, \]

leading to $U^{**}\leqslant\min_{\lambda\in\mathbb{R},\beta\geqslant 0}\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)$. Since $U_{\lambda,\beta}(I)$ is continuous and strictly concave in $I$, we obtain from Lemmas B.1 and B.2 that

\[ \min_{I\in\mathcal{IC}}\max_{\lambda\in\mathbb{R},\beta\geqslant 0}-U_{\lambda,\beta}(I)=\max_{\lambda\in\mathbb{R},\beta\geqslant 0}\min_{I\in\mathcal{IC}}-U_{\lambda,\beta}(I), \]

which implies

\[ \max_{I\in\mathcal{IC}}\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U_{\lambda,\beta}(I)=\min_{\lambda\in\mathbb{R},\beta\geqslant 0}\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)=U^{**}. \]

By denoting $\lambda_{U}:=(U^{**}+1)/m$, we have

\[ U^{*}(\lambda,\beta)=\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant U_{\lambda,\beta}(0)\geqslant\lambda_{U}m=U^{**}+1,\quad\forall\lambda\geqslant\lambda_{U},\ \beta\geqslant 0. \]

Furthermore, we define $\lambda_{L}:=-(U^{**}+1)/\left(\mathbb{E}[X\wedge K_{1}]-m\right)$, where $K_{1}$ is determined by $\mathbb{E}[(X\wedge K_{1})^{2}]=\nu+m^{2}<\mathbb{E}[(X\wedge K_{U})^{2}]$. Clearly, $K_{1}\in(0,K_{U})$. If $\mathbb{E}[X\wedge K_{1}]\leqslant m$, then $var[X\wedge K_{1}]\geqslant\nu=var[X\wedge K_{U}]$, which contradicts the definition of $K_{U}$. Thus, we must have $\mathbb{E}[X\wedge K_{1}]>m$, and hence

\[ U^{*}(\lambda,\beta)=\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant U_{\lambda,\beta}(x\wedge K_{1})\geqslant-\lambda_{L}\left(\mathbb{E}[X\wedge K_{1}]-m\right)\geqslant U^{**}+1,\quad\forall\lambda\leqslant\lambda_{L},\ \beta\geqslant 0. \]

Similarly, let $\beta_{U}:=\frac{U^{**}+1}{\nu-var[X\wedge K_{2}]}$, where $K_{2}$ is determined by $\mathbb{E}[X\wedge K_{2}]=m$. Here, we have $K_{2}<K_{U}$, which in turn implies $var[X\wedge K_{2}]<\nu$ by the definition of $K_{U}$. For any $\beta\geqslant\beta_{U}$ and $\lambda\in\mathbb{R}$, we obtain

\[ U^{*}(\lambda,\beta)=\max_{I\in\mathcal{IC}}U_{\lambda,\beta}(I)\geqslant U_{\lambda,\beta}(x\wedge K_{2})\geqslant\beta_{U}(\nu-var[X\wedge K_{2}])=U^{**}+1. \]

The above analysis indicates that

\[ U^{**}=\min_{\substack{\lambda_{L}\leqslant\lambda\leqslant\lambda_{U}\\ 0\leqslant\beta\leqslant\beta_{U}}}U^{*}(\lambda,\beta). \]

Thus, it follows from the convexity of $U^{*}(\lambda,\beta)$ and Weierstrass's theorem that there exist $\lambda^{*}_{m}\in[\lambda_{L},\lambda_{U}]$ and $\beta^{*}_{m}\in[0,\beta_{U}]$ such that

\[ U^{**}=U^{*}(\lambda^{*}_{m},\beta^{*}_{m})=\min_{\lambda\in\mathbb{R},\beta\geqslant 0}U^{*}(\lambda,\beta). \]

Moreover, we have

\[ U^{*}(\lambda^{*}_{m},\beta^{*}_{m})=\max_{I\in\mathcal{IC}}U_{\lambda^{*}_{m},\beta^{*}_{m}}(I)\geqslant\max_{I\in\mathcal{IC}_{m},\,var[I(X)]=\nu}U_{\lambda^{*}_{m},\beta^{*}_{m}}(I)=U^{**}, \tag{C.4} \]

where the second equality is derived from the fact that the optimal solution to Problem (C.3) binds the variance constraint. In addition, thanks to (C.4), the unique optimal solution $I^{*}_{m}$ of Problem (C.3) must solve Problem (C.1) with $\lambda=\lambda^{*}_{m}$ and $\beta=\beta^{*}_{m}$. Note that $U_{\lambda^{*}_{m},\beta^{*}_{m}}(I)$ is strictly concave in $I$; therefore, $I^{*}_{m}(X)=I_{\lambda^{*}_{m},\beta^{*}_{m}}(X)$ almost surely. As a result, $I_{\lambda^{*}_{m},\beta^{*}_{m}}$ satisfies (C.2) and must be a solution to Problem (C.3).

Finally, we show that $\beta^{*}_{m}>0$. If instead $\beta^{*}_{m}=0$, then $I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)=(x-d_{m})_{+}$, yielding the contradiction

\[ \nu=var[(X-d_{m})_{+}]>var[(X-d_{L})_{+}]=\nu. \]

The proof is thus complete.

Proof of Proposition 3.7:

Denote

\[ L_{\beta}(I):=\mathbb{E}\big[U\big(w_{0}-X+I(X)-(1+\rho)\mathbb{E}[I(X)]\big)\big]-\beta\big(var[I(X)]-\nu\big),\quad\beta\geqslant 0,\ I\in\mathcal{IC}, \]

which is linear in $\beta$ and concave in $I$, because $U''(\cdot)<0$ and $var[I(X)]$ is convex in $I$.

We denote by $U^{***}$ the maximum EU value of the insured's final wealth in Problem (3.11). Using an argument similar to that in the proof of Proposition 3.5, we have

\[ \max_{I\in\mathcal{IC}}\min_{\beta\geqslant 0}L_{\beta}(I)\leqslant U^{***}\leqslant\min_{\beta\geqslant 0}\max_{I\in\mathcal{IC}}L_{\beta}(I), \]

which, together with Lemma B.2, implies

\[ U^{***}=\min_{\beta\geqslant 0}\max_{I\in\mathcal{IC}}L_{\beta}(I). \]

Denoting $\tilde{\beta}:=\frac{U^{***}+1}{\nu}$, we have

\[ \max_{I\in\mathcal{IC}}L_{\beta}(I)\geqslant L_{\beta}(0)\geqslant\beta\nu\geqslant U^{***}+1,\quad\forall\beta\geqslant\tilde{\beta}. \]

Furthermore, since $\max_{I\in\mathcal{IC}}L_{\beta}(I)$ is convex in $\beta$, there must exist $\beta^{*}\in[0,\tilde{\beta}]$ such that

\[ U^{***}=\max_{I\in\mathcal{IC}}L_{\beta^{*}}(I). \]

If $\beta^{*}=0$, then

\[ U^{***}=\max_{I\in\mathcal{IC}}\mathbb{E}[U(W_{I}(X))], \]

in which case the stop-loss insurance $(x-d^{*})_{+}$ is optimal, where $d^{*}$ is defined in (3.2). This, however, contradicts Assumption 3.1. So we must have $\beta^{*}>0$.

According to Lemma 3.2, Lemma 3.3 and Proposition 3.5, $I^{*}$, which solves Problem (2.1) under Assumption 3.1, satisfies $var[I^{*}(X)]=\nu$ and therefore also solves the maximization problem

\[ \max_{I\in\mathcal{IC}}L_{\beta^{*}}(I). \]

Thus, for any $I\in\mathcal{IC}$, we have

\[ \lim_{\alpha\downarrow 0}\frac{L_{\beta^{*}}\left((1-\alpha)I^{*}(x)+\alpha I(x)\right)-L_{\beta^{*}}(I^{*}(x))}{\alpha}\leqslant 0, \]

leading to

\[
\begin{aligned}
0\geqslant{}&\mathbb{E}\Big[U'(W_{I^{*}}(X))\big(I(X)-I^{*}(X)-(1+\rho)\mathbb{E}[I(X)-I^{*}(X)]\big)\Big]-2\beta^{*}\big(cov(I^{*}(X),I(X))-var[I^{*}(X)]\big)\\
={}&\int_{0}^{\infty}\Big(\mathbb{E}\big[U'(W_{I^{*}}(X))(\mathbb{I}_{\{X>t\}}-(1+\rho)\mathbb{P}\{X>t\})\big]-2\beta^{*}\big(\mathbb{E}[I^{*}(X)\mathbb{I}_{\{X>t\}}]-\mathbb{E}[I^{*}(X)]\mathbb{P}\{X>t\}\big)\Big)\big(I'(t)-{I^{*}}'(t)\big)\,\mathrm{d}t\\
={}&\int_{0}^{\mathcal{M}}\Big(\mathbb{E}\big[U'(W_{I^{*}}(X))-2\beta^{*}I^{*}(X)\,\big|\,X>t\big]-(1+\rho)\mathbb{E}\big[U'(W_{I^{*}}(X))\big]+2\beta^{*}\mathbb{E}[I^{*}(X)]\Big)\mathbb{P}\{X>t\}\big(I'(t)-{I^{*}}'(t)\big)\,\mathrm{d}t\\
={}&\int_{0}^{\mathcal{M}}\Phi_{I^{*}}(t)\,\mathbb{P}\{X>t\}\big(I'(t)-{I^{*}}'(t)\big)\,\mathrm{d}t,
\end{aligned}
\]

where $\Phi_{I^{*}}$ is defined in (3.14) and $\mathbb{I}_{A}$ is the indicator function of an event $A$. The arbitrariness of $I\in\mathcal{IC}$ and the fact that $\mathbb{P}\{X>t\}>0$ for any $t<\mathcal{M}$ yield that ${I^{*}}'$ must be of the form (3.13). The proof is complete.

Proof of Theorem 3.8:

Let $I^{*}$ be optimal for Problem (2.1). Then it follows from Proposition 3.7 that

\[ \Phi_{I^{*}}(x)=\mathbb{E}\big[\psi(X)\,|\,X>x\big], \tag{C.5} \]

where

\[ \psi(x):=U'(W_{I^{*}}(x))-2\beta^{*}I^{*}(x)-\Big((1+\rho)\mathbb{E}[U'(W_{I^{*}}(X))]-2\beta^{*}\mathbb{E}[I^{*}(X)]\Big) \tag{C.6} \]

for some $\beta^{*}>0$.

Since $F_{X}$ is assumed to be strictly increasing on $(0,\mathcal{M})$, it follows from Proposition 3.4 that $m_{L}<m_{U}$. Recall that, under Assumption 3.1, solving Problem (2.1) reduces to solving Problem (3.10). In what follows, we solve Problem (3.10) with the help of Proposition 3.7, carrying out the analysis in three cases:

  1. Case (A)

    If $I^{*}(x)=(x-d_{L})_{+}$, then $\psi$ is strictly increasing on $[0,d_{L}]$ and strictly decreasing on $[d_{L},\mathcal{M})$. If there existed $\tilde{x}\in[d_{L},\mathcal{M})$ such that $\psi(\tilde{x})<0$, then $\psi(x)<0$ and thus $\Phi_{I^{*}}(x)<0$ for all $x\in[\tilde{x},\mathcal{M})$, which contradicts Proposition 3.7. Therefore, $\psi(x)\geqslant 0$ for all $x\in[d_{L},\mathcal{M})$ and $\psi(d_{L})>0$. Because $\psi(x)$ is continuous in $x$ and $F_{X}$ is strictly increasing on $(0,\mathcal{M})$, there must exist $\epsilon>0$ such that $\Phi_{I^{*}}(x)>0$ for all $x\in[d_{L}-\epsilon,d_{L})$, again contradicting Proposition 3.7. So $(x-d_{L})_{+}$ cannot be an optimal solution to Problem (3.10).

  2. Case (B)

    If $I^{*}(x)=x\wedge K_{U}$, then $\psi$ is strictly decreasing on $[0,K_{U}]$ and strictly increasing on $[K_{U},\mathcal{M})$. Using an argument similar to that for Case (A), we deduce $\psi(x)\leqslant 0$ for all $x\in[K_{U},\mathcal{M})$ and $\psi(K_{U})<0$. Since $\psi$ is continuous, there exists $\epsilon>0$ such that $\Phi_{I^{*}}(x)<0$ for all $x\in[K_{U}-\epsilon,K_{U})$, contradicting Proposition 3.7. As a result, $x\wedge K_{U}$ cannot be optimal for Problem (3.10) either.

  3. Case (C)

    Hence, the optimal solution must be of the form $I^{*}(x)=I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)$ for some $m\in(m_{L},m_{U})$. Noting that $I_{\lambda^{*}_{m},\beta^{*}_{m}}'(x)\in(0,1)$ for sufficiently large $x$ due to $\beta^{*}_{m}>0$ and (3.8), we deduce from Proposition 3.7 that $\Phi_{I^{*}}(x)=0$ for sufficiently large $x$. Together with (3.7), (C.6) and (C.5), Proposition 3.7 further implies that

    \[ \beta^{*}_{m}=\beta^{*}\qquad\text{and}\qquad\lambda^{*}_{m}=(1+\rho)\mathbb{E}[U'(W_{I^{*}}(X))]-2\beta^{*}\mathbb{E}[I^{*}(X)]. \tag{C.7} \]

    Next, we consider two subcases, depending on the comparison between $\lambda^{*}_{m}$ and $U'(w_{0}-(1+\rho)m)$.

    1. (C.1)

      If $\lambda^{*}_{m}<U'(w_{0}-(1+\rho)m)$, then $I^{*}(x)=x$ for all $0\leqslant x\leqslant\hat{x}$ and

      \[ U'(w_{0}-x+I^{*}(x)-(1+\rho)m)-2\beta^{*}I^{*}(x)-\lambda^{*}_{m}\begin{cases}=0,&x\geqslant\hat{x},\\ >0,&x<\hat{x},\end{cases} \]

      where $\hat{x}:=\frac{U'(w_{0}-(1+\rho)m)-\lambda^{*}_{m}}{2\beta^{*}}>0$. Therefore, it follows that

      \[
      \begin{aligned}
      0&<\mathbb{E}[U'(w_{0}-X+I^{*}(X)-(1+\rho)m)-2\beta^{*}I^{*}(X)-\lambda^{*}_{m}]\\
      &=\mathbb{E}[U'(W_{I^{*}}(X))]-2\beta^{*}\mathbb{E}[I^{*}(X)]-(1+\rho)\mathbb{E}[U'(W_{I^{*}}(X))]+2\beta^{*}\mathbb{E}[I^{*}(X)]\\
      &=-\rho\,\mathbb{E}[U'(W_{I^{*}}(X))],
      \end{aligned}
      \]

      leading to a contradiction.

    2. (C.2)

      Consequently, we must have $\lambda^{*}_{m}\geqslant U'(w_{0}-(1+\rho)m)$, in which case $I^{*}$ is coinsurance above a deductible (i.e., of the form (3.5)). Similarly to Subcase (C.1), we can show that

      \[ U'(w_{0}-x+I^{*}(x)-(1+\rho)m)-2\beta^{*}I^{*}(x)-\lambda^{*}_{m}\begin{cases}=0,&x\geqslant\tilde{d},\\ <0,&x<\tilde{d},\end{cases} \]

      where $\tilde{d}:=w_{0}-(1+\rho)m-(U')^{-1}(\lambda^{*}_{m})\geqslant 0$. This, together with (C.7), yields

      \[ \lambda^{*}_{m}=U'(w_{0}-\tilde{d}-(1+\rho)m) \tag{C.8} \]

      and

      \[ -\rho\,\mathbb{E}[U'(W_{I^{*}}(X))]=\mathbb{E}\Big[\big(U'(w_{0}-X+I^{*}(X)-(1+\rho)m)-2\beta^{*}I^{*}(X)-\lambda^{*}_{m}\big)\mathbb{I}_{\{X<\tilde{d}\}}\Big]. \tag{C.9} \]

      Therefore, $\rho=0$ if and only if $\tilde{d}=0$. Moreover, if $\rho=0$, the above analysis indicates that $I^{*}$ solves equation (3.15). Otherwise, if $\rho>0$, then it follows from (C.7) and (C.9) that

      \[ \mathbb{E}[U'(W_{I^{*}}(X))]=\frac{\mathbb{E}[U'(w_{0}-X-(1+\rho)m)\mathbb{I}_{\{X\leqslant\tilde{d}\}}]+2\beta^{*}mF_{X}(\tilde{d})}{1-(1+\rho)\mathbb{P}\{X>\tilde{d}\}}, \]

      which in turn implies $\tilde{d}>VaR_{\frac{1}{1+\rho}}(X)$. Plugging the above equation and (C.8) into (C.7) yields

      \[ 2\beta^{*}=\frac{1}{m\rho}\left(U'(w_{0}-\tilde{d}-(1+\rho)m)-(1+\rho)\mathbb{E}[U'(w_{0}-X\wedge\tilde{d}-(1+\rho)m)]\right). \]

      As a result, the optimal solution $I^{*}$ must be given by (3.17). The proof is complete.


Proof of Corollary 3.9:

We prove the case $\rho>0$; the proof for $\rho=0$ is similar and indeed simpler. It follows from (3.5) and (3.8) that $I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)=0$ for $x\leqslant\tilde{d}$ and that $I'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)=f'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)$ increases in $x$ for $x>\tilde{d}$, where we use the fact that $x-I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)$ increases in $x$ and that $\beta^{*}_{m}>0$. Moreover, for $x>\tilde{d}$, taking the derivative of $\frac{I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x}$ with respect to $x$ yields

\[ \Big(\frac{I_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x}\Big)'=\Big(\frac{f_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x}\Big)'=\frac{f'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)\,x-f_{\lambda^{*}_{m},\beta^{*}_{m}}(x)}{x^{2}}\geqslant\frac{\int_{\tilde{d}}^{x}\big(f'_{\lambda^{*}_{m},\beta^{*}_{m}}(x)-f'_{\lambda^{*}_{m},\beta^{*}_{m}}(y)\big)\,\mathrm{d}y}{x^{2}}\geqslant 0. \]

This completes the proof.
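The conclusion of Corollary 3.9, that the ceded proportion $I(x)/x$ is nondecreasing whenever the indemnity vanishes up to a deductible and has increasing slope thereafter, can be illustrated numerically. The marginal coverage $1-e^{-(x-\tilde{d})}$ below is a hypothetical stand-in for $f'_{\lambda^{*}_{m},\beta^{*}_{m}}$, chosen only because it increases from $0$ toward $1$:

```python
import numpy as np

d = 1.0                                   # hypothetical deductible
xs = np.linspace(1e-6, 10.0, 5000)

# I(x) = integral from d to x of (1 - exp(-(t - d))) dt for x > d, else 0:
# zero below the deductible, then convex with slope increasing toward 1
I = np.where(xs <= d, 0.0, (xs - d) - (1.0 - np.exp(-(xs - d))))

ratio = I / xs
assert np.all(np.diff(ratio) >= -1e-12)   # I(x)/x never decreases
```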

Proof of Theorem 4.1:

We first prove the result assuming that there exists $x_{0}\in(0,\mathcal{M}]$ such that

\[ y_{0}:=e_{I_{1}^{*}}(x_{0})=e_{I_{2}^{*}}(x_{0})\quad\text{and}\quad\phi(x_{0}-y_{0})>\frac{\beta_{1}^{*}}{\beta_{2}^{*}}, \tag{C.10} \]

where $\phi$ is defined in (B.1). First, we show that $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$ in a neighborhood of $x_{0}$. To this end, note that

\[ e_{I_{1}^{*}}'(x_{0})-e_{I_{2}^{*}}'(x_{0})=\frac{1}{1+\frac{2\beta_{1}^{*}}{-U''(w_{1}-x_{0}+y_{0})}}-\frac{1}{1+\frac{2\beta_{2}^{*}}{-U''(w_{2}-x_{0}+y_{0})}}>0, \]

which in turn implies that there exists an $\epsilon>0$ such that

\[ \begin{cases}e_{I_{1}^{*}}(x)<e_{I_{2}^{*}}(x),&x\in(x_{0}-\epsilon,x_{0}),\\ e_{I_{1}^{*}}(x)>e_{I_{2}^{*}}(x),&x\in(x_{0},x_{0}+\epsilon).\end{cases} \]

Next, we show that there exists no $y>x_{0}$ such that $e_{I_{1}^{*}}(y)=e_{I_{2}^{*}}(y)$. If such a $y$ existed, then the increasing property of $\phi(z)$ in $z$, together with the strictly increasing property of $x-e_{I_{i}^{*}}(x)$ in $x$, would imply

\[ \phi(y-e_{I_{1}^{*}}(y))=\phi(y-e_{I_{2}^{*}}(y))>\phi(x_{0}-e_{I_{2}^{*}}(x_{0}))>\frac{\beta_{1}^{*}}{\beta_{2}^{*}}, \]

which would in turn yield that $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$ in a neighborhood of $y$. This, however, contradicts the fact that $e_{I_{1}^{*}}(x)>e_{I_{2}^{*}}(x)$ for $x\in(x_{0},x_{0}+\epsilon)$.

Now define

\[ x_{1}:=\sup\left\{x\in[0,x_{0}):e_{I_{2}^{*}}(x)<e_{I_{1}^{*}}(x)\right\}. \]

If $x_{1}=0$, then $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$, which contradicts Lemma B.4. Thus, we must have $0<x_{1}<x_{0}$, because $e_{I_{1}^{*}}(x)<e_{I_{2}^{*}}(x)$ for all $x\in(x_{0}-\epsilon,x_{0})$. Moreover, it follows readily that

\[ \phi\left(x_{1}-e_{I_{1}^{*}}(x_{1})\right)\leqslant\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

In the following, we show that there exists no point $x_{2}\in(0,x_{1})$ such that $e_{I_{1}^{*}}(x_{2})=e_{I_{2}^{*}}(x_{2})$. Indeed, if such an $x_{2}$ existed, then, noting that $\phi$ is strictly increasing, we would have

\[ \phi\left(x_{2}-e_{I_{1}^{*}}(x_{2})\right)<\frac{\beta_{1}^{*}}{\beta_{2}^{*}}, \]

leading to $e_{I_{1}^{*}}'(x_{2})-e_{I_{2}^{*}}'(x_{2})<0$. In other words, $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{2}$, which contradicts the fact that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{1}$. We can therefore conclude that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ twice when (C.10) is satisfied.

Let us now consider the case in which (C.10) is not satisfied. We study two cases:

  1. Case (A)

    If there exists no $x\in(0,\mathcal{M}]$ such that $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$, then it is easy to show from $\mathbb{E}[e_{I_{i}^{*}}(X)]=0$ that $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ have the same distribution. This contradicts Lemma B.4.

  2. Case (B)

    Otherwise, any $x\in(0,\mathcal{M}]$ satisfying $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$ must have

    \[ \phi(x-e_{I_{i}^{*}}(x))\leqslant\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

    If the above inequality is always strict, then the previous analysis shows that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$, which contradicts Lemma B.4. Hence, there must exist $x_{3}\in(0,\mathcal{M}]$ such that

    \[ y_{3}:=e_{I_{1}^{*}}(x_{3})=e_{I_{2}^{*}}(x_{3})\quad\text{and}\quad\phi(x_{3}-y_{3})=\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

    In this case, we further divide our analysis into three subcases.

    1. (B.1)

      If $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{3}$, then the above analysis implies that no up-crossing occurs before $x_{3}$. A similar argument then indicates that $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ would have the same distribution, which is impossible.

    2. (B.2)

      Otherwise, if $e_{I_{1}^{*}}$ up-crosses $e_{I_{2}^{*}}$ at $x_{3}$, then the previous analysis shows that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ twice.

    3. (B.3)

      Finally, if no up-crossing happens at $x_{3}$, then we can simply neglect the single point $x_{3}$ in the analysis. If there further exists $x_{4}\in(0,x_{3})$ satisfying $e_{I_{1}^{*}}(x_{4})=e_{I_{2}^{*}}(x_{4})$, then we have

      \[ \phi(x_{4}-e_{I_{i}^{*}}(x_{4}))<\frac{\beta_{1}^{*}}{\beta_{2}^{*}}. \]

      In this case, the previous analysis indicates that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at $x_{4}$, which contradicts Lemma B.4. Otherwise, if there exists no $x\in(0,x_{3})$ such that $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$, then it follows from $\mathbb{E}[e_{I_{i}^{*}}(X)]=0$ that $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ have the same distribution. This again contradicts Lemma B.4.

In summary, we have shown that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ twice. Because $e_{I_{1}^{*}}(X)$ and $e_{I_{2}^{*}}(X)$ have the same first two moments, we can easily see that the insurer's profit with the wealthier insured, $-e_{I_{2}^{*}}(X)$, carries less downside risk than that with the less wealthy insured, $-e_{I_{1}^{*}}(X)$, when the insurance pricing is actuarially fair. The proof is complete.

Proof of Corollary 4.2:

(i) It follows from the proof of Theorem 4.1 that $e_{I_{2}^{*}}(0)<e_{I_{1}^{*}}(0)$, which is equivalent to $\mathbb{E}[I^{*}_{1}(X)]<\mathbb{E}[I^{*}_{2}(X)]$. Suppose that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$ at points $x_{0}$ and $x_{1}$ with $0<x_{0}<x_{1}$. Then $e_{I_{1}^{*}}(x_{j})=e_{I_{2}^{*}}(x_{j})$ for $j=0,1$, which, together with (4.2), implies

\[
\begin{aligned}
&U'(w_{1}-x_{0}+e_{I_{1}^{*}}(x_{0}))-U'(w_{2}-x_{0}+e_{I_{1}^{*}}(x_{0}))-2(\beta_{1}^{*}-\beta_{2}^{*})e_{I_{1}^{*}}(x_{0})\\
&\quad=U'(w_{1}-x_{1}+e_{I_{1}^{*}}(x_{1}))-U'(w_{2}-x_{1}+e_{I_{1}^{*}}(x_{1}))-2(\beta_{1}^{*}-\beta_{2}^{*})e_{I_{1}^{*}}(x_{1}).
\end{aligned}
\tag{C.11}
\]

Denote

\[ L(y):=U'(w_{1}-y)-U'(w_{2}-y),\quad y\in[0,w_{1}). \]

Then

\[ L'(y)=-U''(w_{1}-y)+U''(w_{2}-y)>0 \]

due to $w_{1}<w_{2}$ and $U'''>0$. Recalling that $x-e_{I_{i}^{*}}(x)$ and $e_{I_{i}^{*}}(x)$ are strictly increasing in $x$, we deduce from (C.11) that $\beta_{2}^{*}<\beta_{1}^{*}$.

(ii) Let us denote $\widetilde{w}_{i}:=w_{i}-\mathbb{E}[I^{*}_{i}(X)]$, $i=1,2$. The following analysis depends on the comparison between $\widetilde{w}_{1}$ and $\widetilde{w}_{2}$.

  1. Case (A)

    If $\widetilde{w}_{1}=\widetilde{w}_{2}$ and there exists $z\in[0,\mathcal{M}]$ such that $I^{*}_{1}(z)=I^{*}_{2}(z)$, then it follows from (3.15) that $2(\beta^{*}_{1}-\beta^{*}_{2})I^{*}_{i}(z)=0$, $i=1,2$. Recalling that $\beta_{2}^{*}<\beta_{1}^{*}$, we have $I^{*}_{i}(z)=0$, which implies $z=0$. Because $\mathbb{E}[I^{*}_{1}(X)]<\mathbb{E}[I^{*}_{2}(X)]$, we conclude that $I^{*}_{1}(x)<I^{*}_{2}(x)$ for all $x\in(0,\mathcal{M}]$.

  2. Case (B)

    Otherwise, if w~1w~2\widetilde{w}_{1}\neq\widetilde{w}_{2}, we can show that either I1(x)<I2(x)I^{*}_{1}(x)<I^{*}_{2}(x) x>0\forall x>0 or I1I^{*}_{1} up-crosses I2I^{*}_{2}. Indeed, if there exists z(0,]z\in(0,\mathcal{M}] such that I1(z)=I2(z)I^{*}_{1}(z)=I^{*}_{2}(z), then we have

    U(w~1z+I1(z))U(w~1)(U(w~2z+I2(z))U(w~2))\displaystyle U^{\prime}(\widetilde{w}_{1}-z+I^{*}_{1}(z))-U^{\prime}(\widetilde{w}_{1})-(U^{\prime}(\widetilde{w}_{2}-z+I^{*}_{2}(z))-U^{\prime}(\widetilde{w}_{2}))
    =2(β1β2)I2(z)>0.\displaystyle=2(\beta_{1}^{*}-\beta_{2}^{*})I^{*}_{2}(z)>0.

    Noting that U(w~y)U(w~)U^{\prime}(\widetilde{w}-y)-U^{\prime}(\widetilde{w}) is strictly decreasing in w~\widetilde{w} for any y>0y>0 because of U′′′>0U^{\prime\prime\prime}>0, we must have w~1<w~2.\widetilde{w}_{1}<\widetilde{w}_{2}. Furthermore, we have

    Ii(z)=11+2βiU′′(w~iz+Ii(z))=11+U(w~iz+Ii(z))U(w~i)U′′(w~iz+Ii(z))×Ii(z),i=1,2.{I_{i}^{*}}^{\prime}(z)=\frac{1}{1+\frac{2\beta_{i}^{*}}{-U^{\prime\prime}(\widetilde{w}_{i}-z+I_{i}^{*}(z))}}=\frac{1}{1+\frac{U^{\prime}(\widetilde{w}_{i}-z+I^{*}_{i}(z))-U^{\prime}(\widetilde{w}_{i})}{-U^{\prime\prime}(\widetilde{w}_{i}-z+I_{i}^{*}(z))\times I_{i}^{*}(z)}},\;\;i=1,2.

    Denoting

    H(w):=\frac{U^{\prime}(w-y)-U^{\prime}(w)}{-U^{\prime\prime}(w-y)},\quad y\in[0,w),

    we obtain

    H^{\prime}(w)=\frac{U^{\prime}(w-y)-U^{\prime}(w)}{-U^{\prime\prime}(w-y)}\Big[\mathcal{P}_{U}(w-y)+\frac{U^{\prime\prime}(w-y)-U^{\prime\prime}(w)}{U^{\prime}(w-y)-U^{\prime}(w)}\Big]=H(w)\Big[\mathcal{P}_{U}(w-y)-\mathcal{P}_{U}(w-\theta y)\Big]>0,

    where the second equality is due to the mean-value theorem with $\theta\in(0,1)$ and the last inequality is due to the assumption of strict DAP. The strictly increasing property of $H(w)$, together with the fact that $\widetilde{w}_{1}<\widetilde{w}_{2}$, yields ${I_{2}^{*}}^{\prime}(z)<{I_{1}^{*}}^{\prime}(z)$, which, in turn, implies that $I^{*}_{1}$ up-crosses $I^{*}_{2}$ at the point $z$.

    If instead $I^{*}_{1}(x)\neq I^{*}_{2}(x)$ for all $x\in(0,\mathcal{M})$, then $I^{*}_{1}-I^{*}_{2}$ keeps a constant sign there, and it follows from ${\mathbb{E}}[I^{*}_{1}(X)]<{\mathbb{E}}[I^{*}_{2}(X)]$ that $I^{*}_{1}(x)<I^{*}_{2}(x)$ for all $x>0$. The proof is complete.
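The crux of the argument above is that $H(w)$ is strictly increasing under strict DAP. As an illustrative numerical check (not part of the proof, with arbitrarily chosen parameters), one can evaluate $H$ for a CRRA utility $U(w)=w^{1-\gamma}/(1-\gamma)$, whose absolute prudence $\mathcal{P}_{U}(w)=(\gamma+1)/w$ is strictly decreasing, so $H$ should indeed be strictly increasing in $w$ for each fixed $y$:

```python
# Numerical sanity check (illustrative only): H(w) = (U'(w-y) - U'(w)) / (-U''(w-y))
# should be strictly increasing in w when U exhibits strictly decreasing
# absolute prudence.  CRRA utility has P_U(w) = (gamma+1)/w, strictly decreasing.

gamma = 2.0          # CRRA risk-aversion parameter (arbitrary choice)
y = 1.0              # fixed gap y in H(w); must satisfy 0 <= y < w

def U1(w):           # U'(w)  = w**(-gamma)
    return w ** (-gamma)

def U2(w):           # U''(w) = -gamma * w**(-gamma - 1)
    return -gamma * w ** (-gamma - 1)

def H(w):
    return (U1(w - y) - U1(w)) / (-U2(w - y))

ws = [2.0 + 0.5 * k for k in range(20)]
vals = [H(w) for w in ws]
assert all(a < b for a, b in zip(vals, vals[1:])), "H is not increasing"
print("H values are strictly increasing:", [round(v, 4) for v in vals[:5]])
```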

Proof of Theorem 4.3:

If there were no point $x\in(0,\mathcal{M})$ with $e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)$, then $e_{I_{1}^{*}}-e_{I_{2}^{*}}$ would keep a constant sign on $(0,\mathcal{M})$; combined with ${\mathbb{E}}[e_{I_{1}^{*}}(X)]={\mathbb{E}}[e_{I_{2}^{*}}(X)]=0$, this would force $e_{I_{1}^{*}}(X)=e_{I_{2}^{*}}(X)$ almost surely, hence equality in distribution. This contradicts the fact that $var[e_{I_{1}^{*}}(X)]=\nu_{1}<\nu_{2}=var[e_{I_{2}^{*}}(X)]$. Therefore, we must have $e_{I_{1}^{*}}(z)=e_{I_{2}^{*}}(z)$ for some $z\in(0,\mathcal{M})$; and thus (4.4) implies

{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]-{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]=2(\beta^{*}_{1}-\beta^{*}_{2})e_{I_{1}^{*}}(z).

We divide the following proof into two cases by comparing ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]$ with ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]$.

  1. Case (A)

    If ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]\neq{\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]$, then $\beta^{*}_{1}\neq\beta^{*}_{2}$. By (4.4), the quantity $2(\beta^{*}_{1}-\beta^{*}_{2})e_{I_{1}^{*}}(z)$ takes the same value at every crossing point; since $e_{I_{1}^{*}}$ is a strictly increasing function, we have $\mathcal{X}:=\{x\in[0,\mathcal{M}]:e_{I_{1}^{*}}(x)=e_{I_{2}^{*}}(x)\}=\{z\}$. Recalling that $var[e_{I_{1}^{*}}(X)]<var[e_{I_{2}^{*}}(X)]$, we conclude from Lemma A.3 that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$.

  2. Case (B)

    If ${\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{1}^{*}}(X))]={\mathbb{E}}[U^{\prime}(w_{0}-X+e_{I_{2}^{*}}(X))]$, we have

    U^{\prime}(w_{0}-x+e_{I_{1}^{*}}(x))-2\beta_{1}^{*}e_{I_{1}^{*}}(x)=U^{\prime}(w_{0}-x+e_{I_{2}^{*}}(x))-2\beta_{2}^{*}e_{I_{2}^{*}}(x). \qquad (C.12)

    In this case, we further divide our analysis into two subcases based on the comparison between $\beta^{*}_{1}$ and $\beta^{*}_{2}$.

    1. (B.1)

      If $\beta^{*}_{1}\neq\beta^{*}_{2}$, then it follows from (C.12) that $e_{I_{1}^{*}}(x)=0$ for all $x\in\mathcal{X}$. Similar to Case (A), we can show that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$.

    2. (B.2)

      If $\beta^{*}_{1}=\beta^{*}_{2}$, then (C.12) can be rewritten as

      U^{\prime}(w_{0}-x+e_{I_{1}^{*}}(x))+2\beta_{1}^{*}(x-e_{I_{1}^{*}}(x))=U^{\prime}(w_{0}-x+e_{I_{2}^{*}}(x))+2\beta_{1}^{*}(x-e_{I_{2}^{*}}(x)).

      Note that $U^{\prime}(w_{0}-y)+2\beta_{1}^{*}y$ is strictly increasing in $y$. Hence, the above equation yields $x-e_{I_{1}^{*}}(x)=x-e_{I_{2}^{*}}(x)$ for all $x\in[0,\mathcal{M}]$, which contradicts the fact that $var[e_{I_{1}^{*}}(X)]<var[e_{I_{2}^{*}}(X)]$.

In summary, we conclude that $e_{I_{2}^{*}}$ up-crosses $e_{I_{1}^{*}}$.
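The up-crossing can be visualized with an illustrative numerical sketch (the parameters below are hypothetical and are not derived from the variance bounds): integrating the slope equation $e_{I_{i}^{*}}^{\prime}(x)=1/\big(1+2\beta_{i}^{*}/(-U^{\prime\prime}(w_{0}-x+e_{I_{i}^{*}}(x)))\big)$ for log utility with $\beta_{2}^{*}<\beta_{1}^{*}$ and starting values $e_{I_{1}^{*}}(0)>e_{I_{2}^{*}}(0)$, the difference $e_{I_{2}^{*}}-e_{I_{1}^{*}}$ changes sign exactly once, from negative to positive:

```python
# Illustrative sketch with hypothetical parameters: Euler-integrate
#   e_i'(x) = 1 / (1 + 2*beta_i / (-U''(w0 - x + e_i(x)))),
# with U = log, so -U''(w) = 1/w**2.  With beta2 < beta1 and
# e1(0) > e2(0), e2 should up-cross e1 exactly once.

w0 = 10.0
beta1, beta2 = 0.05, 0.01      # hypothetical multipliers, beta2 < beta1
e1, e2 = -0.5, -1.0            # hypothetical starting values e_i(0) = -E[I_i*]
dx, n_steps = 1e-3, 5000       # Euler grid on [0, 5]

def slope(x, e, beta):
    w = w0 - x + e                            # retained wealth; positive on the grid
    return 1.0 / (1.0 + 2.0 * beta * w * w)   # since -U''(w) = 1/w**2

sign_changes = 0
prev_sign = -1                  # e2 - e1 starts negative
for k in range(n_steps):
    x = k * dx
    e1 += slope(x, e1, beta1) * dx
    e2 += slope(x, e2, beta2) * dx
    s = 1 if e2 > e1 else -1
    if s != prev_sign:
        sign_changes += 1
        prev_sign = s

print("sign changes of e2 - e1:", sign_changes)
```

A single sign change is expected: at any intersection the two retained wealths coincide, so the smaller multiplier $\beta_{2}^{*}$ forces the steeper slope, ruling out a second (downward) crossing.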

Proof of Corollary 4.4:

(i) From the proof of Theorem 4.3, we know that

e_{I_{2}^{*}}(x)-e_{I_{1}^{*}}(x)\left\{\begin{array}{ll}<0,&x<z;\\ >0,&x\in(z,\mathcal{M}]\end{array}\right. \qquad (C.13)

for some z(0,)z\in(0,\mathcal{M}). Therefore, we have

{\mathbb{E}}[I^{*}_{1}(X)]=-e_{I_{1}^{*}}(0)<-e_{I_{2}^{*}}(0)={\mathbb{E}}[I^{*}_{2}(X)].

Furthermore, $e_{I_{1}^{*}}^{\prime}(z)-e_{I_{2}^{*}}^{\prime}(z)\leqslant 0$. Recalling that

e_{I_{1}^{*}}^{\prime}(x)=\frac{1}{1+\frac{2\beta_{1}^{*}}{-U^{\prime\prime}(w_{0}-x+e_{I_{1}^{*}}(x))}},

we have $\beta_{2}^{*}\leqslant\beta_{1}^{*}$. Since the case $\beta_{1}^{*}=\beta_{2}^{*}$ was ruled out in the proof of Theorem 4.3, we obtain $\beta_{2}^{*}<\beta_{1}^{*}$.

(ii) On the one hand, in view of (C.13), it follows from ${\mathbb{E}}[I^{*}_{1}(X)]<{\mathbb{E}}[I^{*}_{2}(X)]$ that $I_{2}^{*}(x)>I_{1}^{*}(x)$ for all $x\geqslant z$. On the other hand, for any $x\in[0,z)$, noting that $e_{I_{2}^{*}}(x)<e_{I_{1}^{*}}(x)$, we deduce from $U^{\prime\prime\prime}\geqslant 0$ that

-U^{\prime\prime}(w_{0}-x+e_{I_{1}^{*}}(x))\leqslant-U^{\prime\prime}(w_{0}-x+e_{I_{2}^{*}}(x)).

Because

\beta_{2}^{*}<\beta^{*}_{1}\quad\text{and}\quad e_{I_{i}^{*}}^{\prime}(x)=\frac{1}{1+\frac{2\beta_{i}^{*}}{-U^{\prime\prime}(w_{0}-x+e_{I_{i}^{*}}(x))}},\quad i=1,2,

we have $e_{I_{1}^{*}}^{\prime}(x)<e_{I_{2}^{*}}^{\prime}(x)$, which is equivalent to ${I_{1}^{*}}^{\prime}(x)<{I_{2}^{*}}^{\prime}(x)$ for all $x\in[0,z)$. Furthermore, as $I_{1}^{*}(0)=I_{2}^{*}(0)=0$, it must hold that $I_{1}^{*}(x)<I_{2}^{*}(x)$ for all $x>0$. The proof is complete.
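Part (ii) can also be illustrated numerically (an illustrative sketch only, with hypothetical parameters; the feedback through ${\mathbb{E}}[I^{*}_{i}(X)]$ is deliberately ignored by holding the net wealth fixed): integrating the slope equation $I^{\prime}(x)=1/\big(1+2\beta/(-U^{\prime\prime}(\widetilde{w}-x+I(x)))\big)$ from $I(0)=0$ for two multipliers $\beta_{1}>\beta_{2}$ under log utility, the indemnity with the smaller multiplier is pointwise larger, as the corollary asserts:

```python
# Illustrative sketch (not the paper's algorithm): Euler-integrate
#   I'(x) = 1 / (1 + 2*beta / (-U''(w_tilde - x + I(x)))),  I(0) = 0,
# for two hypothetical multipliers beta1 > beta2, holding the net wealth
# w_tilde fixed for simplicity.  With U = log we have -U''(w) = 1/w**2 and
# U''' > 0; a smaller beta should produce a pointwise larger indemnity.

w_tilde = 10.0                    # assumed net wealth (hypothetical)
beta1, beta2 = 0.05, 0.01         # hypothetical multipliers with beta2 < beta1
dx, n_steps = 1e-3, 5000          # Euler grid on [0, 5]

def slope(x, I, beta):
    neg_U2 = 1.0 / (w_tilde - x + I) ** 2   # -U''(w) = 1/w**2 for U = log
    return 1.0 / (1.0 + 2.0 * beta / neg_U2)

I1 = I2 = 0.0
for k in range(n_steps):
    x = k * dx
    I1 += slope(x, I1, beta1) * dx
    I2 += slope(x, I2, beta2) * dx
    assert I1 <= I2               # smaller beta -> pointwise larger indemnity

print(f"I1(5) = {I1:.4f}, I2(5) = {I2:.4f}")
```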

References

  • Armantier et al. (2018) Armantier, O., Foncel, J. and Treich, N. (2018) Insurance and portfolio decisions: A wealth effect puzzle. Working paper.
  • Arrow (1963) Arrow, K.J. (1963) Uncertainty and the welfare economics of medical care. American Economic Review, 53(5), 941-973.
  • Arrow (1965) Arrow, K.J. (1965) Aspects of the Theory of Risk-bearing. Helsinki: Yrjo Jahnssonin Saatio.
  • Arrow (1971) Arrow, K.J. (1971) Essays in the Theory of Risk Bearing. Chicago.
  • Bernard et al. (2015) Bernard, C., He, X.D., Yan, J.A. and Zhou, X.Y. (2015) Optimal insurance design under rank dependent utility. Mathematical Finance, 25(1), 154-186.
  • Borch (1960) Borch, K. (1960) An attempt to determine the optimum amount of stop loss reinsurance. In: Transactions of the 16th International Congress of Actuaries, Vol. I, 597-610. Brussels, Belgium: Georges Thone.
  • Carlier and Dana (2003) Carlier, G. and Dana, R.-A. (2003) Pareto efficient insurance contracts when the insurer’s cost function is discontinuous. Economic Theory, 21(4), 871-893.
  • Carlier and Dana (2005) Carlier, G. and Dana, R.-A. (2005) Rearrangement inequalities in non-convex insurance models. Journal of Mathematical Economics, 41, 483-503.
  • Chi (2012) Chi, Y. (2012) Reinsurance arrangements minimizing the risk-adjusted value of an insurer’s liability. ASTIN Bulletin: The Journal of the IAA, 42(2), 529-557.
  • Chi (2019) Chi, Y. (2019) On the optimality of a straight deductible under belief heterogeneity. ASTIN Bulletin: The Journal of the IAA, 49(1), 242-263.
  • Chi and Wei (2020) Chi, Y. and Wei, W. (2020) Optimal insurance with background risk: An analysis of general dependence structures. Finance and Stochastics, Forthcoming.
  • Cummins and Mahul (2004) Cummins, J.D. and Mahul, O. (2004) The demand for insurance with an upper limit on coverage. The Journal of Risk and Insurance, 71(2), 253-264.
  • Dionne and St-Michel (1991) Dionne, G. and St-Michel, P. (1991) Workers’ compensation and moral hazard. The Review of Economics and Statistics, 73(2), 236-244.
  • Doherty et al. (2015) Doherty, N.A., Laux, C. and Muermann, A. (2015) Insuring nonverifiable losses. Review of Finance, 19(1), 283-316.
  • Doherty and Schlesinger (1990) Doherty, N.A. and Schlesinger, H. (1990) Rational insurance purchasing: Consideration of contract non-performance. The Quarterly Journal of Economics, 105(1), 243-253.
  • Eeckhoudt and Kimball (1992) Eeckhoudt, L. and Kimball, M. (1992) Background risk, prudence, and the demand for insurance. Contributions to Insurance Economics, edited by G. Dionne, 239-254. New York: Springer.
  • Eeckhoudt et al. (2003) Eeckhoudt, L., Mahul, O. and Moran, J. (2003) Fixed‐reimbursement insurance: Basic properties and comparative statics. Journal of Risk and Insurance, 70(2), 207-218.
  • Ehrlich and Becker (1972) Ehrlich, I. and Becker, G.S. (1972) Market insurance, self-insurance, and self-protection. Journal of Political Economy, 80(4), 623-648.
  • Gollier (1996) Gollier, C. (1996) Optimum insurance of approximate losses. The Journal of Risk and Insurance, 63(3), 369-380.
  • Gollier (2001) Gollier, C. (2001) The Economics of Risk and Time. London: The MIT Press.
  • Gollier (2013) Gollier, C. (2013) The economics of optimal insurance design. Handbook of Insurance, edited by G. Dionne. New York: Springer.
  • Gollier and Schlesinger (1996) Gollier, C. and Schlesinger, H. (1996). Arrow’s theorem on the optimality of deductibles: a stochastic dominance approach. Economic Theory, 7(2), 359-363.
  • Gotoh and Konno (2000) Gotoh, J.Y. and Konno, H. (2000) Third degree stochastic dominance and mean-risk analysis. Management Science, 46(2), 289-301.
  • Huang and Tzeng (2006) Huang, R.J. and Tzeng, L.Y. (2006) The design of an optimal insurance contract for irreplaceable commodities. The Geneva Risk and Insurance Review, 31(1), 11-21.
  • Hofmann (2015) Hofmann, D.M. (2015) Insurance - a global view. Second edition, Zurich Insurance Company Ltd.
  • Hölmstrom (1979) Hölmstrom, B. (1979) Moral hazard and observability. The Bell Journal of Economics, 10(1), 74-91.
  • Huberman et al. (1983) Huberman, G., Mayers, D. and Smith, C.W. (1983) Optimal insurance policy indemnity schedules. The Bell Journal of Economics, 14(2), 415-426.
  • Kaluszka (2001) Kaluszka, M. (2001) Optimal reinsurance under mean-variance premium principles. Insurance: Mathematics and Economics, 28(1), 61-67.
  • Karlin and Novikoff (1963) Karlin, S. and Novikoff, A. (1963) Generalized convex inequalities. Pacific Journal of Mathematics, 13(4), 1251-1279.
  • Kaye (2005) Kaye, P. (2005) Risk measurement in insurance: A guide to risk measurement, capital allocation and related decision support issues. Discussion paper program, Casualty Actuarial Society.
  • Kimball (1990) Kimball, M.S. (1990) Precautionary saving in the small and in the large. Econometrica, 58(1), 53-73.
  • Komiya (1988) Komiya, H. (1988) Elementary proof for Sion's minimax theorem. Kodai Mathematical Journal, 11(1), 5-7.
  • Markowitz (1952) Markowitz, H. (1952) Portfolio selection. Journal of Finance, 7, 77-91.
  • Menezes et al. (1980) Menezes, C., Geiss, C. and Tressler, J. (1980) Increasing downside risk. The American Economic Review, 70(5), 921-932.
  • Millo (2016) Millo, G. (2016) The income elasticity of nonlife insurance: A reassessment. Journal of Risk and Insurance, 83(2), 335-362.
  • Mossin (1968) Mossin, J. (1968) Aspects of rational insurance purchasing. Journal of Political Economy, 76(4), 553-568.
  • Noussair et al. (2014) Noussair, C.N., Trautmann, S.T. and Van de Kuilen, G. (2014) Higher order risk attitudes, demographics, and financial decisions. The Review of Economic Studies, 81(1), 325-355.
  • Ohlin (1969) Ohlin, J. (1969) On a class of measures of dispersion with application to optimal reinsurance. ASTIN Bulletin: The Journal of the IAA, 5(2), 249-266.
  • Picard (2000) Picard, P. (2000) On the design of optimal insurance policies under manipulation of audit cost. International Economic Review, 41(4), 1049-1071.
  • Pratt (1964) Pratt, J.W. (1964) Risk aversion in the small and in the large. Econometrica, 32, 122-136.
  • Raviv (1979) Raviv, A. (1979) The design of an optimal insurance policy. American Economic Review, 69(1), 84-96.
  • Rothschild and Stiglitz (1976) Rothschild, M. and Stiglitz, J. (1976) Equilibrium in competitive insurance markets: An essay on the economics of imperfect information. The Quarterly Journal of Economics, 90(4), 629-649.
  • Schlesinger (1981) Schlesinger, H. (1981) The optimal level of deductibility in insurance contracts. Journal of Risk and Insurance, 48(3), 465-481.
  • Teh (2017) Teh, T.L. (2017) Insurance design in the presence of safety nets. Journal of Public Economics, 149, 47-58.
  • Vajda (1962) Vajda, S. (1962) Minimum variance reinsurance. ASTIN Bulletin: The Journal of the IAA, 2, 257-260.
  • Viscusi (1979) Viscusi, W.K. (1979) Insurance and individual incentives in adaptive contexts. Econometrica, 47(5), 1195-1207.
  • Whitmore (1970) Whitmore, G.A. (1970) Third-degree stochastic dominance. The American Economic Review, 60(3), 457-459.
  • Xu et al. (2019) Xu, Z.Q., Zhou, X.Y. and Zhuang, S.C. (2019) Optimal insurance with rank-dependent utility and incentive compatibility. Mathematical Finance, 29(2), 659-692.
  • Zhou and Wu (2008) Zhou, C. and Wu, C. (2008) Optimal insurance under the insurer’s risk constraint. Insurance: Mathematics and Economics, 42(3), 992-999.
  • Zhou et al. (2010) Zhou, C., Wu, W. and Wu, C. (2010) Optimal insurance in the presence of insurer’s loss limit. Insurance: Mathematics and Economics, 46(2), 300-307.