
Modeling frequency distribution above a priority in presence of IBNR

Nicolas Baradel (Inria, CMAP, CNRS, École polytechnique, Institut Polytechnique de Paris, 91200 Palaiseau; nicolas.baradel@polytechnique.edu)
Abstract

In reinsurance, Poisson and Negative binomial distributions are employed for modeling frequency. However, the incomplete data regarding reported incurred claims above a priority level presents challenges in estimation. This paper focuses on frequency estimation using Schnieper’s framework [7] for claim numbering. We demonstrate that Schnieper’s model is consistent with a Poisson distribution for the total number of claims above a priority at each year of development, providing a robust basis for parameter estimation. Additionally, we explain how to build an alternative assumption based on a Negative binomial distribution, which yields similar results. The study includes a bootstrap procedure to manage uncertainty in parameter estimation and a case study comparing assumptions and evaluating the impact of the bootstrap approach.

1 Introduction

As part of his business, a reinsurer has to quote prices for excess of loss covers. Generally, the reinsurer estimates the frequency and severity distributions. For the frequency, the most common choices are the Poisson and Negative binomial distributions. The Poisson distribution can be viewed as the natural distribution in an ideal world: when all claims are independent and occur with a non-random intensity, the distribution is Poisson. However, when there is some uncaptured randomness, the variance is greater than the mean. In the particular case where the intensity of a Poisson distribution follows a Gamma distribution, the overall distribution is known to be a Negative binomial one.

The data the reinsurer receives are often incomplete: typically, only the claims whose reported incurred amount exceeds a certain threshold, known as the priority, are available. In the context of excess of loss, [7, Schnieper] proposed a model that separates the IBNR (Incurred But Not Reported) claims into what he termed true IBNR, namely newly reported claims, and IBNER (Incurred But Not Enough Reported), namely the variation of estimated costs over time.

Although [4, Mack] used some of the ideas from Schnieper's method, it has not received much attention. Major contributions based on the Schnieper model include [3] and [2]. In the former, the author derives an estimator for the mean square error of the reserves. In the latter, the authors propose a non-parametric bootstrap procedure to estimate the distribution of the reserve. Additionally, some valuable insights inspired by Schnieper's model are found in [5] and [6]. In the former, the author uses Schnieper's approach to determine the implicit part of the IBNER in the Chain Ladder reserve. In the latter, the authors adapt the methodology in order to also separate the paid from the incurred claims.

Schnieper also addressed a special case: claim numbering above a priority. The frequency of claims exceeding the priority over time is divided into: claims that newly reach the priority and claims that fall below it. In particular, he proposed assuming a Poisson distribution for claims that reach the priority and a Binomial distribution for claims that drop below it.

In this paper, we focus on frequency estimation in the presence of incomplete data, specifically the reported incurred claims above a priority, using Schnieper's framework for claim numbering. We show that the total number of claims in his model follows a Poisson distribution at each year of development. Consequently, this framework is consistent with a Poisson distribution for the total number of claims above a priority and provides a consistent basis for parameter estimation. We also propose an alternative assumption based on a Negative binomial distribution, which yields similar results: we show that the total number of claims then follows a Negative binomial distribution at each year of development, and we provide an estimation procedure. Additionally, we address claim reserving by providing the distribution of the ultimate claim numbers, conditioned on the current incurred claims.

The paper is organized as follows. Section 2 presents Schnieper’s general model, with a review of key estimators. Section 3 covers claim numbers above a priority. The first part deals with the Schnieper assumption, from which we derive additional results. Specifically, we obtain the distribution of the total number of claims for both purposes: quotation and reserving. The second part presents an alternative assumption under which we show that the total claim number above a priority follows a Negative binomial distribution. Section 4 describes a bootstrap procedure for each case, addressing uncertainty in parameter estimation. Finally, Section 5 provides a case study comparing assumptions and evaluating the impact of the bootstrap approach and its contribution to the different assumptions.

2 The general model

The Schnieper model, designed with excess of loss covers in mind, separates two different behaviors in the IBNR data:

  • The occurrence of newly reported claims, which are assumed to arise randomly based on the level of exposure;

  • The progression of previously reported claims, which is determined by the current known amounts.

Schnieper's framework requires more summary statistics than the aggregated evolution of the incurred claims. We introduce them below, where $n\geq 1$ denotes the number of years, $1\leq i\leq n$ represents the occurrence year, and $1\leq j\leq n$ represents the development year:

  • The random variables $(N_{i,j})_{1\leq i,j\leq n}$ represent the total amount of new excess claims, referring to claims that have not been recorded as excess claims in previous development years;

  • The random variables $(D_{i,j})_{1\leq i,j\leq n}$ represent the decrease in the total claims amount between development years $j-1$ and $j$, concerning claims that were already known in development year $j-1$.

The $(D_{i,j})_{1\leq i,j\leq n}$ can be negative in the event of an increase and, by construction, $D_{i,1}=0$ for all $1\leq i\leq n$.


Given $(N_{i,j})_{1\leq i,j\leq n}$ and $(D_{i,j})_{1\leq i,j\leq n}$, the cumulative incurred data $(C_{i,j})_{1\leq i,j\leq n}$ can be calculated using the following iterative process:

\[
\begin{aligned}
C_{i,1} &= N_{i,1}, && 1\leq i\leq n, \qquad (1)\\
C_{i,j+1} &= C_{i,j} + N_{i,j+1} - D_{i,j+1}, && 1\leq i\leq n,\ 1\leq j\leq n-1.
\end{aligned}
\]

We also introduce non-negative exposures $(E_{i})_{1\leq i\leq n}$ that are assumed to be known and associated with the data mentioned above. Finally, we introduce the following filtration:

\[
\mathcal{F}_{k} := \sigma\left(N_{i,j}, D_{i,j} \mid i+j\leq k+1\right), \qquad k\geq 1.
\]

The currently available information is $\mathcal{F}_{n}$. In the context of Schnieper's general model, the following assumption is made:

Assumption 2.1.
  • H1

    The random variables $(N_{i_{1},j},D_{i_{1},j})_{1\leq j\leq n}$ and $(N_{i_{2},j},D_{i_{2},j})_{1\leq j\leq n}$ are independent for $i_{1}\neq i_{2}$.

  • H2

    For $1\leq j\leq n$, there exists $\lambda_{j}\geq 0$ and, for $1\leq j\leq n-1$, there exists $\delta_{j}\leq 1$ such that

\[
\begin{aligned}
\mathbb{E}(N_{i,j}\mid\mathcal{F}_{i+j-2}) &= \lambda_{j}E_{i}, && 1\leq i\leq n,\\
\mathbb{E}(D_{i,j+1}\mid\mathcal{F}_{i+j-1}) &= \delta_{j}C_{i,j}, && 1\leq i\leq n.
\end{aligned}
\]
  • H3

    For $1\leq j\leq n-1$, there exist $\sigma_{j}^{2}\geq 0$ and $\tau_{j}^{2}\geq 0$ such that

\[
\begin{aligned}
\mathrm{Var}(N_{i,j}\mid\mathcal{F}_{i+j-2}) &= \sigma_{j}^{2}E_{i}, && 1\leq i\leq n,\\
\mathrm{Var}(D_{i,j+1}\mid\mathcal{F}_{i+j-1}) &= \tau_{j}^{2}C_{i,j}, && 1\leq i\leq n.
\end{aligned}
\]

The evolution of the $(D_{i,j})_{1\leq i,j\leq n}$ follows the same process as in Mack's model [4], using incurred claims as the exposure combined with a development factor. However, the new claims generated by the $(N_{i,j})_{1\leq i,j\leq n}$ represent an additional additive component that depends on the exposure.

From the above assumption, Schnieper introduced the following estimators for the $\lambda$'s and the $\delta$'s:

\[
\begin{aligned}
\widehat{\lambda}_{j} &:= \frac{\sum_{i=1}^{n-j+1}N_{i,j}}{\sum_{i=1}^{n-j+1}E_{i}}, \qquad (2)\\
\widehat{\delta}_{j} &:= \frac{\sum_{i=1}^{n-j}D_{i,j+1}}{\sum_{i=1}^{n-j}C_{i,j}}.
\end{aligned}
\]

These are clearly unbiased estimators of the $\lambda$'s and $\delta$'s respectively. Additionally, they are the best linear estimators in the $\frac{N_{i,j}}{E_{i}}$ and the $\frac{D_{i,j+1}}{C_{i,j}}$ respectively, as a consequence of H3.
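For concreteness, here is a minimal sketch (our own illustration, not the author's code) of the estimators in (2), assuming $N$ and $D$ are given as $n\times n$ NumPy arrays with NaN below the anti-diagonal and $E$ is the length-$n$ exposure vector; the $C$ triangle is rebuilt through the recursion (1).

```python
import numpy as np

def schnieper_estimators(N, D, E):
    """Estimators (2): lambda_hat (length n) and delta_hat (length n-1)."""
    n = N.shape[0]
    # Rebuild the cumulative triangle C from N and D via the recursion (1).
    C = np.full((n, n), np.nan)
    C[:, 0] = N[:, 0]
    for j in range(n - 1):
        C[:, j + 1] = C[:, j] + N[:, j + 1] - D[:, j + 1]
    # lambda_hat_j: new claims in column j over the matching exposures.
    lam = np.array([np.nansum(N[: n - j, j]) / E[: n - j].sum()
                    for j in range(n)])
    # delta_hat_j: decreases D_{i,j+1} over the preceding amounts C_{i,j}.
    delta = np.array([np.nansum(D[: n - j - 1, j + 1]) / np.nansum(C[: n - j - 1, j])
                      for j in range(n - 1)])
    return lam, delta
```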

Schnieper developed a way to estimate the expected value of $C_{n+1,n}$ and the pure premium for the following year, as well as the total reserve along with its associated mean square error, based on estimators of the $\sigma$'s and $\tau$'s that he provides. We now present the model for claim numbers, which can be considered a specific instance of the broader model.

3 The model for claim numbers

Schnieper also dealt with a special case: the number of claims above a priority, which is the focus of this paper. From this point onward, for a given priority level, we define

  • The random variables $(N_{i,j})_{1\leq i,j\leq n}$ represent the number of new excess claims pertaining to accident year $i$ in development year $j$ (claims that were below the priority, or not reported, in development year $j-1$);

  • The random variables $(D_{i,j})_{1\leq i,j\leq n}$ represent the number of claims that exceeded the priority in development year $j-1$ but have since decreased in cost to fall below the priority in development year $j$.

In this context, the $(D_{i,j})$ are now non-negative and bounded by the $(C_{i,j-1})$. Additionally, all data are integer-valued.


Schnieper proposed that the new claims $(N_{i,j})$ follow a Poisson distribution, while the claims decreasing below the priority $(D_{i,j})$ follow a Binomial distribution. The reasoning is as follows: claims that rise above the priority (either new or previously below the threshold) occur independently with a non-random intensity, which aligns with a Poisson distribution; meanwhile, each claim above the priority has a certain probability of falling below the threshold each year, independently of the others, leading to a Binomial distribution. The next subsection provides a detailed statement of these assumptions, as given in [7], along with additional results, including the finding that the $(C_{i,j})$ follow a Poisson distribution at each date.

3.1 The Poisson assumption

The following assumption corresponds to [7, Assumptions $A_{1}^{\prime\prime}$-$A_{2}^{\prime\prime}$]. We add H1' to slightly reinforce H1.

Assumption 3.1.
  • H1’

    The random variables $(N_{i_{1},j},D_{i_{1},j})_{1\leq j\leq n}$ and $(N_{i_{2},j},D_{i_{2},j})_{1\leq j\leq n}$ are independent for $i_{1}\neq i_{2}$, and, for each $1\leq i,j\leq n$, the random variables $N_{i,j}$ and $D_{i,j}$ are independent.

  • H2’

    For $1\leq j\leq n$, there exists $\lambda_{j}\geq 0$ and, for $1\leq j\leq n-1$, there exists $0\leq\delta_{j}\leq 1$ such that

\[
\begin{aligned}
N_{i,j}\mid\mathcal{F}_{i+j-2} &\sim \mathcal{P}\left(\lambda_{j}E_{i}\right), && 1\leq i\leq n,\\
D_{i,j+1}\mid\mathcal{F}_{i+j-1} &\sim \mathcal{B}\left(C_{i,j},\delta_{j}\right), && 1\leq i\leq n.
\end{aligned}
\]

Note that H1' implies H1, and H2' implies H2 and H3 from Section 2. H1' adds independence between $N_{i,j}$ and $D_{i,j}$ at the same dates, while H2' specifies the distributions. We assume from now on that H1' and H2' hold. Given these, [7] showed that the $\widehat{\lambda}$'s and $\widehat{\delta}$'s defined in (2) are also the maximum likelihood estimators and are efficient, with their variances given by the inverse of the Fisher information.

Beyond what Schnieper stated in this framework, we shall show additional results. Specifically, under H1' and H2', the $(C_{i,j})_{1\leq i,j\leq n}$ follow a Poisson distribution. Before stating the result, we first recall a classic lemma that will be essential.

Lemma 3.1.

Let $N$ be a random variable with distribution $\mathcal{P}(\lambda)$ with $\lambda>0$, and let $D$ be a random variable such that $D\mid N\sim\mathcal{B}(N,p)$ for $0<p<1$. Then

\[
N-D \sim \mathcal{P}(\lambda(1-p)).
\]
Proof.

For completeness, we provide the proof of this elementary result. Let $k\in\mathbb{N}$.

\[
\begin{aligned}
\mathbb{P}(N-D=k) &= \sum_{n\in\mathbb{N}}\mathbb{P}\left(D=n-k \mid N=n\right)\mathbb{P}\left(N=n\right)\\
&= \sum_{n\geq k}\frac{n!}{k!(n-k)!}p^{n-k}(1-p)^{k}\frac{e^{-\lambda}\lambda^{n}}{n!}\\
&= \frac{e^{-\lambda}\left[\lambda(1-p)\right]^{k}}{k!}\sum_{n\geq 0}\frac{(p\lambda)^{n}}{n!}\\
&= \frac{e^{-\lambda(1-p)}\left[\lambda(1-p)\right]^{k}}{k!}.
\end{aligned}
\]
∎
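As a quick empirical illustration of the lemma (a sketch with arbitrary values $\lambda=4$ and $p=0.3$ of our own choosing), binomial thinning of a Poisson count leaves a Poisson count with intensity $\lambda(1-p)$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p, M = 4.0, 0.3, 10**6
N = rng.poisson(lam, size=M)       # N ~ P(lam)
D = rng.binomial(N, p)             # D | N ~ B(N, p)
survivors = N - D
# Both statistics should be close to lam * (1 - p) = 2.8.
print(survivors.mean(), survivors.var())
```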

Proposition 3.2.

For all $1\leq i,j\leq n$, we have

\[
C_{i,j} \sim \mathcal{P}\left(\lambda_{j}^{\prime}E_{i}\right),
\]

with

\[
\lambda_{j}^{\prime} := \sum_{k=1}^{j}\lambda_{k}\left(\prod_{\ell=k}^{j-1}(1-\delta_{\ell})\right).
\]
Proof.

We prove the result by induction. Let $1\leq i\leq n$ be fixed. By construction, $C_{i,1}=N_{i,1}\sim\mathcal{P}(\lambda_{1}E_{i})$. Assume as the induction hypothesis that $C_{i,j}$ follows a Poisson distribution with parameter $\lambda_{j}^{\prime}E_{i}$. Recall the relation in (1):

\[
C_{i,j+1} = C_{i,j} + N_{i,j+1} - D_{i,j+1}.
\]

Under H2’, by Lemma 3.1,

\[
C_{i,j} - D_{i,j+1} \sim \mathcal{P}\left(\lambda_{j}^{\prime}E_{i}(1-\delta_{j})\right),
\]

and finally, under H1’:

\[
C_{i,j+1} \sim \mathcal{P}\left(\lambda_{j+1}E_{i} + \lambda_{j}^{\prime}(1-\delta_{j})E_{i}\right) = \mathcal{P}\left(\lambda_{j+1}^{\prime}E_{i}\right).
\]
∎
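The induction step also yields the recursion $\lambda_{j+1}^{\prime} = \lambda_{j+1} + \lambda_{j}^{\prime}(1-\delta_{j})$, which is convenient in practice. The sketch below (our own code, using the rounded estimates of Section 5 as illustrative inputs) checks that the recursion and the closed form of Proposition 3.2 coincide.

```python
import numpy as np

lam = np.array([0.327, 0.28, 0.2, 0.117, 0.156, 0.0])   # lambda_1..lambda_6
delta = np.array([0.5, 0.346, 0.217, 0.0, 0.154])       # delta_1..delta_5

def lam_prime(j, lam, delta):
    # Closed form: lambda'_j = sum_k lambda_k * prod_{l=k}^{j-1} (1 - delta_l),
    # with j given as a 0-based index here.
    return sum(lam[k] * np.prod(1 - delta[k:j]) for k in range(j + 1))

lp = lam[0]                          # lambda'_1 = lambda_1
for j in range(len(lam) - 1):        # recursion from the proof
    lp = lam[j + 1] + lp * (1 - delta[j])
print(lp, lam_prime(len(lam) - 1, lam, delta))  # identical values
```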

The above result provides the distribution of $C_{n+1,n}$ given the corresponding exposure. In practice, we are also interested in the conditional distribution $C_{i,j}\mid\mathcal{F}_{n}$ for $i+j>n+1$. Before presenting this result, we introduce an essential lemma.

Lemma 3.3.

Let $N\in\mathbb{N}^{*}$. Let $D_{1}\sim\mathcal{B}(N,p_{1})$ and, for $n\geq 1$, $D_{n+1}\mid\{\sum_{i=1}^{n}D_{i}=d\}\sim\mathcal{B}(N-d,p_{n+1})$. Then

\[
N-\sum_{i=1}^{n}D_{i} \sim \mathcal{B}\left(N,\prod_{i=1}^{n}(1-p_{i})\right).
\]
Proof.

We prove the lemma by induction. It is clear that $N-D_{1}\sim\mathcal{B}(N,1-p_{1})$. Let $n\geq 1$, and assume by induction that

\[
N-\sum_{i=1}^{n}D_{i} \sim \mathcal{B}\left(N,\prod_{i=1}^{n}(1-p_{i})\right).
\]

Let $k\in\{0,\ldots,N\}$. For ease of notation, we introduce $1-\pi_{n} := \prod_{i=1}^{n}(1-p_{i})$. It follows that

\[
\begin{aligned}
\mathbb{P}\left[N-\sum_{i=1}^{n+1}D_{i}=k\right] &= \sum_{d=0}^{N}\mathbb{P}\left[N-\sum_{i=1}^{n+1}D_{i}=k \,\middle|\, \sum_{i=1}^{n}D_{i}=d\right]\mathbb{P}\left(\sum_{i=1}^{n}D_{i}=d\right)\\
&= \sum_{d=0}^{N}\mathbb{P}\left[D_{n+1}=N-d-k \,\middle|\, \sum_{i=1}^{n}D_{i}=d\right]\mathbb{P}\left(N-\sum_{i=1}^{n}D_{i}=N-d\right)\\
&= \sum_{d=0}^{N-k}\frac{(N-d)!}{k!(N-d-k)!}p_{n+1}^{N-d-k}(1-p_{n+1})^{k}\frac{N!}{d!(N-d)!}\pi_{n}^{d}(1-\pi_{n})^{N-d}\\
&= \frac{N!}{k!(N-k)!}(1-\pi_{n+1})^{k}\sum_{d=0}^{N-k}\frac{(N-k)!}{(N-d-k)!\,d!}p_{n+1}^{N-d-k}\pi_{n}^{d}(1-\pi_{n})^{N-d-k}\\
&= \frac{N!}{k!(N-k)!}(1-\pi_{n+1})^{k}\left(\pi_{n}+p_{n+1}(1-\pi_{n})\right)^{N-k}\\
&= \frac{N!}{k!(N-k)!}(1-\pi_{n+1})^{k}\pi_{n+1}^{N-k}.
\end{aligned}
\]
∎

Proposition 3.4.

For all $1\leq i,j\leq n$ such that $i+j>n+1$, we have

\[
C_{i,j}\mid\mathcal{F}_{n} \sim \mathcal{B}(C_{i,n-i+1},\delta_{i,j}^{\prime}) + \mathcal{P}\left(\lambda_{i,j}^{\prime}E_{i}\right),
\]

where the right-hand side should be interpreted as the sum of two independent random variables, and with

\[
\begin{aligned}
\delta_{i,j}^{\prime} &:= \prod_{k=n-i+1}^{j-1}(1-\delta_{k}),\\
\lambda_{i,j}^{\prime} &:= \sum_{k=n-i+2}^{j}\lambda_{k}\left(\prod_{\ell=k}^{j-1}(1-\delta_{\ell})\right).
\end{aligned}
\]
Proof.

We prove the result by induction. Let $1\leq i\leq n$ be fixed. Since

\[
C_{i,n-i+2} = C_{i,n-i+1} - D_{i,n-i+2} + N_{i,n-i+2},
\]

it follows from Lemma 3.3 that $C_{i,n-i+1}-D_{i,n-i+2}\mid\mathcal{F}_{n}\sim\mathcal{B}(C_{i,n-i+1},1-\delta_{n-i+1})$, while $N_{i,n-i+2}\mid\mathcal{F}_{n}\sim\mathcal{P}(\lambda_{n-i+2}E_{i})$; this establishes the base case. Assume by induction that

\[
C_{i,j}\mid\mathcal{F}_{n} \sim \mathcal{B}\left(C_{i,n-i+1},\delta_{i,j}^{\prime}\right) + \mathcal{P}\left(\lambda_{i,j}^{\prime}E_{i}\right).
\]

Remark that

\[
C_{i,j+1} = C_{i,n-i+1} + (C_{i,j}-C_{i,n-i+1}) + N_{i,j+1} - D_{i,j+1},
\]

and $D_{i,j+1}\sim\mathcal{B}(C_{i,j},\delta_{j}) = \mathcal{B}(C_{i,n-i+1},\delta_{j}) + \mathcal{B}(C_{i,j}-C_{i,n-i+1},\delta_{j})$, where the right-hand side should be interpreted as the sum of two independent random variables.
Under H2’, by Lemma 3.3,

\[
C_{i,n-i+1} - \mathcal{B}(C_{i,n-i+1},\delta_{j})\mid\mathcal{F}_{n} \sim \mathcal{B}\left(C_{i,n-i+1},\delta_{i,j+1}^{\prime}\right),
\]

and

\[
(C_{i,j}-C_{i,n-i+1}) - \mathcal{B}(C_{i,j}-C_{i,n-i+1},\delta_{j})\mid\mathcal{F}_{n} \sim \mathcal{P}\left(\lambda_{i,j}^{\prime}E_{i}(1-\delta_{j})\right),
\]

and finally, by Lemma 3.1,

\[
C_{i,j+1}\mid\mathcal{F}_{n} \sim \mathcal{B}\left(C_{i,n-i+1},\delta_{i,j+1}^{\prime}\right) + \mathcal{P}\left(\lambda_{i,j+1}^{\prime}E_{i}\right).
\]
∎

The result indicates that, after observing $\mathcal{F}_{n}$, the distribution of an unobserved $C_{i,j}$ can be described as the sum of two components: current claims that exceed the priority threshold and are likely to stay above it, and new claims that may initially rise above the priority but might later fall below it.

If we are interested in estimating $\mathbb{E}(C_{i,n}\mid\mathcal{F}_{n})$, we can use the estimator $\widehat{C}_{i,n} := \widehat{\lambda}_{i,n}^{\prime}E_{i} + \widehat{\delta}_{i,n}^{\prime}C_{i,n-i+1}$, where $(\widehat{\lambda}_{i,n}^{\prime},\widehat{\delta}_{i,n}^{\prime})$ are estimators of $(\lambda_{i,n}^{\prime},\delta_{i,n}^{\prime})$. The sequences $(\lambda_{k})$ and $(\delta_{k})$ are estimated by $(\widehat{\lambda}_{k})$ and $(\widehat{\delta}_{k})$, as defined in (2).
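A minimal sketch of this plug-in estimator (our own code, with 1-based $i$ and $n$ as in the text; `lam_hat` and `delta_hat` are the estimates from (2), and `C_latest` is the observed $C_{i,n-i+1}$):

```python
import numpy as np

def ultimate_estimate(i, n, lam_hat, delta_hat, E_i, C_latest):
    """C_hat_{i,n} = lambda'_{i,n} * E_i + delta'_{i,n} * C_{i,n-i+1}."""
    # delta'_{i,n} = prod_{k=n-i+1}^{n-1} (1 - delta_k)  (1-based indices)
    dp = np.prod([1 - delta_hat[k - 1] for k in range(n - i + 1, n)])
    # lambda'_{i,n} = sum_{k=n-i+2}^{n} lambda_k * prod_{l=k}^{n-1} (1 - delta_l)
    lp = sum(lam_hat[k - 1] * np.prod([1 - delta_hat[l - 1] for l in range(k, n)])
             for k in range(n - i + 2, n + 1))
    return lp * E_i + dp * C_latest
```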


In practice, the Poisson distribution is commonly used for modeling the number of claims due to its simplicity and the assumption of independent arrivals with non-random intensity across all claims. However, it is often observed that the empirical variance exceeds the empirical mean, suggesting that the claim data might not fully adhere to the assumptions of the Poisson distribution.

For instance, if the intensity parameter of claim arrivals for each policy follows a Gamma distribution, the total number of claims is known to follow a Negative binomial distribution. This distribution is favored because it remains straightforward to use, accommodates excess variance, and converges to the Poisson distribution when this variance diminishes.

In the following subsection, we propose using the Negative binomial distribution instead of the Poisson, demonstrating that, with an appropriate assumption, it can yield comparable results.

3.2 The Negative binomial assumption

In this section, we aim to establish a Negative binomial framework that yields results similar to those obtained from the previous Poisson framework. Specifically, we define the Negative binomial distribution as follows: let $N\sim\mathcal{NB}(r,p)$ where $r>0$ and $0<p<1$. The probability mass function of $N$ is given by

\[
\mathbb{P}(N=n) = \frac{\Gamma(r+n)}{n!\,\Gamma(r)}\,p^{r}(1-p)^{n}, \qquad n\in\mathbb{N}. \qquad (3)
\]
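A direct transcription of (3), evaluated on the log scale for numerical stability (a sketch; the function name is ours):

```python
from math import exp, lgamma, log

def nb_pmf(n, r, p):
    """P(N = n) for N ~ NB(r, p) with the parametrization (3)."""
    log_pmf = (lgamma(r + n) - lgamma(n + 1) - lgamma(r)
               + r * log(p) + n * log(1 - p))
    return exp(log_pmf)
```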

We then introduce a new assumption that replaces H2'. This change affects only the sequence of random variables $(N_{i,j})_{1\leq i,j\leq n}$.

Assumption 3.2.
  • H2”

    For $1\leq j\leq n$, there exists $r_{j}\geq 0$ and, for $1\leq j\leq n-1$, there exists $0\leq\delta_{j}\leq 1$ such that

\[
\begin{aligned}
N_{i,j}\mid\mathcal{F}_{i+j-2} &\sim \mathcal{NB}\left(r_{j}E_{i},p_{j}\right) \text{ in which } p_{j+1} := \frac{p_{j}}{1-\delta_{j}(1-p_{j})}, && 1\leq i\leq n,\\
D_{i,j+1}\mid\mathcal{F}_{i+j-1} &\sim \mathcal{B}(C_{i,j},\delta_{j}), && 1\leq i\leq n.
\end{aligned}
\]

Similarly, H2" implies H2 and H3 from Section 2. The structure of the pp’s may not initially appear clear or intuitive. However, the representation of these parameters will be clarified later. The free parameters are the (rj)1jn(r_{j})_{1\leq j\leq n} (which replace the λ\lambda’s from H2’), the (δj)1jn1(\delta_{j})_{1\leq j\leq n-1} and p1]0,1[p_{1}\in]0,1[. There is only one additional parameter, compared to H2’. This extra parameter governs the additional variance due to the specific configuration of the family (pj)1jn(p_{j})_{1\leq j\leq n}.

Remark 3.5.

The $p$'s can be explicitly expressed in terms of $p_{1}$ and the $\delta$'s:

\[
p_{j} = \frac{p_{1}}{1-(1-p_{1})\left[\sum_{k=1}^{j-1}\delta_{k}\prod_{\ell=1}^{k-1}(1-\delta_{\ell})\right]}, \qquad 1\leq j\leq n.
\]
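The remark can also be checked numerically: the sketch below (with an arbitrary $p_{1}$ of our choosing and the illustrative $\delta$'s of Section 5) compares the closed form above with the recursion $p_{j+1}=p_{j}/(1-\delta_{j}(1-p_{j}))$ from H2”.

```python
import numpy as np

p1 = 0.4                                           # arbitrary starting value
delta = np.array([0.5, 0.346, 0.217, 0.0, 0.154])  # delta_1..delta_{n-1}

def p_closed(j, p1, delta):
    # Closed form of Remark 3.5, with 1-based j.
    s = sum(delta[k - 1] * np.prod(1 - delta[: k - 1]) for k in range(1, j))
    return p1 / (1 - (1 - p1) * s)

p = [p1]
for d in delta:                                    # recursion from H2''
    p.append(p[-1] / (1 - d * (1 - p[-1])))
print([round(x, 6) for x in p])
print([round(p_closed(j, p1, delta), 6) for j in range(1, len(delta) + 2)])
```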

The remark can be verified by direct induction (see also the numerical check above). Maximum likelihood estimation of the $r$'s, the $\delta$'s, and $p_{1}$ does not yield explicit estimators. For the $\widehat{\delta}$'s, we can use the same estimator as the one defined in the Poisson framework in (2). When $p_{1}$ is known, we can estimate the $r$'s with a moment estimator based on the expected value and set:

\[
\widehat{r}_{j} := \widehat{\lambda}_{j}\,\frac{p_{j}}{1-p_{j}}, \qquad 1\leq j\leq n. \qquad (4)
\]

Finally, the estimate of $p_{1}$ can be computed numerically by maximum likelihood:

\[
\widehat{p}_{1} \in \operatorname*{arg\,max}_{p_{1}}\,\sum_{j=1}^{n}\sum_{i=1}^{n-j+1}\log f(N_{i,j},\widehat{r}_{j}E_{i},p_{j}), \qquad (5)
\]

in which $(n,r,p)\mapsto f(n,r,p)$ denotes the probability mass function of the Negative binomial distribution with parameters $(r,p)$, see (3).
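A sketch of the resulting profile likelihood over $p_{1}$, maximized here by a simple grid search (our own choice; a scalar optimizer would work equally well). `lam_hat` and `delta_hat` come from (2), and `N`, `E` are the observed triangle and exposures, with the same array conventions as in the earlier sketch:

```python
import numpy as np
from math import lgamma, log

def log_lik_p1(p1, N, E, lam_hat, delta_hat):
    """Log-likelihood (5) of the upper N-triangle as a function of p1."""
    n = len(E)
    p = [p1]                               # p_j via the recursion in H2''
    for d in delta_hat:
        p.append(p[-1] / (1 - d * (1 - p[-1])))
    ll = 0.0
    for j in range(n):
        if lam_hat[j] == 0:
            continue                       # degenerate column: all N_{i,j} = 0
        r_j = lam_hat[j] * p[j] / (1 - p[j])   # moment estimator (4)
        for i in range(n - j):
            r, k = r_j * E[i], N[i, j]
            ll += (lgamma(r + k) - lgamma(k + 1) - lgamma(r)
                   + r * log(p[j]) + k * log(1 - p[j]))
    return ll

def fit_p1(N, E, lam_hat, delta_hat):
    grid = np.linspace(0.01, 0.99, 99)
    return max(grid, key=lambda p1: log_lik_p1(p1, N, E, lam_hat, delta_hat))
```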

It remains to explain why we choose this form for the $p$'s. It is a consequence of the following result, which preserves a consistent form over the development years, as in the Poisson case. To establish the distribution of the $(C_{i,j})$, we begin with a classic lemma.

Lemma 3.6.

Let $N$ be a random variable with distribution $\mathcal{NB}(r,p)$ for $r>0$ and $0<p<1$, and let $D$ be a random variable such that $D\mid N\sim\mathcal{B}(N,\delta)$ for $0<\delta<1$. Then

\[
N-D \sim \mathcal{NB}(r,p^{\prime}) \quad\text{with}\quad p^{\prime} := \frac{p}{1-\delta(1-p)}.
\]
Proof.

Let $k\in\mathbb{N}$.

\[
\begin{aligned}
\mathbb{P}(N-D=k) &= \sum_{n\in\mathbb{N}}\mathbb{P}\left(D=n-k \mid N=n\right)\mathbb{P}\left(N=n\right)\\
&= \sum_{n\geq k}\frac{n!}{k!(n-k)!}\delta^{n-k}(1-\delta)^{k}\frac{\Gamma(r+n)}{n!\,\Gamma(r)}p^{r}(1-p)^{n}\\
&= \frac{(1-\delta)^{k}}{k!}p^{r}\sum_{n\geq 0}\frac{\delta^{n}}{n!}\frac{\Gamma(r+n+k)}{\Gamma(r)}(1-p)^{n+k}\\
&= \frac{[(1-\delta)(1-p)]^{k}}{k!}p^{r}\sum_{n\geq 0}\frac{\Gamma(r+n+k)}{n!\,\Gamma(r)}[\delta(1-p)]^{n}\\
&= \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\left[\frac{(1-\delta)(1-p)}{1-\delta(1-p)}\right]^{k}\left[\frac{p}{1-\delta(1-p)}\right]^{r}.
\end{aligned}
\]

The final line is obtained by noting that $\sum_{n\geq 0}\frac{\Gamma(r+k+n)}{n!\,\Gamma(r+k)}[1-\delta(1-p)]^{r+k}[\delta(1-p)]^{n}=1$. ∎

Proposition 3.7.

For all $1\leq i,j\leq n$, we have

\[
C_{i,j} \sim \mathcal{NB}\left(r_{j}^{\prime}E_{i},p_{j}\right),
\]

with

\[
r_{j}^{\prime} := \sum_{k=1}^{j}r_{k}.
\]
Proof.

We prove the result by induction. Let $1\leq i\leq n$ be fixed. By construction, $C_{i,1}=N_{i,1}\sim\mathcal{NB}(r_{1}E_{i},p_{1})$. Assume by induction that $C_{i,j}$ follows a Negative binomial distribution with parameters $(r_{j}^{\prime}E_{i},p_{j})$. Recall that

\[
C_{i,j+1} = C_{i,j} + N_{i,j+1} - D_{i,j+1}.
\]

Under H2”, by Lemma 3.6,

\[
C_{i,j} - D_{i,j+1} \sim \mathcal{NB}\left(r_{j}^{\prime}E_{i},p_{j+1}\right),
\]

and finally:

\[
C_{i,j+1} \sim \mathcal{NB}\left(r_{j+1}^{\prime}E_{i},p_{j+1}\right).
\]
∎

The form of the $p$'s can now be understood. Assuming that new claims above the priority follow a Negative binomial distribution and that some claims may later fall below the priority, we want the total to remain, at any point in time, consistent with a Negative binomial distribution. This requirement leads to the specific form of the $p$'s.

The $p$'s are increasing, so the excess variance shrinks over time. Notably, the probabilities $\delta$ of claims dropping below the priority influence the Negative binomial distribution, including for claims not yet reported in future development years: the more likely claims are to drop below the threshold, the faster the excess variance of the new claims decreases.

In particular, for the extremal cases: if $\delta_{j}=0$, meaning that no claim drops below the priority, the excess variance does not reduce; conversely, when $\delta_{j}\rightarrow 1$, meaning that all claims drop below the priority, the excess variance vanishes.

The preceding result provides the distribution of $C_{n+1,n}$ given the related exposure. Additionally, we may be interested in the distribution of $C_{i,j}\mid\mathcal{F}_{n}$ for $i+j>n+1$, as described in the following proposition.

Proposition 3.8.

For all $1\leq i,j\leq n$ such that $i+j>n+1$, we have

\[
C_{i,j}\mid\mathcal{F}_{n} \sim \mathcal{B}(C_{i,n-i+1},\delta_{i,j}^{\prime}) + \mathcal{NB}\left(r_{i,j}^{\prime}E_{i},p_{j}\right),
\]

where the right-hand side should be interpreted as the sum of two independent random variables, and with

\[
\begin{aligned}
\delta_{i,j}^{\prime} &:= \prod_{k=n-i+1}^{j-1}(1-\delta_{k}),\\
r_{i,j}^{\prime} &:= \sum_{k=n-i+2}^{j}r_{k}.
\end{aligned}
\]
Proof.

The proof follows the same reasoning as in Proposition 3.4, with the key difference being the application of Lemma 3.6 in place of Lemma 3.1. ∎

If we are interested in estimating $\mathbb{E}(C_{i,n}\mid\mathcal{F}_{n})$, using (4) for the estimators of the $r$'s leads to the same estimator as the one suggested after Proposition 3.4.

4 Bootstrap methodology

In [2], the authors discuss a bootstrap methodology for the general Schnieper model to resimulate the $\lambda$'s and $\delta$'s, accounting for uncertainty in the parameters. They also simulate a Gaussian random variable to integrate the internal randomness of the process for each development stage. This follows the main ideas of the non-parametric bootstrap as summarized in [1].

Here, we present a distinct approach that exploits the specific framework of claim numbers and proposes a fully parametric bootstrap methodology, without computing residuals or making any additional assumptions.

4.1 The Poisson case

Let $M\in\mathbb{N}^{*}$ be the total number of bootstrap simulations to be performed. To account for the inherent randomness of the $\lambda$'s and $\delta$'s, we resimulate them for each $1\leq m\leq M$. To do this efficiently, we shall use the following lemma.

Lemma 4.1.

Under H1 and H2’, we have:

\[
\begin{aligned}
\widehat{\lambda}_{j}\mid\mathcal{B}_{j-1} &\sim \frac{\mathcal{P}\left(\lambda_{j}\sum_{i=1}^{n-j+1}E_{i}\right)}{\sum_{i=1}^{n-j+1}E_{i}}, && 1\leq j\leq n,\\
\widehat{\delta}_{j}\mid\mathcal{B}_{j} &\sim \frac{\mathcal{B}\left(\sum_{i=1}^{n-j}C_{i,j},\delta_{j}\right)}{\sum_{i=1}^{n-j}C_{i,j}}, && 1\leq j\leq n-1,
\end{aligned}
\]

where $\mathcal{B}_{k} := \sigma(N_{i,j},D_{i,j}\mid i+j\leq n+1,\ j\leq k)$.

Proof.

Direct consequence of H1 and H2’. ∎

This provides a direct method to simulate the bootstrapped $\lambda$'s and $\delta$'s:

\[
\begin{aligned}
(\widehat{\lambda}_{j}^{m})_{1\leq m\leq M} &\overset{i.i.d.}{\sim} \frac{\mathcal{P}\left(\widehat{\lambda}_{j}\sum_{i=1}^{n-j+1}E_{i}\right)}{\sum_{i=1}^{n-j+1}E_{i}}, && 1\leq j\leq n,\\
(\widehat{\delta}_{j}^{m})_{1\leq m\leq M} &\overset{i.i.d.}{\sim} \frac{\mathcal{B}\left(\sum_{i=1}^{n-j}C_{i,j},\widehat{\delta}_{j}\right)}{\sum_{i=1}^{n-j}C_{i,j}}, && 1\leq j\leq n-1,
\end{aligned}
\]

where $(\widehat{\lambda}_{j})_{1\leq j\leq n}$ and $(\widehat{\delta}_{j})_{1\leq j\leq n-1}$ come from (2) and the $C$'s are the observed data.

Bootstrap simulation of $C_{n+1,n}$.

Following Proposition 3.2, the bootstrap simulation is performed as follows:

\[
C_{n+1,n}^{m} \sim \mathcal{P}\left(\widehat{\lambda}_{n}^{\prime m}E_{n+1}\right), \qquad 1\leq m\leq M.
\]

Bootstrap simulation of $C_{i,j}\mid\mathcal{F}_{n}$.

For the lower triangle, the simulation is conducted using:

\[
\begin{aligned}
C_{i,n-i+1}^{m} &:= C_{i,n-i+1}, && 1\leq i\leq n,\\
C_{i,j+1}^{m} &:= C_{i,j}^{m} + \mathcal{P}\left(\widehat{\lambda}_{j+1}^{m}E_{i}\right) - \mathcal{B}\left(C_{i,j}^{m},\widehat{\delta}_{j}^{m}\right), && n-i+1\leq j\leq n-1.
\end{aligned}
\]

On the right-hand side of the last equality, the difference of the two distributions should be interpreted as the difference of two independent random variables. This procedure generates a bootstrap distribution for the random variable $C_{i,j}\mid\mathcal{F}_{n}$ on the lower triangle. In this process, the uncertainty associated with the estimators of the parameters is integrated.
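The sketch below assembles one bootstrap replication of this procedure under our array conventions (observed $C$ with NaN below the anti-diagonal): it resamples the parameters through Lemma 4.1 and then rolls the lower triangle forward. Repeating it $M$ times yields the bootstrap distribution of any $C_{i,j}\mid\mathcal{F}_{n}$.

```python
import numpy as np

def bootstrap_poisson_once(C, E, lam_hat, delta_hat, rng):
    """One replication m: resampled parameters, then the lower triangle."""
    n = len(E)
    # Resampled lambda_hat^m and delta_hat^m (Lemma 4.1 with plug-in values).
    lam_m = np.array([rng.poisson(lam_hat[j] * E[: n - j].sum()) / E[: n - j].sum()
                      for j in range(n)])
    delta_m = np.empty(n - 1)
    for j in range(n - 1):
        tot = int(np.nansum(C[: n - j - 1, j]))
        delta_m[j] = rng.binomial(tot, delta_hat[j]) / tot
    # Roll the lower triangle forward from the latest observed diagonal.
    Cm = C.copy()
    for i in range(1, n):                 # accident years 2..n (0-based i)
        for j in range(n - i - 1, n - 1): # development columns to fill
            Cm[i, j + 1] = (Cm[i, j]
                            + rng.poisson(lam_m[j + 1] * E[i])
                            - rng.binomial(int(Cm[i, j]), delta_m[j]))
    return Cm
```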

Remark 4.2.

Based on Proposition 3.4, when our focus is on $C_{i,n}\mid\mathcal{F}_{n}$, we can efficiently simulate $C_{i,n}^{m}$ using

\[
C_{i,n}^{m} \sim \mathcal{B}(C_{i,n-i+1},\widehat{\delta}_{i,n}^{\prime m}) + \mathcal{P}\left(\widehat{\lambda}_{i,n}^{\prime m}E_{i}\right),
\]

where the right-hand side should be interpreted as the sum of two independent random variables, and with

\[
\begin{aligned}
\widehat{\delta}_{i,n}^{\prime m} &:= \prod_{k=n-i+1}^{n-1}(1-\widehat{\delta}_{k}^{m}),\\
\widehat{\lambda}_{i,n}^{\prime m} &:= \sum_{k=n-i+2}^{n}\widehat{\lambda}_{k}^{m}\left(\prod_{\ell=k}^{n-1}(1-\widehat{\delta}_{\ell}^{m})\right).
\end{aligned}
\]

This provides a more efficient algorithm.

4.2 The Negative binomial case

We extend the Poisson procedure to the Negative binomial framework, accounting for the variability of the parameters, namely the $r$'s, the $\delta$'s and $p_{1}$, for each $1\leq m\leq M$. We have the following lemma, similar to Lemma 4.1.

Lemma 4.3.

Under H1 and H2", we have:

\[
\begin{aligned}
\widehat{\lambda}_{j}\mid\mathcal{B}_{j-1} &\sim \frac{\mathcal{NB}\left(r_{j}\sum_{i=1}^{n-j+1}E_{i},p_{j}\right)}{\sum_{i=1}^{n-j+1}E_{i}}, && 1\leq j\leq n,\\
\widehat{\delta}_{j}\mid\mathcal{B}_{j} &\sim \frac{\mathcal{B}\left(\sum_{i=1}^{n-j}C_{i,j},\delta_{j}\right)}{\sum_{i=1}^{n-j}C_{i,j}}, && 1\leq j\leq n-1,
\end{aligned}
\]

where $\mathcal{B}_{k} := \sigma(N_{i,j},D_{i,j}\mid i+j\leq n+1,\ j\leq k)$.

Proof.

Direct consequence of H1 and H2". ∎

However, unlike the Poisson case, we cannot apply the above lemma straightforwardly, since it does not provide the distribution of $\widehat{p}_{1}$. Additionally, the estimator of $p_{1}$, defined in (5), depends non-trivially on the $N_{i,j}$.

Nonetheless, for the $\delta$'s, we can proceed as follows:

\[
(\widehat{\delta}_{j}^{m})_{1\leq m\leq M} \overset{i.i.d.}{\sim} \frac{\mathcal{B}\left(\sum_{i=1}^{n-j}C_{i,j},\widehat{\delta}_{j}\right)}{\sum_{i=1}^{n-j}C_{i,j}}, \qquad 1\leq j\leq n-1.
\]

For the $r$'s and $p_{1}$, we resimulate the upper triangle of the $N_{i,j}$:

\[
N_{i,j}^{m} \sim \mathcal{NB}\left(\widehat{r}_{j}E_{i},\widehat{p}_{j}\right), \qquad 1\leq i\leq n,\ 1\leq j\leq n-i+1.
\]

From these upper triangles, we can re-estimate the $r$'s and $p_{1}$; the resulting estimators are denoted $(\widehat{r}_{j}^{m})_{1\leq j\leq n}$ and $\widehat{p}_{1}^{m}$, respectively.
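A sketch of this resimulation step (our own code; the Negative binomial draw uses the Gamma-Poisson mixture representation, and the re-estimation then reuses the fitting routines sketched earlier):

```python
import numpy as np

def nb_rvs(r, p, rng):
    """One draw from NB(r, p) as in (3), via the Gamma-Poisson mixture."""
    if r == 0:
        return 0                      # degenerate case (e.g. r_hat_j = 0)
    return rng.poisson(rng.gamma(r, (1 - p) / p))

def resimulate_upper_triangle(E, r_hat, p_hat, rng):
    """Simulate N_{i,j}^m over the upper triangle i + j <= n + 1 (1-based)."""
    n = len(E)
    Nm = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(n - i):        # 0-based columns 0..n-i-1
            Nm[i, j] = nb_rvs(r_hat[j] * E[i], p_hat[j], rng)
    return Nm
# From Nm, recompute lambda_hat^m as in (2), then p_hat_1^m by (5) and
# r_hat_j^m by (4), exactly as on the observed data.
```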

Bootstrap simulation of $C_{n+1,n}$.

Following Proposition 3.7, the simulation is straightforward:

\[
C_{n+1,n}^{m} \sim \mathcal{NB}\left(\widehat{r}_{n}^{\prime m}E_{n+1},\widehat{p}_{n}^{m}\right), \qquad 1\leq m\leq M.
\]

Bootstrap simulation of $C_{i,j}\mid\mathcal{F}_{n}$.

For the lower triangle, we simulate:

\[
\begin{aligned}
C_{i,n-i+1}^{m} &:= C_{i,n-i+1}, && 1\leq i\leq n,\\
C_{i,j+1}^{m} &:= C_{i,j}^{m} + \mathcal{NB}\left(\widehat{r}_{j+1}^{m}E_{i},\widehat{p}_{j+1}^{m}\right) - \mathcal{B}\left(C_{i,j}^{m},\widehat{\delta}_{j}^{m}\right), && n-i+1\leq j\leq n-1.
\end{aligned}
\]

On the right-hand side of the last equality, the difference of the two distributions should be interpreted as the difference of two independent random variables. Similarly to the approach discussed in Remark 4.2 for the Poisson case, if our focus is solely on the distribution of $C_{i,n}\mid\mathcal{F}_{n}$, we can bypass simulating the entire lower triangle by using Proposition 3.8.

5 Example

We present two triangles of simulated data to illustrate both cases, with $n=6$ years of observations. For the first one, the exposure and the $C$ triangle are:

$i$  $E_{i}$  $C_{i,1}$  $C_{i,2}$  $C_{i,3}$  $C_{i,4}$  $C_{i,5}$  $C_{i,6}$
1  20  5  9  11  12  13  11
2  25  11  16  13  11  17
3  32  9  17  22  22
4  38  10  10  11
5  42  17  18
6  45  14

whose decomposition into $N$ and $D$ is:

$i$  $N_{i,1}$  $N_{i,2}$  $N_{i,3}$  $N_{i,4}$  $N_{i,5}$  $N_{i,6}$
1  5  4  5  2  1  0
2  11  9  4  4  6
3  9  14  9  3
4  10  7  5
5  17  10
6  14

$i$  $D_{i,1}$  $D_{i,2}$  $D_{i,3}$  $D_{i,4}$  $D_{i,5}$  $D_{i,6}$
1  0  0  3  1  0  2
2  0  4  7  6  0
3  0  6  4  3
4  0  7  4
5  0  9
6  0

From these, we can directly derive the $\widehat{\lambda}$'s and $\widehat{\delta}$'s.

$j$  1  2  3  4  5  6
$\widehat{\lambda}_{j}$  0.327  0.28  0.2  0.117  0.156  0
$\widehat{\delta}_{j}$  0.5  0.346  0.217  0  0.154

Let $E_{n+1}=50$ be the exposure for the upcoming year. Under the Poisson assumption, using the estimator of the intensity leads to

\[
C_{n+1,n} \sim \mathcal{P}\left(27.752\right).
\]

Under the Negative binomial assumption, utilizing the $\lambda$'s and the $\delta$'s and computing $p_{1}$ by maximum likelihood leads to the optimum $p_{1}\rightarrow 1$. In this limit, the Negative binomial distribution converges to the Poisson distribution: the Negative binomial assumption does not appear suitable here.

To account for the uncertainty in the unknown parameters, we can use the bootstrap procedure. The variance of $C_{n+1,n}$ is then around 53.361, which is notably higher than the variance obtained when using the Poisson distribution with the estimated parameter. Figure 1 shows the histogram of the distribution $\mathcal{P}\left(27.752\right)$ (lighter, on the left) compared to the distribution obtained from the bootstrap procedure (darker, on the right).

Figure 1: Comparison of the estimated distribution $\mathcal{P}\left(27.752\right)$ (on the left) and the associated bootstrap distribution (on the right); with $M=10^{7}$ simulations.

We now introduce a second set of triangles; the exposure and the $C$ triangle are:

$i$  $E_{i}$  $C_{i,1}$  $C_{i,2}$  $C_{i,3}$  $C_{i,4}$  $C_{i,5}$  $C_{i,6}$
1  20  8  4  12  12  14  13
2  25  3  5  7  10  15
3  32  5  10  11  9
4  38  27  20  29
5  42  23  18
6  45  14

whose decomposition into $N$ and $D$ is:

$i$  $N_{i,1}$  $N_{i,2}$  $N_{i,3}$  $N_{i,4}$  $N_{i,5}$  $N_{i,6}$
1  8  3  9  4  3  0
2  3  5  4  3  6
3  5  7  3  3
4  27  8  13
5  23  7
6  14

$i$  $D_{i,1}$  $D_{i,2}$  $D_{i,3}$  $D_{i,4}$  $D_{i,5}$  $D_{i,6}$
1  0  7  1  4  1  1
2  0  3  2  0  1
3  0  2  2  5
4  0  15  4
5  0  12
6  0

Again, we can directly derive the $\widehat{\lambda}$'s and $\widehat{\delta}$'s.

$j$  1  2  3  4  5  6
$\widehat{\lambda}_{j}$  0.396  0.191  0.252  0.13  0.2  0
$\widehat{\delta}_{j}$  0.591  0.231  0.3  0.091  0.071

Given an exposure of $E_{n+1}=50$ for the next year and assuming a Poisson distribution, we get

\[
C_{n+1,n} \sim \mathcal{P}\left(30.243\right).
\]

Computing $p_{1}$ by maximum likelihood no longer leads to $p_{1}\rightarrow 1$: we find $\widehat{p}_{1}=0.397$. This implies:

$j$  1  2  3  4  5  6
$\widehat{p}_{j}$  0.397  0.616  0.676  0.749  0.767  0.78
$\widehat{r}_{j}$  0.263  0.402  0.418  0.448  0.328  0.177

In particular, under the Negative binomial assumption,

\[
C_{n+1,n} \sim \mathcal{NB}\left(106.94,\,0.780\right).
\]

By construction, the expected value of $C_{n+1,n}$ remains at 30.243, but the variance increases to 38.796.

To choose between the two assumptions, we can calculate the log-likelihood and AIC for both cases. Table 1 presents the results.

  log-L.  AIC
$\mathcal{P}$  -53.937  119.875
$\mathcal{NB}$  -50.793  115.586
Table 1: Log-likelihood and AIC for both assumptions.
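For reference, a sketch of how such a comparison can be computed (our own code): the Poisson log-likelihood of the upper $N$-triangle below, the Negative binomial one via `log_lik_p1` from the earlier sketch evaluated at $\widehat{p}_{1}$, and $\mathrm{AIC}=2k-2\log L$ with $k=6$ free $\lambda$'s for the Poisson versus $k=7$ (the extra $p_{1}$) for the Negative binomial, consistent with the figures in Table 1.

```python
from math import lgamma, log

def poisson_loglik(N, E, lam_hat):
    """Log-likelihood of the upper N-triangle under H2'."""
    n = len(E)
    ll = 0.0
    for j in range(n):
        if lam_hat[j] == 0:
            continue                      # then P(N_{i,j} = 0) = 1 contributes 0
        for i in range(n - j):
            mu, k = lam_hat[j] * E[i], N[i, j]
            ll += k * log(mu) - mu - lgamma(k + 1)
    return ll

# aic_poisson = 2 * 6 - 2 * poisson_loglik(N, E, lam_hat)
# aic_negbin  = 2 * 7 - 2 * log_lik_p1(p1_hat, N, E, lam_hat, delta_hat)
```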

It appears that the Negative binomial distribution is the most suitable choice in this scenario. Given that the $\delta$'s are also part of the definition of the $p$'s, one might question whether it would be beneficial to estimate $(p_{1},(\delta_{j})_{1\leq j\leq n-1})$ simultaneously, using the data from both $N$ and $D$. Let $(\widetilde{p}_{1},(\widetilde{\delta}_{j})_{1\leq j\leq n-1})$ denote the new estimators. Table 2 provides a comparison, which shows that the difference is minimal.

$j$  1  2  3  4  5
$\widehat{\delta}_{j}$  0.591  0.231  0.3  0.091  0.071
$\widetilde{\delta}_{j}$  0.601  0.232  0.305  0.092  0.071

$j$  1  2  3  4  5  6
$\widehat{p}_{j}$  0.397  0.616  0.676  0.749  0.767  0.78
$\widetilde{p}_{j}$  0.393  0.619  0.679  0.753  0.77  0.783
Table 2: Comparison of the two estimation methods.

Figure 2 shows the distribution from the Poisson assumption (lighter, on the left) compared with the distribution from the Negative binomial assumption (darker, on the right).

Figure 2: Comparison of the estimated distribution $\mathcal{P}\left(30.243\right)$ (on the left) and the estimated distribution $\mathcal{NB}\left(106.939,0.780\right)$ (on the right).

Figure 3 illustrates the bootstrap distribution from the Poisson assumption (lighter, on the left) and from the Negative binomial assumption (darker, on the right). In each simulation using the Negative binomial approach, if the estimated $\widehat{p}_{1}^{m}$ was close to 1, suggesting that the Poisson distribution was a better fit, the simulation was conducted using the Poisson framework. The bootstrap results show that the variance of the distribution under the Poisson assumption is 62.633, while the variance under the Negative binomial assumption is 67.658.

Figure 3: Comparison of the bootstrap distribution obtained with the Poisson assumption (on the left) and with the Negative binomial assumption (on the right); with $M=10^{7}$ simulations.

Acknowledgments

The author acknowledges the financial support provided by the Fondation Natixis.

References

  • [1] P. D. England and R. J. Verrall. Predictive distributions of outstanding liabilities in general insurance. Annals of Actuarial Science, 1(2):221–270, 2006.
  • [2] Huijuan Liu and Richard Verrall. A bootstrap estimate of the predictive distribution of outstanding claims for the Schnieper model. ASTIN Bulletin, 39(2):677–689, 2009.
  • [3] Huijuan Liu and Richard Verrall. Predictive distributions for reserves which separate true IBNR and IBNER claims. ASTIN Bulletin, 39(1):35–60, 2009.
  • [4] Thomas Mack. Distribution-free calculation of the standard error of chain ladder reserve estimates. ASTIN Bulletin, 23(2):213–225, 1993.
  • [5] Esbjörn Ohlsson. Using separate exposure for IBNYR and IBNER in the chain ladder method, 2015.
  • [6] Esbjörn Ohlsson and Björn Wallberg-Beutelrock. Claims reserving using separate exposure for claims with and without a case reserve, 2022.
  • [7] R. Schnieper. Separating true IBNR and IBNER claims. ASTIN Bulletin, 21(1):111–127, 1991.