A debt behaviour model

Wenjun Zhang, John Holt

Figure 1: This diagram depicts the underlying causal structure of the model. See the text for the definitions of $D$, $Y$, $B$, $T$, $S$.

The model concerns the following random variables:

•

A discrete Markov process $B_{t}$ which records the behavioural state of the debtor during the time period $t$ - measured in months. The state is measured in the middle of each month.
•

A discrete-valued process $T_{t}$ which records the strongest debt management intervention that was applied to the debtor during the time period $t$ .
•

$R$ is an entity-specific variable giving the final result of the debtor’s most recent previous debt case: NA, paid in full, liquidation/bankruptcy, full write-off, or partial write-off.
•

$X_{t}$ is the economic state at time period $t$ . This measure is obtained by clustering a pertinent collection of economic variables: change in CPI, change in unemployment, change in the average weekly wage, etc. The underlying variables for $X_{t}$ vary quarterly, so $X_{t}$ is constant in blocks of three months.
•

$S_{t}$ is a latent discrete Markov process which categorizes debtors in a time period into the behavioural scheme that governs the generation of $B_{t}$ . The model supposes that $T_{t-1}$ influences $S_{t}$ , and hence influences $B_{t}$ indirectly.

•

$D_{t}$ is a positive real-valued variable, given by

$$D_{t}=\frac{\text{Debt amount at time $t$, including penalties and interest}}{\text{Largest amount of debt owed up to time $t$, excluding penalties and interest}}$$

•

$Y_{t}$ is a categorization of $D_{t}$ into $\{0,1\}$ - this is governed by a parameter $\alpha$ that needs to be inferred. The notion is that as a debtor gets closer to having paid in full, its probability of making a large lump-sum payment to clear its debt may change.
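As an illustration of the last two variables, the following is a minimal sketch of how $D_{t}$ and $Y_{t}$ might be computed from monthly balances, using the threshold rule $Y_{t}=0$ if and only if $D_{t}\leq\alpha$. The function names, the input layout, and the sample figures are assumptions made for this example and are not taken from the model's data.

```python
def debt_ratio(balance_with_charges, principal_owed):
    """Return D_t for each period of one debt case.

    balance_with_charges[t]: debt at period t, including penalties and interest.
    principal_owed[t]: debt at period t, excluding penalties and interest.
    Assumes the first principal amount is strictly positive.
    """
    ratios = []
    largest_principal = 0.0
    for balance, principal in zip(balance_with_charges, principal_owed):
        # Running maximum of the principal owed up to and including period t.
        largest_principal = max(largest_principal, principal)
        ratios.append(balance / largest_principal)
    return ratios


def categorize(ratios, alpha):
    """Return Y_t = 0 if D_t <= alpha, and Y_t = 1 otherwise."""
    return [0 if d <= alpha else 1 for d in ratios]


# Hypothetical four-month debt case.
balance = [1000.0, 1100.0, 600.0, 150.0]
principal = [1000.0, 1000.0, 550.0, 120.0]
D = debt_ratio(balance, principal)
Y = categorize(D, alpha=0.5)
```

Note that $D_{t}$ can exceed 1 when penalties and interest accumulate, as in the second month of the example.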

We introduce a set of parameters as follows:

•

$\alpha$ : defined by $Y_{t}:=0$ if and only if $D_{t}\leq\alpha$ .
•

$Q_{S}$ : a list of transition matrices, one for each combination of values of $R,X_{t},T_{t-1}$ .
•

$\pi_{S}$ : a list of initial probabilities, one for each combination of values of $R,X_{t}$ .
•

$Q_{B}$ : a list of transition matrices, one for each combination of values of $Y_{t-1}$ and $S_{t}$ .
•

$\pi_{B}$ : a list of initial probabilities, one for each value of $S_{1}$ .

Figure 2 depicts the causal structure of the variables and the parameters - we have now expressed each of the variables as a vector of length as long as the number of observation periods.

Every debt case begins at a time period $u$ and ends at a time period $l$ . If the debt case is indexed by $i$ , then the beginning is $u_{i}$ and the end is $l_{i}$ . There will be observations of $T_{t}$ , $B_{t}$ , $D_{t}$ , and $X_{t}$ from $u_{i}$ through to $l_{i}$ .

Treating the latent states $S_{t}$ as if they were observed, the log-likelihood of a single debt case is maximized over the parameters by maximizing:

$$l_{0}=\sum_{t=u+1}^{l}\left(\ln Q_{B}^{S_{t},Y_{t-1}}(B_{t-1},B_{t})+\ln Q_{S}^{T_{t-1},X_{t},R}(S_{t-1},S_{t})\right)+\ln\pi_{B}^{S_{u}}(B_{u})+\ln\pi_{S}^{X_{u},R}(S_{u})$$
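The expression for $l_{0}$ can be evaluated directly once a path of latent states is supplied. The sketch below does this for one debt case; the dictionary layout of the parameters, the function name, and the toy numbers are assumptions chosen for illustration, and all probabilities are assumed to be strictly positive.

```python
import math


def complete_log_likelihood(B, S, Y, T, X, R, Q_B, Q_S, pi_B, pi_S):
    """Evaluate l_0 for one debt case.

    Sequences are lists indexed 0..n-1, standing for periods u..l.
    Assumed (hypothetical) parameter layout:
      Q_B[(s, y)][b][c], Q_S[(T, X, R)][p][q], pi_B[s][b], pi_S[(X, R)][s].
    """
    # Initial terms: ln pi_B^{S_u}(B_u) + ln pi_S^{X_u,R}(S_u).
    total = math.log(pi_B[S[0]][B[0]]) + math.log(pi_S[(X[0], R)][S[0]])
    for t in range(1, len(B)):
        # Behaviour transition, governed by the current scheme and previous Y.
        total += math.log(Q_B[(S[t], Y[t - 1])][B[t - 1]][B[t]])
        # Scheme transition, governed by previous treatment, current X, and R.
        total += math.log(Q_S[(T[t - 1], X[t], R)][S[t - 1]][S[t]])
    return total


# Toy parameters with two schemes and two behavioural states.
uniform = [[0.5, 0.5], [0.5, 0.5]]
Q_B = {(s, y): uniform for s in (0, 1) for y in (0, 1)}
Q_S = {(0, 0, 0): [[0.8, 0.2], [0.3, 0.7]]}
pi_B = [[0.5, 0.5], [0.5, 0.5]]
pi_S = {(0, 0): [0.6, 0.4]}

l0 = complete_log_likelihood(
    B=[0, 1, 1], S=[0, 0, 1], Y=[1, 0, 0], T=[0, 0, 0], X=[0, 0, 0], R=0,
    Q_B=Q_B, Q_S=Q_S, pi_B=pi_B, pi_S=pi_S,
)
```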

We apply the EM algorithm to $l_{0}$ , taking the expected value of $l_{0}$ conditional on $\{B_{t},X_{t},D_{t},T_{t},R\}$ and the $k$ -th iteration of the parameters $\{\alpha,Q_{B},Q_{S},\pi_{B},\pi_{S}\}$ , $\Theta^{k}$ .

For this we define the responsibilities for each debt case, $i$ , and time $t$ , $t=u_{i},\ldots,l_{i}$ :

$$\gamma_{i,t}(s):=p(S_{t}=s\mid T_{u_{i}}^{l_{i}-1},X_{u_{i}}^{l_{i}},B_{u_{i}}^{l_{i}},R_{i},D_{u_{i}}^{l_{i}-1})$$

for $t\geq u_{i}$ ; and for $t>u_{i}$ ,

$$\Gamma_{i,t}(p,q):=p(S_{t-1}=p,S_{t}=q\mid T_{u_{i}}^{l_{i}-1},X_{u_{i}}^{l_{i}},B_{u_{i}}^{l_{i}},R_{i},D_{u_{i}}^{l_{i}-1})$$

It is clear that $\gamma_{i,t}(s)=\sum_{p}\Gamma_{i,t}(p,s)$ , or if $t=u_{i}$ , $\gamma_{i,u_{i}}(s)=\sum_{q}\Gamma_{i,u_{i}+1}(s,q)$ - hence we need only compute $\Gamma_{i,t}$ .

This is done using the Forward-Backward algorithm:

1 Calculating $\Gamma_{i,t}$

This calculation is standard, but we present it for completeness.

Define the following four sets of probabilities:

•

$\pi_{t}(s)=p(S_{t}=s|T_{u}^{l-1},X_{u}^{l},R,D_{u}^{l-1},B_{u}^{l})$
•

$\pi_{t}^{\prime}(s)=p(S_{t}=s|T_{u}^{t-1},X_{u}^{t},R,D_{u}^{t-1},B_{u}^{t})$ , $t\geq u$ .
•

$F_{t}(p,q)=p(S_{t-1}=p,S_{t}=q|T_{u}^{t-1},X_{u}^{t},R,D_{u}^{t-1},B_{u}^{t})$ , $t>u$
•

$\Gamma_{t}(p,q)=p(S_{t-1}=p,S_{t}=q|T_{u}^{l-1},X_{u}^{l},R,D_{u}^{l-1},B_{u}^{l})$ , $t>u$ .

Then

$$\begin{aligned}
F_{t}(p,q) &\propto Q_{B}^{q,Y_{t-1}}(B_{t-1},B_{t})\,Q_{S}^{T_{t-1},X_{t},R}(p,q)\,\pi_{t-1}^{\prime}(p)\\
&= \left(Q_{B}^{q,0}(B_{t-1},B_{t})I_{[0,\alpha]}(D_{t-1})+Q_{B}^{q,1}(B_{t-1},B_{t})I_{(\alpha,\infty)}(D_{t-1})\right)Q_{S}^{T_{t-1},X_{t},R}(p,q)\,\pi_{t-1}^{\prime}(p)
\end{aligned}$$

and

$$\pi_{t}^{\prime}(q)=\sum_{p}F_{t}(p,q)$$

with $\pi_{u}^{\prime}(s)\propto\pi_{B}^{s}(B_{u})\pi_{S}^{X_{u},R}(s)$ . The normalizing constants can be found by noting that $\sum_{p,q}F_{t}(p,q)=1$ and $\sum_{s}\pi_{u}^{\prime}(s)=1$ .
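The forward recursion above can be sketched in plain Python as follows. This is an illustrative implementation under assumed data structures (parameters stored in dictionaries keyed by tuples, sequences indexed from 0 for period $u$), not the authors' code; the function name and toy numbers are hypothetical.

```python
def forward(B, Y, T, X, R, Q_B, Q_S, pi_B, pi_S, n_states):
    """Forward pass for one debt case.

    Returns (filtered, F) where filtered[t][s] plays the role of pi'_t(s)
    and F[t][p][q] the role of F_t(p, q). F[0] is None because F_t is only
    defined for t > u. Assumed layout: Q_B[(s, y)][b][c],
    Q_S[(T, X, R)][p][q], pi_B[s][b], pi_S[(X, R)][s].
    """
    states = range(n_states)
    # Initialisation: pi'_u(s) proportional to pi_B^s(B_u) * pi_S^{X_u,R}(s).
    weights = [pi_B[s][B[0]] * pi_S[(X[0], R)][s] for s in states]
    norm = sum(weights)
    filtered = [[w / norm for w in weights]]
    F = [None]
    for t in range(1, len(B)):
        q_s = Q_S[(T[t - 1], X[t], R)]
        joint = [
            [Q_B[(q, Y[t - 1])][B[t - 1]][B[t]] * q_s[p][q] * filtered[t - 1][p]
             for q in states]
            for p in states
        ]
        # Normalise so that the entries of F_t sum to one.
        norm = sum(sum(row) for row in joint)
        joint = [[v / norm for v in row] for row in joint]
        F.append(joint)
        # pi'_t(q) is the column sum of F_t.
        filtered.append([sum(joint[p][q] for p in states) for q in states])
    return filtered, F


# Toy example: two schemes, two behavioural states, one (T, X, R) combination.
Q_B = {
    (0, 0): [[0.9, 0.1], [0.8, 0.2]], (0, 1): [[0.9, 0.1], [0.8, 0.2]],
    (1, 0): [[0.2, 0.8], [0.1, 0.9]], (1, 1): [[0.2, 0.8], [0.1, 0.9]],
}
Q_S = {(0, 0, 0): [[0.7, 0.3], [0.4, 0.6]]}
pi_B = [[0.9, 0.1], [0.2, 0.8]]
pi_S = {(0, 0): [0.5, 0.5]}

filtered, F = forward(
    B=[0, 1, 1], Y=[0, 0, 0], T=[0, 0, 0], X=[0, 0, 0], R=0,
    Q_B=Q_B, Q_S=Q_S, pi_B=pi_B, pi_S=pi_S, n_states=2,
)
```

Normalising at every step, rather than once at the end, also keeps the quantities from underflowing on long debt cases.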

Having obtained $F_{t}(p,q)$ (the forward matrices) we can calculate the backward matrices $\Gamma_{t}$ as follows:

Set $\Gamma_{l}=F_{l}$ .

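For $u<t<l$ , write $\pi_{t}(q)=\sum_{r}\Gamma_{t+1}(q,r)$ , which is available from the previous backward step. Then

$$\begin{aligned}
\Gamma_{t}(p,q) &= p(S_{t-1}=p\mid S_{t}=q,T_{u}^{l-1},X_{u}^{l},R,D_{u}^{l-1},B_{u}^{l})\,p(S_{t}=q\mid T_{u}^{l-1},X_{u}^{l},R,D_{u}^{l-1},B_{u}^{l})\\
&= p(S_{t-1}=p\mid S_{t}=q,T_{u}^{t-1},X_{u}^{t},R,D_{u}^{t-1},B_{u}^{t})\,\pi_{t}(q)\\
&= F_{t}(p,q)\frac{\pi_{t}(q)}{\pi_{t}^{\prime}(q)}
\end{aligned}$$

The second equality holds because, given $S_{t}$ , the earlier state $S_{t-1}$ is conditionally independent of the observations made after period $t$ .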
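The backward recursion can be sketched as below, taking the forward quantities $\pi_{t}^{\prime}$ and $F_{t}$ as inputs. The smoothed marginal $\pi_{t}(q)$ is obtained at each step as the row sum $\sum_{r}\Gamma_{t+1}(q,r)$. The function name, the list layout, and the hand-made forward matrices are assumptions for illustration only.

```python
def backward(filtered, F):
    """Backward pass for one debt case.

    filtered[t][s] is pi'_t(s) and F[t][p][q] is F_t(p, q) for t >= 1
    (F[0] is unused). Returns (smoothed, Gamma) where smoothed[t][s] is
    pi_t(s) and Gamma[t][p][q] is Gamma_t(p, q) for t >= 1.
    """
    n = len(filtered)
    states = range(len(filtered[0]))
    Gamma = [None] * n
    smoothed = [None] * n
    # At the final period the smoothed and filtered marginals coincide,
    # so the first pass through the loop gives Gamma_l = F_l.
    smoothed[n - 1] = list(filtered[n - 1])
    for t in range(n - 1, 0, -1):
        Gamma[t] = [
            [F[t][p][q] * smoothed[t][q] / filtered[t][q]
             if filtered[t][q] > 0 else 0.0
             for q in states]
            for p in states
        ]
        # pi_{t-1}(p) is the row sum of Gamma_t.
        smoothed[t - 1] = [sum(Gamma[t][p][q] for q in states) for p in states]
    return smoothed, Gamma


# Hand-made forward quantities for a three-period case with two schemes.
F = [None, [[0.3, 0.2], [0.1, 0.4]], [[0.1, 0.2], [0.3, 0.4]]]
filtered = [[0.6, 0.4], [0.4, 0.6], [0.4, 0.6]]
smoothed, Gamma = backward(filtered, F)
```

The responsibilities $\gamma_{i,t}$ used in the M-step correspond to `smoothed` in this sketch.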

2 Update equations for the M-step

The formulas that follow are the result of straightforward calculations.

$$\begin{aligned}
Q_{B}^{s,y}(b,c) &= \frac{\sum_{i}\sum_{t=u_{i}+1}^{l_{i}}\delta(B_{i,t}-c)\,\delta(B_{i,t-1}-b)\,\delta(Y_{i,t-1}-y)\,\gamma_{i,t}(s)}{\sum_{i}\sum_{t=u_{i}+1}^{l_{i}}\delta(B_{i,t-1}-b)\,\delta(Y_{i,t-1}-y)\,\gamma_{i,t}(s)}\\
\pi_{B}^{s}(b) &= \frac{\sum_{i}\delta(B_{i,u_{i}}-b)\,\gamma_{i,u_{i}}(s)}{\sum_{i}\gamma_{i,u_{i}}(s)}\\
Q_{S}^{T,X,R}(p,q) &= \frac{\sum_{i}\sum_{t=u_{i}+1}^{l_{i}}\delta(T_{i,t-1}-T)\,\delta(X_{t}-X)\,\delta(R_{i}-R)\,\Gamma_{i,t}(p,q)}{\sum_{i}\sum_{t=u_{i}+1}^{l_{i}}\delta(T_{i,t-1}-T)\,\delta(X_{t}-X)\,\delta(R_{i}-R)\,\gamma_{i,t-1}(p)}\\
\pi_{S}^{X,R}(s) &= \frac{\sum_{i}\delta(R_{i}-R)\,\delta(X_{u_{i}}-X)\,\gamma_{i,u_{i}}(s)}{\sum_{i}\delta(R_{i}-R)\,\delta(X_{u_{i}}-X)}
\end{aligned}$$
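To make the weighted-count structure of these updates concrete, here is a minimal sketch of the $Q_{B}$ and $\pi_{B}$ updates only. Each debt case is assumed to be a dictionary holding its behaviour sequence, its $Y$ sequence, and its responsibilities; this layout, the function name, and the toy data are hypothetical. Combinations that never occur in the data are simply absent from the returned dictionaries.

```python
from collections import defaultdict


def update_behaviour_parameters(cases, n_states):
    """Return (Q_B, pi_B) as dictionaries of weighted relative frequencies.

    cases: list of dicts with keys 'B', 'Y' and 'gamma', where
    gamma[t][s] is the responsibility of scheme s at period t.
    Q_B is keyed by (s, y, b, c) and pi_B by (s, b).
    """
    num = defaultdict(float)       # (s, y, b, c) -> weighted transition count
    den = defaultdict(float)       # (s, y, b)    -> weighted origin count
    init_num = defaultdict(float)  # (s, b)       -> weighted initial count
    init_den = defaultdict(float)  # s            -> total initial weight
    for case in cases:
        B, Y, gamma = case["B"], case["Y"], case["gamma"]
        for s in range(n_states):
            init_num[(s, B[0])] += gamma[0][s]
            init_den[s] += gamma[0][s]
            for t in range(1, len(B)):
                num[(s, Y[t - 1], B[t - 1], B[t])] += gamma[t][s]
                den[(s, Y[t - 1], B[t - 1])] += gamma[t][s]
    Q_B = {k: v / den[k[:3]] for k, v in num.items() if den[k[:3]] > 0}
    pi_B = {k: v / init_den[k[0]] for k, v in init_num.items() if init_den[k[0]] > 0}
    return Q_B, pi_B


# Two toy debt cases with two schemes.
cases = [
    {"B": [0, 1], "Y": [1, 1], "gamma": [[0.5, 0.5], [0.5, 0.5]]},
    {"B": [0, 0], "Y": [1, 1], "gamma": [[0.5, 0.5], [0.25, 0.75]]},
]
Q_B, pi_B = update_behaviour_parameters(cases, n_states=2)
```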

Note that $Q_{B}$ depends on an unknown value of $\alpha$ . The approach will be to fit $Q_{B}$ for a range of values of $\alpha$ , and to choose the $\alpha$ that gives the maximum value to:

$$l_{1}=\sum_{i}\sum_{t=u_{i}+1}^{l_{i}}\sum_{s}\ln\left(Q_{B}^{s,0}(B_{i,t-1},B_{i,t})I_{[0,\alpha]}(D_{i,t-1})+Q_{B}^{s,1}(B_{i,t-1},B_{i,t})I_{(\alpha,\infty)}(D_{i,t-1})\right)\gamma_{i,t}(s)$$
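The selection of $\alpha$ can be sketched as a simple grid search. The code below evaluates $l_{1}$ for a given $\alpha$ and picks the best candidate; it assumes that the EM fit for each candidate has already been run elsewhere and that its results are passed in, so the `fits` mapping, the case layout, and the toy numbers are all hypothetical. For brevity the toy example reuses the same $Q_{B}$ for both candidates, whereas the procedure described above refits $Q_{B}$ for each $\alpha$.

```python
import math


def l1(cases, Q_B, alpha):
    """Evaluate l_1 for one candidate alpha.

    cases: list of dicts with keys 'B', 'D' and 'gamma' (gamma[t][s]).
    Q_B[(s, y)][b][c] is assumed strictly positive wherever it is used.
    """
    total = 0.0
    for case in cases:
        B, D, gamma = case["B"], case["D"], case["gamma"]
        for t in range(1, len(B)):
            # The indicator functions select y = 0 or y = 1 from D_{t-1}.
            y = 0 if D[t - 1] <= alpha else 1
            for s, weight in enumerate(gamma[t]):
                total += weight * math.log(Q_B[(s, y)][B[t - 1]][B[t]])
    return total


def choose_alpha(fits):
    """fits maps each candidate alpha to the (Q_B, cases) fitted with it."""
    return max(fits, key=lambda a: l1(fits[a][1], fits[a][0], a))


# Toy example with one debt case and two candidate thresholds.
case = {"B": [0, 1], "D": [0.3, 0.2], "gamma": [[0.5, 0.5], [0.5, 0.5]]}
Q_B = {
    (0, 0): [[0.2, 0.8], [0.5, 0.5]], (1, 0): [[0.2, 0.8], [0.5, 0.5]],
    (0, 1): [[0.6, 0.4], [0.5, 0.5]], (1, 1): [[0.6, 0.4], [0.5, 0.5]],
}
fits = {0.1: (Q_B, [case]), 0.5: (Q_B, [case])}
best_alpha = choose_alpha(fits)
```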
