
Mathematical Models for Describing and Predicting the COVID-19 Pandemic Crisis

Cintra, P. H. P.
Instituto de Física
Universidade de Brasilia
Brasilia, DF, Brasil, 70910-900
pedrohpc96@hotmail.com
Citeli, M. F.
Instituto de Física
Universidade de Brasilia
Brasilia, DF, Brasil, 70910-900
miguelciteli@gmail.com
Fontinele, F. N.
Instituto de Física
Universidade de Brasilia
Brasilia, DF, Brasil, 70910-900
feradofogo@hotmail.com
Abstract

The present article studies extensions of two deterministic models for describing the novel coronavirus pandemic, the SIR model and the SEIR model. The models were studied and compared to real data in order to support the validity of each description and to extract important information regarding the pandemic, such as the basic reproduction number $\mathcal{R}_{0}$, which might provide useful information concerning the rate of increase of the pandemic predicted by each model. We then make predictions and compare more complex models derived from the SEIR model with the SIRD model, in order to find the most suitable one for describing and predicting the pandemic. The aim is to answer whether the simple SIRD model is able to make reliable predictions and deliver suitable information compared to more complex models.

Keywords COVID-19 · coronavirus · SEIR model · SIR model · epidemic model

1 Introduction

Back in 2015, a group of researchers described the potential for a SARS-like coronavirus circulating in bats to emerge in humans [1]. In early 2020 the world faced a new pandemic caused by the novel coronavirus SARS-CoV-2, which belongs to the Betacoronavirus genus and probably originated in bats [2]. The first cases of the novel virus date back to December 2019 at the Food Market of Wuhan, China [3], where bats are sold among other exotic animals. Since then the virus has spread throughout China and later Asia, Europe, Africa and America, causing a global economic crisis and being declared a pandemic by the World Health Organization on March 11th.

COVID-19 is a respiratory disease caused by the SARS-CoV-2 virus, currently in sustained human-to-human transmission [4]. Contamination occurs mainly through close interaction with infected individuals, as the viral load is carried by respiratory droplets that can remain suspended in the air or be deposited on surfaces of common contact. Since this is a novel strain of the Coronaviridae family, no individual is expected to have antibodies against it, which makes the entire population susceptible to infection. Once an individual is exposed to the virus, the incubation period begins, with no symptoms and a small chance of contaminating others. At symptom onset, the infected individual shows symptoms in a varied range of intensities and may develop severe acute respiratory syndrome. COVID-19 has a general mortality rate below 5% [5], with an average of 2.3%. The behavior of the disease is age dependent: the higher-risk group is the older population, with a mortality rate of 8% for individuals between 70 and 79 years and 14.8% for people older than 80 years [6]. However, even with a low mortality rate, the number of hospitalizations is quite high, with 5% of the cases being critical and 14% being severe [7], which presents a challenge to the health care systems of some countries.

Mathematical models predicted the potential for an international outbreak early on [8] and described how Wuhan became the center of an epidemic crisis in China. The outbreak quickly spread throughout mainland China and other countries. Although the pandemic began in Wuhan, today the United States of America is its epicenter.

Many models are widely used, from scenario prediction [9, 10, 11, 12] to data analysis [13]. Among them, the most used are the SIR and SEIR models [14], including modifications that add hospitalizations, asymptomatic cases and other compartments [15]. We develop here a comparative study between modified SIR and SEIR models. To assess the accuracy of both models, we compare the fitting process of the simple SIRD and simple SEIRD models, and the accuracy of the predictions generated by the simple SIRD model and by the SEIRD model with age division, using data from Germany and the Republic of Korea. These countries were chosen because their data represent the true scale and dynamics of the pandemic accurately; countries with a much lower testing rate, such as Brazil [16], are not reliable sources for testing models describing the disease. There are also countries, such as Taiwan and Iceland, which kept track of the disease; however, the number of cases in those countries was much lower, falling outside the deterministic regime of the models described here.

2 Theory

Mathematical models for disease epidemics are either deterministic or stochastic [17], where the former may be considered a sort of thermodynamic limit of the latter. The analogy is with thermodynamics: given a large enough number of particles in a gas, for example, one finds deterministic equations, given by the laws of thermodynamics, that describe the behavior of the gas without the need to know the exact behavior of each particle. In contrast, when the number of molecules is low, or when one tries to compute too many interactions between particles of the system, random behavior and fluctuations become important and one arrives at a stochastic model.

We describe here simple extensions of models frequently used in the literature [18, 19, 10] and use them to show some possible behaviors of a disease outbreak.

2.1 SIRD

A simple mathematical model for a disease epidemic can be built by dividing the population into three groups: susceptible individuals ($S$), infected individuals ($I$) and recovered individuals ($R$). The model composed of these groups is called the SIR model. In this article, however, we also consider individuals who have died from the disease, denoted by $D$. Following the same arguments as for the SIR model, the SIRD model can be described by the set of four differential equations:

$$\frac{dS}{dt}=-\frac{\beta}{N}I(t)S(t) \tag{1}$$
$$\frac{dI}{dt}=\frac{\beta}{N}I(t)S(t)-(\gamma+\mu)I(t) \tag{2}$$
$$\frac{dR}{dt}=\gamma I(t) \tag{3}$$
$$\frac{dD}{dt}=\mu I(t) \tag{4}$$

The last equation is easily understood by noting that the rate of change of the number of deaths is taken to be proportional to the number of infected individuals, with proportionality constant $\mu$. The constants $\gamma$ and $\beta$ are, respectively, the recovery rate and the infection rate. The rates $\mu$ and $\gamma$ are given in terms of the infection fatality rate (IFR) or the case fatality rate (CFR), that is, the fraction of people who contracted the disease and died relative to the total number of infections (IFR) or to the registered number of infections (CFR), and of the average time taken from symptom onset to recovery, $\tau_{r}$, or death, $\tau_{d}$: formally, $\mu=P_{CFR}/\tau_{d}$ and $\gamma=(1-P_{CFR})/\tau_{r}$. The equations are simply a mathematical way to describe how individuals pass from one group to another according to the following chain of events: a susceptible individual becomes infected by the virus and, from this point, either dies or recovers (Figure 1).

[Diagram: Susceptible $S(t)$ to Infected $I(t)$ with flow $\beta I(t)S(t)$; Infected $I(t)$ to Recovered $R(t)$ with flow $\gamma I(t)$; Infected $I(t)$ to Dead $D(t)$ with flow $\mu I(t)$.]
Figure 1: Representation of the SIRD model: a susceptible person gets infected and either dies or recovers from the disease.
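The four equations above can be integrated numerically. The following is a minimal sketch, not the authors' implementation: it uses a hand-written fourth-order Runge-Kutta step in plain Python, the function names are hypothetical, and the parameter values are the illustrative ones from the Figure 11 caption rather than fitted values.

```python
def sird_rhs(state, beta, gamma, mu, n):
    """Right-hand side of equations (1)-(4)."""
    s, i, r, d = state
    new_infections = beta * i * s / n
    return (
        -new_infections,
        new_infections - (gamma + mu) * i,
        gamma * i,
        mu * i,
    )


def rk4_step(rhs, state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, *params)
    k2 = rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), *params)
    k3 = rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), *params)
    k4 = rhs(tuple(x + dt * k for x, k in zip(state, k3)), *params)
    return tuple(
        x + dt * (a + 2.0 * b + 2.0 * c + e) / 6.0
        for x, a, b, c, e in zip(state, k1, k2, k3, k4)
    )


def simulate_sird(beta, gamma, mu, n, i0, days, dt=0.1):
    """Integrate from S0 = N - I0, R0 = D0 = 0 and return the whole trajectory."""
    state = (n - i0, i0, 0.0, 0.0)
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = rk4_step(sird_rhs, state, dt, beta, gamma, mu, n)
        trajectory.append(state)
    return trajectory


# Illustrative values only (Figure 11 caption); I0 = 1 is an assumption.
traj = simulate_sird(beta=0.45, gamma=0.054, mu=0.0014, n=6_000_000, i0=1.0, days=200)
s, i, r, d = traj[-1]
print(f"final S={s:.0f} I={i:.0f} R={r:.0f} D={d:.0f} total={s + i + r + d:.0f}")
```

Because the four right-hand sides sum to zero, the printed total should stay equal to $N$ up to rounding, which is a quick sanity check on any implementation of the model.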

Summing the four equations, we get

$$S(t)+I(t)+R(t)+D(t)=\text{const}, \tag{5}$$

where the constant represents the total number of individuals, $N$. Before proceeding, we impose the initial condition that, as $t$ goes to zero, $I(t)=I_{0}$, $R(t)=D(t)=0$ and, therefore, $S(t)=S_{0}=N-I_{0}\approx N$. This assumption is based on the fact that the entire population is susceptible to the SARS-CoV-2 virus.

Since $R(t)$ and $D(t)$ are both updated daily in Germany and Korea, it is helpful to write $I(t)$ as a function of them in order to predict its behavior and obtain, for example, the maximum number of infected individuals. For reasons that will become clear shortly, we first write $I(t)$ in terms of $S(t)$. An intuitive step is to divide equation (2) by equation (1). Thus,

$$\frac{dI/dt}{dS/dt}=-1+k\frac{1}{S}, \tag{6}$$

where $k=(\gamma+\mu)/\beta$. Eliminating the time dependence, we get a separable differential equation, that is,

$$dI=-dS+k\frac{dS}{S}, \tag{7}$$

whose solution is easily verified to be

$$I(t)=-S(t)+k\ln{S(t)}+\text{const}. \tag{8}$$

Applying the initial condition, we obtain

$$I_{0}=-(N-I_{0})+k\ln(N-I_{0})+\text{const} \quad\rightarrow\quad \text{const}=N-k\ln(N-I_{0}). \tag{9}$$

Hence, equation (8) may be written as

$$I(t)=N-S(t)+k\ln\frac{S(t)}{N-I_{0}}. \tag{10}$$

We can see here that, depending on the combination of $\gamma$ and $\mu$, $I$ reaches 0 before the entire susceptible population $S$ becomes infected (Figure 2).

Figure 2: Plot of equation (10) with different combinations of $\gamma$ and $\mu$.
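As a worked consequence, the maximum number of infected individuals mentioned earlier can be read off from equations (6) and (10). This short derivation is a sketch that takes those equations at face value, with $k=(\gamma+\mu)/\beta$; it is not stated explicitly in the original text.

```latex
\frac{dI}{dS} = -1 + \frac{k}{S} = 0
\;\Rightarrow\; S_{\text{peak}} = k,
\qquad
I_{\max} = N - k + k\ln\frac{k}{N - I_{0}} .
```

The peak exists only when $k<N-I_{0}$, that is, when the infected population grows at the start of the outbreak; the logarithm is then negative, so $I_{\max}<N-k$.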

Next, we write $S(t)$ in terms of $R(t)$ and $D(t)$. For this purpose, we begin by dividing equation (1) by (3) and (1) by (4),

$$\frac{dS}{dR}=-\frac{\beta}{\gamma}S(t) \quad\text{and} \tag{11}$$
$$\frac{dS}{dD}=-\frac{\beta}{\mu}S(t). \tag{12}$$

Adding these two equations and writing $S(R,D)$ as $S(R,D)=f(R)g(D)$, we get

$$\frac{1}{f}\frac{df}{dR}+\frac{1}{g}\frac{dg}{dD}=-\beta\left(\frac{1}{\gamma}+\frac{1}{\mu}\right) \tag{13}$$

Since (13) is a separable equation, the well-known solution is given by

$$f(R)=Ae^{aR} \quad\text{and} \tag{14}$$
$$g(D)=Be^{bD}. \tag{15}$$

Therefore, $S(t)$ can be written as

$$S(t)=Ce^{aR(t)\,+\,bD(t)}, \tag{16}$$

where we absorbed both constants $A$ and $B$ into $C$. From the initial condition, we find that $C=N-I_{0}$. To find $a$ and $b$, we differentiate (16) with respect to time and require that the result reproduce equation (1). In this way, we see that

$$a\gamma+b\mu=-\beta. \tag{17}$$

On the other hand, substituting equations (14) and (15) in (13), we get

$$a+b=-\beta\left(\frac{1}{\gamma}+\frac{1}{\mu}\right). \tag{18}$$

Solving this system,

$$a=-\frac{\beta}{\gamma(1-\gamma/\mu)} \tag{19}$$
$$b=-\frac{\beta}{\mu(1-\mu/\gamma)} \tag{20}$$
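The solution of the linear system can be checked with exact rational arithmetic. The snippet below is a small sketch: the helper name `solve_a_b` and the numerical values are arbitrary choices made only to exercise the algebra, and the formulas require $\gamma\neq\mu$.

```python
from fractions import Fraction


def solve_a_b(beta, gamma, mu):
    """Closed-form a and b from equations (19) and (20); requires gamma != mu."""
    a = -beta / (gamma * (1 - gamma / mu))
    b = -beta / (mu * (1 - mu / gamma))
    return a, b


# Arbitrary exact test values, not epidemiological estimates.
beta, gamma, mu = Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)
a, b = solve_a_b(beta, gamma, mu)

# Both constraints of the linear system hold exactly.
assert a * gamma + b * mu == -beta              # equation (17)
assert a + b == -beta * (1 / gamma + 1 / mu)    # equation (18)
print(a, b)
```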

Hence, $I(t)$ can finally be written as

$$I(t)=N-(N-I_{0})e^{aR(t)+bD(t)}+\gamma\mu\frac{\gamma+\mu}{\gamma-\mu}\left(\frac{R(t)}{\gamma^{2}}-\frac{D(t)}{\mu^{2}}\right) \tag{21}$$

With this equation, we see that $I$ does not necessarily approach $N$ as $t\rightarrow\infty$: whether it does depends on the recovery and death rates.
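For reference, the last term of equation (21) follows from substituting equation (16) and the constants (19) and (20) into the logarithmic term of equation (10). The intermediate step, written out here as a sketch, is:

```latex
k\ln\frac{S(t)}{N-I_{0}}
= k\left[aR(t)+bD(t)\right]
= \frac{\gamma+\mu}{\beta}
  \left[\frac{\beta\mu}{\gamma(\gamma-\mu)}R(t)
       -\frac{\beta\gamma}{\mu(\gamma-\mu)}D(t)\right]
= \gamma\mu\,\frac{\gamma+\mu}{\gamma-\mu}
  \left(\frac{R(t)}{\gamma^{2}}-\frac{D(t)}{\mu^{2}}\right).
```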

The last important quantity extracted from this model is the basic reproduction number $\mathcal{R}_{0}$, given by [14]:

$$\mathcal{R}_{0}=\frac{\beta}{\gamma+\mu}. \tag{22}$$

This quantity is of vital importance for the study of a disease outbreak.
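To make the relation between the clinical quantities and $\mathcal{R}_{0}$ concrete, the sketch below evaluates $\gamma$, $\mu$ and equation (22). The function names are hypothetical; $P_{CFR}=2.3\%$, $\tau_{r}=10$ days and $\tau_{d}=18$ days are values quoted elsewhere in the text, while $\beta=0.3$ is a placeholder assumption and not a fitted value.

```python
def sird_rates(p_cfr, tau_r, tau_d):
    """gamma = (1 - P_CFR) / tau_r and mu = P_CFR / tau_d."""
    return (1.0 - p_cfr) / tau_r, p_cfr / tau_d


def r0_sird(beta, gamma, mu):
    """Basic reproduction number of the SIRD model, equation (22)."""
    return beta / (gamma + mu)


# beta = 0.3 is an illustrative placeholder, not a result of the paper.
gamma, mu = sird_rates(p_cfr=0.023, tau_r=10.0, tau_d=18.0)
print(round(r0_sird(beta=0.3, gamma=gamma, mu=mu), 2))
```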

2.2 SEIRD

Another possible deterministic mathematical model is the SEIRD model, in which we consider the population $N$ of a given region as divided into five groups. At time $t$, there are those who are susceptible to infection, $S(t)$; those who have already been exposed to the virus but do not yet present symptoms, $E(t)$; those who are infected and present symptoms, $I(t)$; those who have recovered from the disease, $R(t)$; and those who have died from the infection, $D(t)$. This model is a good approximation for a short epidemic, during which the population of a region is roughly constant. Also, since this is a deterministic model, we assume $N$ to be large compared to the number of people associated with the infection of a single person. Finally, we assume that people who recover from the disease acquire immunity and do not become susceptible again.

The rate of infection $\lambda$ is proportional to the number of people infected, $\lambda(t)=\beta I(t)$, where the constant $\beta$ represents the effectiveness of the infection. The rate of cure is $\gamma=P_{:)}\tau_{r}^{-1}$, where $P_{:)}$ is the probability of recovery and $\tau_{r}$ is the average time taken for an infected person to recover. Similarly, the rate of death is $\mu=P_{CFR}\tau_{d}^{-1}$, where $P_{CFR}=1-P_{:)}$ is the probability of death, given by the CFR, and $\tau_{d}$ is the average time taken for an infected person to die. Figure 3 gives a visual representation of the SEIRD model.

[Diagram: Susceptible $S(t)$ to Exposed $E(t)$ with flow $\beta I(t)S(t)+kE(t)S(t)$; Exposed $E(t)$ to Infected $I(t)$ with flow $cE(t)$; Infected $I(t)$ to Recovered $R(t)$ with flow $\gamma I(t)$; Infected $I(t)$ to Dead $D(t)$ with flow $\mu I(t)$.]
Figure 3: Representation of the SEIRD model: a susceptible person is exposed to the virus, becomes infected afterwards, and either dies or recovers from the disease.

The differential equations representing the evolution of the populations are given by

$$\frac{dS}{dt}=-\frac{(1-P_{exp})\beta}{N}I(t)S(t)-\frac{P_{exp}\beta}{N}E(t)S(t) \tag{23}$$
$$\frac{dE}{dt}=\frac{(1-P_{exp})\beta}{N}I(t)S(t)+\frac{P_{exp}\beta}{N}E(t)S(t)-cE(t) \tag{24}$$
$$\frac{dI}{dt}=cE(t)-\gamma I(t)-\mu I(t) \tag{25}$$
$$\frac{dR}{dt}=\gamma I(t) \tag{26}$$
$$\frac{dD}{dt}=\mu I(t) \tag{27}$$
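A numerical integration of equations (23)-(27) can be sketched in the same way as for the SIRD model. This is an illustrative implementation in plain Python with hypothetical function names; the parameters are the illustrative ones from the Figure 11 caption (with $P_{exp}=0.44$, corresponding to $k=0.44\beta$), and the initial values $I_{0}=1$, $E_{0}=10$ and $S_{0}=N-I_{0}-E_{0}$ are assumptions chosen so that the compartments add up to $N$.

```python
def seird_rhs(state, beta, p_exp, c, gamma, mu, n):
    """Right-hand side of equations (23)-(27)."""
    s, e, i, r, d = state
    from_infected = (1.0 - p_exp) * beta * i * s / n
    from_exposed = p_exp * beta * e * s / n
    return (
        -(from_infected + from_exposed),
        from_infected + from_exposed - c * e,
        c * e - (gamma + mu) * i,
        gamma * i,
        mu * i,
    )


def rk4_step(rhs, state, dt, params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * kx for x, kx in zip(state, k))

    k1 = rhs(state, *params)
    k2 = rhs(shifted(k1, 0.5 * dt), *params)
    k3 = rhs(shifted(k2, 0.5 * dt), *params)
    k4 = rhs(shifted(k3, dt), *params)
    return tuple(
        x + dt * (a + 2.0 * b + 2.0 * cc + d) / 6.0
        for x, a, b, cc, d in zip(state, k1, k2, k3, k4)
    )


def simulate_seird(params, n, i0, e0, days, dt=0.1):
    """Return the final state (S, E, I, R, D) and the peak of I(t)."""
    state = (n - i0 - e0, e0, i0, 0.0, 0.0)  # assumed initial condition
    peak_infected = state[2]
    for _ in range(int(round(days / dt))):
        state = rk4_step(seird_rhs, state, dt, params + (n,))
        peak_infected = max(peak_infected, state[2])
    return state, peak_infected


# (beta, P_exp, c, gamma, mu): illustrative values from the Figure 11 caption.
params = (0.45, 0.44, 1.0 / 5.1, 0.054, 0.0014)
final_state, peak = simulate_seird(params, n=6_000_000, i0=1.0, e0=10.0, days=300)
print(f"peak I = {peak:.0f}, final D = {final_state[4]:.0f}")
```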

We first turn our attention to the construction of an appropriate formula for calculating $\mathcal{R}_{0}$ with this model. For that, we follow the method derived in [20], which develops a mathematical generalization for writing $\mathcal{R}_{0}$ depending on the type of epidemiological model. $\mathcal{R}_{0}$ is defined as

$$\mathcal{R}_{0}=\rho(FV^{-1}) \tag{28}$$

where $\rho(X)$ denotes the spectral radius of the matrix $X$, that is, the largest absolute value of its eigenvalues. Both $F$ and $V$ are matrices of derivatives of the functions defining the behavior of the disease populations, taken with respect to each population compartment.

To obtain these matrices, we first note that the set of equations describing the dynamics of the SEIRD model can be expressed as follows. Let $\vec{x}$ be the vector of populations, that is, $\vec{x}=(x_{1},x_{2},x_{3},x_{4},x_{5})$, where $x_{1}=E$, $x_{2}=I$, $x_{3}=S$, $x_{4}=R$ and $x_{5}=D$. Analogously, $d\vec{x}/dt$ is the vector of the first derivatives. Then, we can write the dynamics of the populations as

$$\frac{d\vec{x}}{dt}=\mathcal{F}-\mathcal{V} \tag{29}$$

where $\mathcal{F}$ is the vector of new infections appearing in the disease populations due to contamination, and $\mathcal{V}$ is the input and output of members in all populations due to all other causes, such as recovery from the disease, development of symptoms after an incubation period, etc. In our case, since all newly infected members go to the $E$ population,

$$\mathcal{F}=\begin{pmatrix}(1-P_{exp})\beta IS+P_{exp}\beta ES\\ 0\\ 0\\ 0\\ 0\end{pmatrix} \tag{30}$$

while

$$\mathcal{V}=\begin{pmatrix}cE\\ -cE+\gamma I+\mu I\\ (1-P_{exp})\beta IS+P_{exp}\beta ES\\ -\gamma I\\ -\mu I\end{pmatrix}. \tag{31}$$

Now, the disease-free equilibrium (DFE), meaning that no disease is present, is given by the vector $\vec{x}_{0}=(0,0,S_{0},0,0)$, where $S_{0}=N$. According to [20], we can calculate $F$ and $V$ as

$$F=\left.\left(\frac{\partial\mathcal{F}_{i}}{\partial x_{j}}\right)\right|_{x=x_{0}}, \tag{32}$$
$$V=\left.\left(\frac{\partial\mathcal{V}_{i}}{\partial x_{j}}\right)\right|_{x=x_{0}}, \qquad 1\leq i,j\leq m, \tag{33}$$

where $x_{j}$ are the components of $\vec{x}$ related to the populations with the disease, in our case $E$ and $I$, and $m$ is the number of populations related to infectious individuals. Here $m=2$. Performing the derivatives, we conclude

$$F=\begin{pmatrix}P_{exp}\beta&(1-P_{exp})\beta\\ 0&0\end{pmatrix} \tag{35}$$
$$V=\begin{pmatrix}c&0\\ -c&\gamma+\mu\end{pmatrix} \tag{36}$$

The next step is to find the inverse of $V$. Fortunately, $V$ is a $2\times 2$ matrix and the formula for its inverse is straightforward:

$$V^{-1}=\frac{1}{c(\gamma+\mu)}\begin{pmatrix}\gamma+\mu&0\\ c&c\end{pmatrix}, \tag{37}$$

and we proceed to the last step of computing $FV^{-1}$ in order to obtain $\rho(FV^{-1})$ and find $\mathcal{R}_{0}$:

$$FV^{-1}=\frac{1}{c(\gamma+\mu)}\begin{pmatrix}P_{exp}\beta(\gamma+\mu)+(1-P_{exp})\beta c&(1-P_{exp})\beta c\\ 0&0\end{pmatrix}, \tag{38}$$

therefore, since $FV^{-1}$ is upper triangular with a vanishing second row, its only nonzero eigenvalue is its first diagonal entry, and we find

$$\mathcal{R}_{0}=\frac{P_{exp}\beta(\gamma+\mu)+(1-P_{exp})\beta c}{c(\gamma+\mu)} \tag{39}$$
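The next-generation calculation can be reproduced numerically from the matrices (35) and (36). The sketch below, in plain Python with a hypothetical function name, builds $F$ and $V$, inverts $V$, forms $FV^{-1}$ and returns its spectral radius; the input values are the illustrative ones from the Figure 11 caption, not fitted parameters.

```python
import cmath


def next_generation_r0(beta, p_exp, c, gamma, mu):
    """Spectral radius of F V^{-1} for the 2x2 matrices (35) and (36)."""
    f = [[p_exp * beta, (1.0 - p_exp) * beta], [0.0, 0.0]]
    v = [[c, 0.0], [-c, gamma + mu]]

    # Inverse of a 2x2 matrix via the adjugate formula.
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    v_inv = [
        [v[1][1] / det, -v[0][1] / det],
        [-v[1][0] / det, v[0][0] / det],
    ]

    # Matrix product K = F V^{-1}.
    k = [
        [sum(f[row][m] * v_inv[m][col] for m in range(2)) for col in range(2)]
        for row in range(2)
    ]

    # Eigenvalues of a 2x2 matrix from its trace and determinant.
    trace = k[0][0] + k[1][1]
    det_k = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det_k)
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))


r0 = next_generation_r0(beta=0.45, p_exp=0.44, c=1.0 / 5.1, gamma=0.054, mu=0.0014)
# First diagonal entry of the matrix in equation (38), for comparison.
first_entry = 0.44 * 0.45 * 5.1 + 0.56 * 0.45 / (0.054 + 0.0014)
print(r0, first_entry)
```

Setting `p_exp=0` removes transmission by exposed individuals, and the function then returns $\beta/(\gamma+\mu)$, the SIRD value of equation (22).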

With $\mathcal{R}_{0}$ in hand, we proceed to study some behaviors of this model.

The set of equations describing the model is subject to initial conditions. As $t\rightarrow 0$, $I(t)\rightarrow I_{0}$, $S(t)\rightarrow S_{0}=N-I_{0}$, $R(t)\rightarrow 0$, $D(t)\rightarrow 0$ and $E(t)\rightarrow E_{0}$, where $I_{0}$ is the initial number of infected, $E_{0}$ is the initial number of exposed in the population, and no deaths or recoveries are assumed at $t=0$.

2.3 Non-pharmaceutical intervention

Without vaccines or efficient medicine against the disease, non-pharmaceutical interventions are the only effective way to prevent further growth of the pandemic [13]. These interventions take different forms, such as social distancing, social isolation and lockdown of the population. Despite the differences, they all share the same objective: decreasing the infection rate $\beta$. It is convenient to include the effect of these interventions in the model when making predictions. Here, we model this effect by a logistic function, where $\beta$ starts at an initial value $\beta_{i}$ and, at some critical time $t_{c}$, an intervention is imposed and $\beta$ decreases to $\beta_{f}=P_{dec}\beta_{i}$, where $P_{dec}$ is the fraction of $\beta_{i}$ decreased by the intervention. In France, studies estimate that the intervention decreased $\beta_{i}$ by 77% [21]; therefore, $P_{dec}=0.77$ in France.

$$\beta(t)=\frac{(1-P_{dec})\beta_{i}}{1+\tau e^{t-t_{c}}}+P_{dec}\beta_{i} \tag{40}$$

where $\tau$ is a constant related to the time taken for the intervention to have the desired effect. This model reproduces the general behavior of interventions against the spread of the disease (Figure 4).

Figure 4: Visual representation of the effect of non-pharmaceutical interventions on the infection curve, depending on the efficiency of the intervention, given by $P_{dec}$.
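Equation (40) is straightforward to evaluate. The function below is a minimal sketch with a hypothetical name and placeholder arguments. As the equation is written, $\beta(t)$ tends to $\beta_{i}$ well before $t_{c}$ and to $P_{dec}\beta_{i}$ well after it. The clamp on the exponent is an implementation detail added here to avoid floating-point overflow and is not part of the model.

```python
import math


def beta_t(t, beta_i, p_dec, t_c, tau=1.0):
    """Time-dependent infection rate of equation (40)."""
    # Clamp the exponent so math.exp cannot overflow for very large t - t_c.
    growth = math.exp(min(t - t_c, 700.0))
    return (1.0 - p_dec) * beta_i / (1.0 + tau * growth) + p_dec * beta_i


# Placeholder values: beta_i = 0.5 per day, intervention at day 30.
for day in (0, 25, 30, 35, 60):
    print(day, round(beta_t(day, beta_i=0.5, p_dec=0.77, t_c=30.0), 4))
```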

2.4 Age division

Since the case fatality rate (CFR) of COVID-19 differs among age groups [6, 7, 22], we propose here a modification of both models that includes the age distribution of the population and the social aspects of close contact between its members. The modification is described as follows. Each compartment is divided into $M$ age groups, where the $i$-th group has an associated $P_{CFR_{i}}$, that is, the probability of death for the $i$-th age group. The $\beta$ parameter is now described as the average number of daily contacts between a member of the $i$-th age group and members of the $j$-th age group, multiplied by the infection probability $P_{infc}$:

$$\beta_{i_{I}}=\sum_{j=1}^{M}C_{ij}I_{j}P_{infc}, \tag{41}$$
$$\beta_{i_{E}}=\sum_{j=1}^{M}C_{ij}E_{j}P_{infc}, \tag{42}$$

where $C_{ij}$ is called the social contact matrix, and we have included $I_{j}$ and $E_{j}$ inside $\beta$ in order to place everything in the same sum. The age distribution of the population is retrieved from the UN prospects [23], and the social contact matrices for those countries were measured in previous studies [24, 25]. A specific contact matrix for the Republic of Korea was not found; however, [26] finds evidence of cultural clusters in the world, where countries belonging to the same cluster share cultural similarities. We use this fact to justify the use of Hong Kong's social contact matrix to describe the Republic of Korea. In this way, we include cultural and population aspects for each of those countries, increasing the odds of a successful prediction. This type of model was used recently to describe the coronavirus outbreak in large cities in Brazil [27].
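The age-structured sums in equations (41) and (42) amount to a matrix-vector product. The sketch below illustrates this with a toy three-group contact matrix; the matrix, the group sizes and the value of $P_{infc}$ are invented for illustration and are not taken from [24, 25] or from the fits, and the function name is hypothetical.

```python
def force_of_infection(contact, infected, exposed, p_infc):
    """Return the lists beta_iI and beta_iE of equations (41) and (42)."""
    groups = range(len(contact))
    beta_i = [
        p_infc * sum(contact[i][j] * infected[j] for j in groups) for i in groups
    ]
    beta_e = [
        p_infc * sum(contact[i][j] * exposed[j] for j in groups) for i in groups
    ]
    return beta_i, beta_e


# Toy data: average daily contacts between three hypothetical age groups.
contact = [
    [8.0, 3.0, 1.0],
    [3.0, 6.0, 2.0],
    [1.0, 2.0, 3.0],
]
infected = [10.0, 20.0, 5.0]
exposed = [4.0, 8.0, 2.0]

beta_i, beta_e = force_of_infection(contact, infected, exposed, p_infc=0.16)
print(beta_i)
print(beta_e)
```

A targeted intervention such as school closure could be represented in this picture by scaling down the rows and columns of `contact` that involve the youngest group.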

3 Comparing Adjustments

To test the SIRD and SEIRD models, we first compare them to the pandemic in the Republic of Korea. Running a numerical solution of the differential equations (1)-(4) and (23)-(27), we adjust the general behavior of the populations to Korean data acquired from [28], starting on 15/02/2020. The data from the Republic of Korea consist of the infection curve and the death curve. To prevent problems with the initial guess in the fitting process, both models used the same initial-guess values, except for $E_{0}$, which appears only in the SEIRD model. $S_{0}=N$ was also left as a free parameter of the adjustment instead of being set to the total population of the country. This is justified by a limitation of both models: the population is assumed to be homogeneously spread, which does not correspond to reality. Thus, $N$ does not represent the total population; instead, it represents an effective population smaller than the total, due to the non-homogeneous distribution throughout the territory. The interpretation of $N$ as the disease evolves is addressed in the Discussion section. The parameter $k$ was chosen to be $k=0.44\beta$, based on a study which estimated that presymptomatic cases caused 44% of infections [29], while for $c$ we used an average of several clinical studies, shown in Table 1.

Incubation time    95% confidence interval    Reference
6.4 days           5.6 - 7.7                  [30]
5.2 days           4.1 - 7                    [3]
5 days             -                          [31]
4 days             -                          [32]
5.1 days           4.5 - 5.8                  [33]
Table 1: Incubation time of the disease according to other studies.
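As a small worked example, the incubation times of Table 1 can be averaged to obtain a value for $c$. This sketch assumes an unweighted mean and takes $c$ as the inverse of the mean incubation time, which is consistent with the value $c=1/5.1$ quoted in the Figure 11 caption; the exact averaging used by the authors is not stated.

```python
from statistics import mean

# Incubation times in days, copied from Table 1.
incubation_days = [6.4, 5.2, 5.0, 4.0, 5.1]

mean_incubation = mean(incubation_days)
c = 1.0 / mean_incubation  # assumed: rate of leaving the exposed compartment

print(f"mean incubation = {mean_incubation:.2f} days, c = {c:.3f} per day")
```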

Since the Korean government did not impose a lockdown or social isolation, we set $P_{dec}=0$ in both models. Figures 5 and 6 show the result of the fitting process, and Table 2 lists the values obtained for all parameters of each model.

Figure 5: Fit for the infected and deaths by SARS-CoV-2 in the Republic of Korea using the SEIRD model.
Figure 6: Fit for the infected and deaths by SARS-CoV-2 in the Republic of Korea using the SIRD model.
Parameter     SIRD               SEIRD
$\chi^{2}$    0.9978             0.9978
$\tau_{r}$    7.9 ± 0.3          8 ± 0.3
$\tau_{d}$    29.6 ± 0.2         28.8 ± 0.2
$I_{0}$       2 ± 1              1 ± 4
$E_{0}$       -                  62 ± 57
$\beta$       0.478 ± 0.004      0.513 ± 0.008
$N$           11035 ± 57         11218 ± 48
Table 2: Parameters found by the adjustment with both models.

The recovery time in both models is close to 8 days, while other studies such as [34] found 10 days. The time from symptom onset to death was close to 30 days in both models, being about 1 day shorter in the SEIRD model, whereas [35] and [36] found $\tau_{d}=18$ or $11$ days.

Comparing the accuracy of the fit with the data, both models resulted in the same value of $\chi^{2}$. The parameter $E_{0}$ presents a large margin of error, which is expected given the lack of real data concerning the exposed population.

Proceeding to the calculation of $\mathcal{R}_{0}$ for both models, using equations (39) and (22) we found

$$\mathcal{R}_{0,SEIRD}=1.92\pm 0.07 \tag{43}$$
$$\mathcal{R}_{0,SIRD}=2.98\pm 0.09 \tag{44}$$

The value of $\mathcal{R}_{0}$ according to other studies ranges from 2 to 3 [37, 38, 39, 40]; therefore, both models yield acceptable values for $\mathcal{R}_{0}$, with the SEIRD model predicting the lower one.

4 Prediction Accuracy with Age Division

We now proceed to test the prediction accuracy of both models. We used the first third of the data for fitting both models and extracting parameters; with these parameters, we then compare the prediction for the following days with the rest of the dataset.

4.1 Germany

For Germany, the fitting data correspond to the cases and deaths up to 17 March. However, until the peak is reached, both models yield a very large margin of error for $N$. To avoid this problem, we varied $N$ manually from 0 to 10% of the local population, taken from the United Nations prospects for the year 2020 [23], in steps of 0.05%. At each step, we fit the initial data and reject the fit if the $\chi^{2}$ value is lower than 0.995; this $\chi^{2}$ method for validating the goodness of a data adjustment has already been used for epidemiological models [41]. We then plotted the maximum and minimum acceptable fits to generate the margin of prediction and compared it with the complete dataset. We also decided to fix $\tau_{r}$ and $\tau_{d}$ according to clinical studies when performing the prediction, instead of leaving them as free parameters of the fit; $I_{0}$ was also determined a priori from the first registered number on 15/02/2020. The resulting free parameters for fitting the training set are $P_{infc}$ and $E_{0}$.
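The scan over $N$ described above can be sketched as a simple loop. In this illustration the fitting routine is replaced by a hypothetical callable `fit(n)` that returns a goodness-of-fit score; the toy score function and the population figure are invented only so that the example runs, and they stand in for an actual fit of the model to the training data.

```python
def scan_effective_population(total_population, fit, threshold=0.995,
                              max_fraction=0.10, step=0.0005):
    """Return (N_min, N_max) over the scanned N whose fit passes the threshold.

    `fit` is a hypothetical callable, fit(n) -> goodness-of-fit score.
    Returns None when no scanned value is accepted.
    """
    accepted = []
    n_steps = int(round(max_fraction / step))
    for k in range(1, n_steps + 1):  # N = 0 is skipped: it has no dynamics
        n = k * step * total_population
        if fit(n) >= threshold:
            accepted.append(n)
    if not accepted:
        return None
    return min(accepted), max(accepted)


def toy_score(n, best_n=3000.0):
    """Stand-in for a real fit: the score peaks at 1.0 when n equals best_n."""
    return 1.0 - 0.05 * ((n - best_n) / best_n) ** 2


n_min, n_max = scan_effective_population(1_000_000, toy_score)
print(n_min, n_max)
```

The two returned values play the role of $N_{min}$ and $N_{max}$, which bound the shaded prediction region in the figures.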

With the SEIRD model, we found the maximum and minimum values of $N$ to be $N_{min}=0.25\%$ and $N_{max}=0.40\%$ of the German population. The $P_{infc}$ parameter varied from 15.5% to 16.2%.

With the SIRD model, $\beta$ and $I_{0}$ were left as free parameters. The limit values of $N$ were $N_{min}=0.15\%$ and $N_{max}=0.2\%$, while $\beta$ varied from $0.247$ to $0.279$ and $I_{0}$ from 11 to 30. Figures 7 and 8 show the results of both simulations, with the maximum $N_{max}$ and minimum $N_{min}$ curves. The shaded region is the region between $N_{max}$ and $N_{min}$.

Figure 7: Prediction for Germany in comparison with real data using the SEIRD model with age division.
Figure 8: Prediction for Germany in comparison with real data using the simple SIRD model.

4.2 Korea

The training set consisted of 20 days, corresponding to the infections from 15/02 to 06/03. The SEIRD model found $N_{min}=0.03\%$ and $N_{max}=0.04\%$ of the total Korean population, while $P_{infc}$ varied between 80% and 85%.

The simple SIRD model found $N_{min}=0.02\%$ and $N_{max}=0.03\%$; $\beta$ ranged from 0.345 to 0.436, and $I_{0}$ was between 23 and 65. Figures 9 and 10 present the prediction results of both models.

Figure 9: Prediction for the Republic of Korea in comparison with real data using the SEIRD model with age division.
Figure 10: Prediction for the Republic of Korea in comparison with real data using the simple SIRD model.

5 Discussion

Regarding the adjustment process for acquiring parameters, there was no difference in the accuracy of the fit between the two models, and both yielded very close parameter values. However, $\tau_{d}$ is overestimated in both models, being slightly lower in the SEIRD model. The value of $\tau_{r}$ is acceptable within the variation of clinical measurements.

The SEIRD model yields a slower growth rate than the SIRD model. This might be due to the incubation period in the SEIRD model, which slows down the propagation of the virus to other individuals. The main difference between the growth rates predicted by the two models is better visualized in Figure 11: the incubation period slows down the rate of infection, as seen in the adjustments, but also decreases the peak of infections. However, the cumulative numbers of infections, deaths and recoveries are the same.

Figure 11: Comparison between the SIRD and SEIRD models. The parameters chosen were the same for both models, except for $c=1/5.1$ in the SEIRD model: $\beta=0.45$, $\gamma=0.054$, $\mu=0.0014$, $k=0.44\beta$ and $N=6000000$.

$N$ could be understood as the population susceptible to the first pandemic wave: due to the non-homogeneous distribution of the population, not everyone is susceptible to the disease right at the start. With such an interpretation, $N$ tends to increase with time and approach the total population. Comparing the predictions generated by the SIRD model with those of the SEIRD model with age division, the SEIRD model is slightly more precise, although both simulations fail to predict the slower decrease of the Korean data. This might be explained by an increase of $N$ as time passes, resulting in new registered cases and therefore slowing down the rate of decrease. Such a hypothesis is plausible, since the Republic of Korea did not adopt any lockdown or social isolation measure, which allowed the disease to propagate to other regions and increased $N$ with time. Even though its prediction is better, the SEIRD model is far more complicated than the SIRD model, and the use of the latter should probably not compromise any data analysis. The same should hold for the simple SIR and SEIR models, in which deaths are not a separate population and are instead represented by a removal rate for individuals.

The social isolation model developed here shows good results in the predictions, indicating that the description of $\beta$ should be close to reality. Here we find a major advantage of the SEIRD model with age division over the SIRD model: by including age division, it is possible to simulate the effect of specific non-pharmaceutical interventions, such as school closure, which in principle would decrease $\beta_{i_{I}}$ and $\beta_{i_{E}}$ only for the age groups between 0 and 19 years. Another possibility is to include the isolation of elderly individuals only. Several non-pharmaceutical measures have already been described in the literature [42], and other studies show how the total number of infected might change with the efficiency of non-pharmaceutical measures [43].

Other models might present a more complete analysis of the disease, including hospitalizations and even asymptomatic cases, which are difficult to track and seem to vary considerably from place to place: on the Diamond Princess cruise ship, 17.9% of infections were asymptomatic [44], while on an airplane flight 11.2% of cases were asymptomatic. In an Italian village, 50 to 70% of cases were asymptomatic [45]. There is also the problem of ensuring that the asymptomatic cases registered in studies are really asymptomatic and not presymptomatic, that is, people still in the incubation period.

Of course, any mathematical model is only as good as the data allow: using mathematical models to describe the disease in countries with low testing rates might yield unrealistic predictions. For example, [46] estimates that 86% of infections in China were undocumented at the early stages of the outbreak.

Another possibility we did not consider is reinfection, in which individuals leave the recovered group and re-enter the susceptible compartment. However, other coronaviruses belonging to the same genus Betacoronavirus, such as SARS-CoV and MERS-CoV, do not present a mutation rate high enough to cause reinfection in the short term [47], so the only cause of reinfection would be the loss of antibodies against the virus. For both diseases, the infected person acquires enough antibodies to prevent reinfection for a period of 2 to 3 years [48]. Given these considerations, we assumed that reinfection is not probable in the short term. Future studies may investigate the possibility of reinfection in the long term.

6 Conclusion

Mathematical models of a disease outbreak such as COVID-19 are able to predict the behavior of the infection. Both models have proved to be efficient tools for extracting information from the data and forecasting the future situation. Despite their limitations, the models yielded a value of $\mathcal{R}_0$ in good agreement with other studies, providing evidence in favor of their validity.

However, the present models do not take into account the spatial distribution of the population, which introduces uncertainties that widen the prediction window.

Age division does not change the predictions drastically, suggesting that SIRD models are sufficient for a simple prediction or analysis. The age-divided SEIRD model offers an advantage when simulations targeting specific groups of the population are required.

References

  • [1] Vineet D Menachery, Boyd L Yount Jr, Kari Debbink, Sudhakar Agnihothram, Lisa E Gralinski, Jessica A Plante, Rachel L Graham, Trevor Scobey, Xing-Yi Ge, Eric F Donaldson, et al. A sars-like cluster of circulating bat coronaviruses shows potential for human emergence. Nature medicine, 21(12):1508, 2015.
  • [2] Roujian Lu, Xiang Zhao, Juan Li, Peihua Niu, Bo Yang, Honglong Wu, Wenling Wang, Hao Song, Baoying Huang, Na Zhu, et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. The Lancet, 395(10224):565–574, 2020.
  • [3] Qun Li, Xuhua Guan, Peng Wu, Xiaoye Wang, Lei Zhou, Yeqing Tong, Ruiqi Ren, Kathy SM Leung, Eric HY Lau, Jessica Y Wong, et al. Early transmission dynamics in wuhan, china, of novel coronavirus–infected pneumonia. New England Journal of Medicine, 2020.
  • [4] World Health Organization et al. Coronavirus disease 2019 (covid-19): situation report, 67. 2020.
  • [5] Marco Cascella, Michael Rajnik, Arturo Cuomo, Scott C Dulebohn, and Raffaela Di Napoli. Features, evaluation and treatment coronavirus (covid-19). In StatPearls [Internet]. StatPearls Publishing, 2020.
  • [6] Vital Surveillances. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19)—china, 2020. China CDC Weekly, 2(8):113–122, 2020.
  • [7] Zunyou Wu and Jennifer M McGoogan. Characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention. Jama, 2020.
  • [8] Joseph T Wu, Kathy Leung, and Gabriel M Leung. Nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study. The Lancet, 395(10225):689–697, 2020.
  • [9] Jesús Fernández-Villaverde and Charles I Jones. Estimating and simulating a sird model of covid-19 for many countries, states, and cities. Technical report, National Bureau of Economic Research, 2020.
  • [10] Kentaro Iwata and Chisato Miyakoshi. A simulation on potential secondary spread of novel coronavirus in an exported country using a stochastic epidemic seir model. Journal of Clinical Medicine, 9(4):944, 2020.
  • [11] Affan Shoukat, Chad R Wells, Joanne M Langley, Burton H Singer, Alison P Galvani, and Seyed M Moghadas. Projecting demand for critical care beds during covid-19 outbreaks in canada. CMAJ, 192(19):E489–E496, 2020.
  • [12] Patrick GT Walker, Charles Whittaker, Oliver Watson, M Baguelin, KEC Ainslie, S Bhatia, S Bhatt, A Boonyasiri, O Boyd, L Cattarino, et al. The global impact of covid-19 and strategies for mitigation and suppression. WHO Collaborating Centre for Infectious Disease Modelling, MRC Centre for Global Infectious Disease Analysis, Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, 2020.
  • [13] Jonas Dehning, Johannes Zierenberg, F Paul Spitzner, Michael Wibral, Joao Pinheiro Neto, Michael Wilczek, and Viola Priesemann. Inferring change points in the spread of covid-19 reveals the effectiveness of interventions. Science, 2020.
  • [14] Cleo Anastassopoulou, Lucia Russo, Athanasios Tsakris, and Constantinos Siettos. Data-based analysis, modelling and forecasting of the covid-19 outbreak. PloS one, 15(3):e0230405, 2020.
  • [15] Sunhwa Choi and Moran Ki. Estimating the reproductive number and the outbreak size of covid-19 in korea. Epidemiology and Health, 42, 2020.
  • [16] Pedro Henrique Pinheiro Cintra and Felipe Fontinele Nunes. Estimative of real number of infections by covid-19 on brazil and possible scenarios. medRxiv, 2020.
  • [17] Marco Ferrante, Elisabetta Ferraris, and Carles Rovira. On a stochastic epidemic seihr model and its diffusion approximation. Test, 25(3):482–502, 2016.
  • [18] Igor Nesteruk. Estimations of the coronavirus epidemic dynamics in south korea with the use of sir model. [Preprint.] ResearchGate, 2020.
  • [19] Igor Nesteruk. Statistics based predictions of coronavirus 2019-ncov spreading in mainland china. MedRxiv, 2020.
  • [20] Pauline Van den Driessche and James Watmough. Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Mathematical biosciences, 180(1-2):29–48, 2002.
  • [21] Henrik Salje, Cécile Tran Kiem, Noémie Lefrancq, Noémie Courtejoie, Paolo Bosetti, Juliette Paireau, Alessio Andronico, Nathanaël Hoze, Jehanne Richet, Claire-Lise Dubost, et al. Estimating the burden of sars-cov-2 in france. Science, 2020.
  • [22] Sijia Tian, Nan Hu, Jing Lou, Kun Chen, Xuqin Kang, Zhenjun Xiang, Hui Chen, Dali Wang, Ning Liu, Dong Liu, et al. Characteristics of covid-19 infection in beijing. Journal of Infection, 2020.
  • [23] Department of Economic and Social Affairs. World population prospects 2019, 2019.
  • [24] Joël Mossong, Niel Hens, Mark Jit, Philippe Beutels, Kari Auranen, Rafael Mikolajczyk, Marco Massari, Stefania Salmaso, Gianpaolo Scalia Tomba, Jacco Wallinga, et al. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS medicine, 5(3), 2008.
  • [25] Kathy Leung, Mark Jit, Eric HY Lau, and Joseph T Wu. Social contact patterns relevant to the spread of respiratory infectious diseases in hong kong. Scientific reports, 7(1):1–12, 2017.
  • [26] Vipin Gupta, Paul J Hanges, and Peter Dorfman. Cultural clusters: Methodology and findings. Journal of world business, 37(1):11–15, 2002.
  • [27] Tarcisio M Rocha Filho, Fabiana S Ganem dos Santos, Victor B Gomes, Thiago AH Rocha, Julio HR Croda, Walter M Ramalho, and Wildo N Araujo. Expected impact of covid-19 outbreak in a major metropolitan area in brazil. medRxiv, 2020.
  • [28] WorldMeters. Coronavirus cases, 2020.
  • [29] Xi He, Eric HY Lau, Peng Wu, Xilong Deng, Jian Wang, Xinxin Hao, Yiu Chung Lau, Jessica Y Wong, Yujuan Guan, Xinghua Tan, et al. Temporal dynamics in viral shedding and transmissibility of covid-19. Nature medicine, pages 1–4, 2020.
  • [30] Jantien A Backer, Don Klinkenberg, and Jacco Wallinga. Incubation period of 2019 novel coronavirus (2019-ncov) infections among travellers from wuhan, china, 20–28 january 2020. Eurosurveillance, 25(5), 2020.
  • [31] Natalie M Linton, Tetsuro Kobayashi, Yichi Yang, Katsuma Hayashi, Andrei R Akhmetzhanov, Sung-mok Jung, Baoyin Yuan, Ryo Kinoshita, and Hiroshi Nishiura. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data. Journal of Clinical Medicine, 9(2):538, 2020.
  • [32] Wei-jie Guan, Zheng-yi Ni, Yu Hu, Wen-hua Liang, Chun-quan Ou, Jian-xing He, Lei Liu, Hong Shan, Chun-liang Lei, David SC Hui, et al. Clinical characteristics of coronavirus disease 2019 in china. New England Journal of Medicine, 2020.
  • [33] Stephen A Lauer, Kyra H Grantz, Qifang Bi, Forrest K Jones, Qulu Zheng, Hannah R Meredith, Andrew S Azman, Nicholas G Reich, and Justin Lessler. The incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application. Annals of internal medicine, 2020.
  • [34] Adam Bernheim, Xueyan Mei, Mingqian Huang, Yang Yang, Zahi A Fayad, Ning Zhang, Kaiyue Diao, Bin Lin, Xiqi Zhu, Kunwei Li, et al. Chest ct findings in coronavirus disease-19 (covid-19): relationship to duration of infection. Radiology, page 200463, 2020.
  • [35] Qiurong Ruan, Kun Yang, Wenxia Wang, Lingyu Jiang, and Jianxin Song. Clinical predictors of mortality due to covid-19 based on an analysis of data of 150 patients from wuhan, china. Intensive care medicine, pages 1–3, 2020.
  • [36] Pierfrancesco Barbariol, Antonino Bella, Stefania Bellino, Eva Benelli, Luigi Palmieri, and Xanthi Andrianou. Characteristics of sars-cov-2 patients dying in italy. report based on available data on may 14th. 2020.
  • [37] Shi Zhao, Qianyin Lin, Jinjun Ran, Salihu S Musa, Guangpu Yang, Weiming Wang, Yijun Lou, Daozhou Gao, Lin Yang, Daihai He, et al. Preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak. International Journal of Infectious Diseases, 92:214–217, 2020.
  • [38] Ying Liu, Albert A Gayle, Annelies Wilder-Smith, and Joacim Rocklöv. The reproductive number of covid-19 is higher compared to sars coronavirus. Journal of travel medicine, 2020.
  • [39] Sheng Zhang, MengYuan Diao, Wenbo Yu, Lei Pei, Zhaofen Lin, and Dechang Chen. Estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: A data-driven analysis. International Journal of Infectious Diseases, 93:201–204, 2020.
  • [40] Jonathan M Read, Jessica RE Bridgen, Derek AT Cummings, Antonia Ho, and Chris P Jewell. Novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions. medRxiv, 2020.
  • [41] SY Tang, YN Xiao, ZH Peng, and HB Shen. Prediction modeling with data fusion and prevention strategy analysis for the covid-19 outbreak. Zhonghua liu xing bing xue za zhi= Zhonghua liuxingbingxue zazhi, 41(4):480–484, 2020.
  • [42] Feng Lin, Kumar Muthuraman, and Mark Lawley. An optimal control theory approach to non-pharmaceutical interventions. BMC infectious diseases, 10(1):32, 2010.
  • [43] Neil M Ferguson, Daniel Laydon, Gemma Nedjati-Gilani, Natsuko Imai, Kylie Ainslie, Marc Baguelin, Sangeeta Bhatia, Adhiratha Boonyasiri, Zulma Cucunubá, Gina Cuomo-Dannenburg, et al. Impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand. 2020.
  • [44] Kenji Mizumoto, Katsushi Kagaya, Alexander Zarebski, and Gerardo Chowell. Estimating the asymptomatic proportion of coronavirus disease 2019 (covid-19) cases on board the diamond princess cruise ship, yokohama, japan, 2020. Eurosurveillance, 25(10):2000180, 2020.
  • [45] Michael Day. Covid-19: identifying and isolating asymptomatic people helped eliminate virus in italian village. Bmj, 368:m1165, 2020.
  • [46] Ruiyun Li, Sen Pei, Bin Chen, Yimeng Song, Tao Zhang, Wan Yang, and Jeffrey Shaman. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2). Science, 2020.
  • [47] Zhongming Zhao, Haipeng Li, Xiaozhuang Wu, Yixi Zhong, Keqin Zhang, Ya-Ping Zhang, Eric Boerwinkle, and Yun-Xin Fu. Moderate mutation rate in the sars coronavirus genome and its implications. BMC evolutionary biology, 4(1):21, 2004.
  • [48] Wei Liu, Arnaud Fontanet, Pan-He Zhang, Lin Zhan, Zhong-Tao Xin, Laurence Baril, Fang Tang, Hui Lv, and Wu-Chun Cao. Two-year prospective study of the humoral immune response of patients with severe acute respiratory syndrome. The Journal of infectious diseases, 193(6):792–795, 2006.