
Epidemiological Forecasting with Model Reduction of Compartmental Models.
Application to the COVID-19 pandemic.

Athmane Bakhta, Thomas Boiveau, Yvon Maday, Olga Mula
Abstract

We propose a forecasting method for predicting epidemiological health series on a two-week horizon at regional and interregional resolution. The approach is based on model order reduction of parametric compartmental models and is designed to accommodate small amounts of health data. The efficiency of the method is shown on the prediction of the number of infected and removed people during the two pandemic waves of COVID-19 in France, which took place approximately between February and November 2020. Numerical results illustrate the promising potential of the approach.

1 Introduction

Providing reliable epidemiological forecasts during an ongoing pandemic is crucial to mitigate the potentially disastrous consequences on global public health and the economy. As the ongoing COVID-19 pandemic sadly illustrates, this is a daunting task in the case of new diseases due to the incomplete knowledge of the behavior of the disease, and to the heterogeneities and uncertainties in the health data counts. Despite these difficulties, many forecasting strategies exist, and we can cast them into two main categories. The first one is purely data-based, and involves statistical and learning methods such as time series analysis, multivariate linear regression, grey forecasting or neural networks [7, 1, 11, 28, 19]. The second approach uses epidemiological models, which are appealing since they provide an interpretable insight into the mechanisms of the outbreak. They also provide high flexibility in the level of detail used to describe the evolution of the pandemic, ranging from simple compartmental models that divide the population into a few exclusive categories, to highly detailed descriptions involving numerous compartments or even agent-based models (see, e.g., [16, 23, 6] for general references on mathematical epidemiological models and [22, 10, 13] for some models focused on COVID-19). One salient drawback of using epidemiological models for forecasting purposes lies in the very high uncertainty in the estimation of the involved parameters. This is due to the fact that most of the time the parameters cannot be inferred from real observations, and the available data are insufficient or too noisy to provide any reliable estimation. The situation is aggravated by the fact that the number of parameters can quickly become large even in moderately simple compartmental models [10]. As a result, forecasting with these models involves making numerous a priori hypotheses which can sometimes hardly be justified by data observations.

In this paper, our goal is to forecast the time series of infected, removed and dead patients with compartmental models that involve as few parameters as possible, in order to infer them solely from the data. Although the available data are only given for hospitalized people, one can nevertheless estimate the total number of infected people through an adjustment factor suggested by the literature. Such a factor takes into account the proportion of asymptomatic people and of infected people who do not go to hospital. The model involving the least number of parameters is probably the SIR model [17], which is based on a partition of the population into:

  • Uninfected people, called susceptible (S),

  • Infected and contagious people (I), with more or less marked symptoms,

  • People removed (R) from the infectious process, either because they are cured or unfortunately died after being infected.

If $N$ denotes the total population size, which we assume to be constant over a certain time interval $[0,T]$, we have

N = S(t) + I(t) + R(t), \quad \forall t \in [0,T],

and the evolution from $S$ to $I$ and from $I$ to $R$ is given for all $t \in [0,T]$ by

\frac{dS}{dt}(t) = -\frac{\beta I(t) S(t)}{N}
\frac{dI}{dt}(t) = \frac{\beta I(t) S(t)}{N} - \gamma I(t)
\frac{dR}{dt}(t) = \gamma I(t).

The SIR model has only two parameters:

  • $\gamma>0$ represents the recovery rate. In other words, its inverse $\gamma^{-1}$ can be interpreted as the length (in days) of the contagious period.

  • $\beta>0$ is the transmission rate of the disease. It essentially depends on two factors: the contagiousness of the disease and the contact rate within the population. The larger this second factor is, the faster the transition from susceptible to infectious will be. As a consequence, the number of hospitalized patients may increase very fast, and may lead to a collapse of the health system [2]. Strong distancing measures like confinement can effectively act on this parameter [29, 12], helping to keep it low.

Our forecasting strategy is motivated by the following observation: by allowing the parameters $\beta$ and $\gamma$ to be time-dependent, we can find optimal coefficients $\beta^{*}(t)$ and $\gamma^{*}(t)$ that exactly fit any series of infected and removed patients. In other words, we can perfectly fit any observed health series with a SIR model having time-dependent coefficients.

As we explain later on in the paper, this high fitting power stems from the fact that the parameters $\beta$ and $\gamma$ are searched in $L^{\infty}([0,T],\mathbb{R}_{+})$, the space of essentially bounded measurable functions. For our forecasting purposes, this space is however too large to give any predictive power, and we need to find a smaller manifold that has simultaneously good fitting and forecasting properties. To this aim, we develop a method based on model order reduction. The idea is to find a space of reduced dimension that can host the dynamics of the current epidemic. This reduced space is learnt from a series of detailed compartmental models based on precise underlying mechanisms of the disease. One major difficulty with these models is to fit the correct parameters. In our approach, we do not seek to estimate them. Instead, we consider a large range of possible parameter configurations with a uniform sampling, which allows us to simulate virtual epidemic scenarios on a longer range than the fitting window $[0,T]$. We next cast each virtual epidemic from the family of detailed compartmental models into the family of SIR models with time-dependent coefficients, in a sense that we explain later on. This procedure yields time-dependent parameters $\beta$ and $\gamma$ for each detailed virtual epidemic. The set of all such $\beta$ (resp. $\gamma$) is then condensed into a reduced basis of small dimension. We finally use the available health data on the time window $[0,T]$ to find the functions $\beta$ and $\gamma$ from the reduced space that best fit the current epidemic over $[0,T]$. Since the reduced basis functions are defined over a longer time range $[0,T+\tau]$ with $\tau>0$ (say, e.g., two weeks), the strategy automatically provides forecasts from $T$ to $T+\tau$. Its accuracy will be related to the pertinence of the mechanistic mathematical models that have been used in the learning process.

We note that an important feature of our approach is that, in a first stage, all virtual simulations are considered equally important, and the procedure automatically learns which scenarios (or linear combinations of scenarios) best describe the available data. Moreover, the approach even mixes different compartmental models to accommodate these available data. This is in contrast to other existing approaches, which introduce a strong a priori belief on the quality of a certain particular model. Note also that we can add models related to other illnesses, and use the large manifold to fit a possible new epidemic. It is also possible to mix the current approach with other purely statistical or learning strategies by means of expert aggregation. One salient difference with these approaches, which is important to emphasize, is that our method hinges on highly detailed compartmental models which have clear epidemiological interpretations. Our methodology of collapsing into the time-dependent SIR model is a way of "summarizing" the dynamics into a few terms. One may expect that reducing the number of parameters comes at the cost of losing interpretability of the parameters, and this is in general true. Nevertheless, the numerical results of the present work show that a reasonable tradeoff between "reduction of the number of parameters" and "interpretability of these few parameters" can be achieved.

The paper is organized as follows. In Section 2, we present the forecasting method in the case of one single region with constant population. For this, we briefly introduce in Section 2.1 the epidemiological models involved in the procedure, namely the SIR model with time-dependent coefficients and the more detailed compartmental models used for the training step. In Section 2.2, after proving that the SIR model with time-dependent coefficients in $L^{\infty}([0,T])$ is able to fit any admissible epidemiological evolution (in a sense that will be made rigorous), we present the main steps of the forecasting method. The method involves a step collapsing detailed models into SIR models with time-dependent coefficients, and model reduction techniques. We detail these points in Sections 2.3 and 2.4. In Section 3, we explain how the method can easily be extended to a multi-regional context involving population mobility and regional health data observations (provided, of course, that mobility data are available). We start in Section 3.1 by clarifying that the nature of the mobility data dictates the kind of multi-regional SIR model to use in this context. In Section 3.2, we outline how to adapt the main steps of the method to the multi-regional case. Finally, in Section 4, we present numerical results for the two pandemic waves of COVID-19 in France, which took place approximately between February and November 2020. Concluding comments are given in Section 5, followed by two appendices, A and B, that give details on the treatment of noise in the data and on the forecasting error.

2 Methodology for one single region

For the sake of clarity, we first consider the case of one single region with constant population and no population fluxes with other regions. Here, the term region is generic and may be applied to very different geographical scales, ranging from a full country, to a department within the country, or even smaller partitions of a territory.

2.1 Compartmental models

The final output of our method is a mono-regional SIR model with time-dependent coefficients, as presented below. This SIR model with time-dependent coefficients will be evaluated with reduced modeling techniques involving a large family of models with finer compartments proposed in the literature. Before presenting the method in the next section, we introduce here all the models that we use, along with notation that will be useful in the rest of the paper.

SIR models with time-dependent parameters:

We will fit and forecast the series of infected and removed patients (dead and recovered) with SIR models where the coefficients $\beta$ and $\gamma$ are time-dependent:

\frac{dS}{dt}(t) = -\frac{\beta(t) I(t) S(t)}{N}
\frac{dI}{dt}(t) = \frac{\beta(t) I(t) S(t)}{N} - \gamma(t) I(t)
\frac{dR}{dt}(t) = \gamma(t) I(t).

In the following, we use bold-faced letters for past-time quantities. For example, ${\textbf{f}} \coloneqq \{f(t)\,:\,0\leq t\leq T\}$ for any function $f\in L^{\infty}([0,T])$. Using this notation, for any given ${\bm{\beta}}$ and ${\bm{\gamma}}\in L^{\infty}([0,T])$ we denote by

({\textbf{S}},{\textbf{I}},{\textbf{R}}) = \texttt{SIR}({\bm{\beta}},{\bm{\gamma}},[0,T])

the solution of the associated SIR dynamics on $[0,T]$.
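For concreteness, the following minimal sketch (in Python with NumPy/SciPy; the function name, grid size and parameter values are ours and purely illustrative) integrates the SIR system with time-dependent coefficients and plays the role of the map $\texttt{SIR}({\bm{\beta}},{\bm{\gamma}},[0,T])$.

    import numpy as np
    from scipy.integrate import solve_ivp

    def sir_time_dependent(beta, gamma, N, S0, I0, R0, T, num_points=200):
        """Integrate the SIR system with time-dependent coefficients beta(t), gamma(t)
        on [0, T] and return the trajectories S, I, R on a uniform time grid."""
        def rhs(t, y):
            S, I, R = y
            b, g = beta(t), gamma(t)
            return [-b * I * S / N, b * I * S / N - g * I, g * I]

        t_eval = np.linspace(0.0, T, num_points)
        sol = solve_ivp(rhs, (0.0, T), [S0, I0, R0], t_eval=t_eval)
        return sol.t, sol.y  # rows of sol.y are S, I, R

    # Constant coefficients are recovered as the special case beta(t) = 0.3, gamma(t) = 0.1.
    t, (S, I, R) = sir_time_dependent(lambda t: 0.3, lambda t: 0.1,
                                      N=1e6, S0=1e6 - 10, I0=10, R0=0, T=100)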

Detailed compartmental models:

Models involving many compartments offer a detailed description of epidemiological mechanisms, at the expense of involving many parameters. In our approach, we use them to generate virtual scenarios. One of the initial motivations behind the present work is to provide forecasts for the COVID-19 pandemic, so we have selected the two following models which are specific to this disease, but note that any other compartmental model [18, 10, 22] or agent-based simulation could also be used.

  • First model, SEI5CHRD: This model is inspired by the one proposed in [10]. It involves 11 different compartments and a set of 19 parameters. The dynamics of the model is illustrated in Figure 1 and the system of equations reads

    \frac{dS}{dt}(t) = -\frac{1}{N} S(t) \left(\beta_{p} I_{p}(t) + \beta_{a} I_{a}(t) + \beta_{ps} I_{ps}(t) + \beta_{ms} I_{ms}(t) + \beta_{ss} I_{ss}(t) + \beta_{H} H(t) + \beta_{C} C(t)\right)
    \frac{dE}{dt}(t) = \frac{1}{N} S(t) \left(\beta_{p} I_{p}(t) + \beta_{a} I_{a}(t) + \beta_{ps} I_{ps}(t) + \beta_{ms} I_{ms}(t) + \beta_{ss} I_{ss}(t) + \beta_{H} H(t) + \beta_{C} C(t)\right) - \varepsilon E(t)
    \frac{dI_{p}}{dt}(t) = \varepsilon E(t) - \mu_{p} I_{p}(t)
    \frac{dI_{a}}{dt}(t) = p_{a} \mu_{p} I_{p}(t) - \mu I_{a}(t)
    \frac{dI_{ps}}{dt}(t) = p_{ps}(1-p_{a}) \mu_{p} I_{p}(t) - \mu I_{ps}(t)
    \frac{dI_{ms}}{dt}(t) = p_{ms}(1-p_{a}) \mu_{p} I_{p}(t) - \mu I_{ms}(t)
    \frac{dI_{ss}}{dt}(t) = p_{ss}(1-p_{a}) \mu_{p} I_{p}(t) - \mu I_{ss}(t)
    \frac{dC}{dt}(t) = p_{c} \mu I_{ss}(t) - (\lambda_{C,R} + \lambda_{C,D}) C(t)
    \frac{dH}{dt}(t) = (1-p_{c}) \mu I_{ss}(t) - (\lambda_{H,R} + \lambda_{H,D}) H(t)
    \frac{dR}{dt}(t) = \lambda_{C,R} C(t) + \lambda_{H,R} H(t)
    \frac{dD}{dt}(t) = \lambda_{C,D} C(t) + \lambda_{H,D} H(t)

    The different parameters involved in the model are described in Table 2 and detailed in the appendix of [10].

    Figure 1: SEI5CHRD model (diagram of the compartments and transitions)
    Compartment   Description
    $S$           Susceptible
    $E$           Exposed (non infectious)
    $I_{p}$       Infected and pre-symptomatic (already infectious)
    $I_{a}$       Infected and asymptomatic (but infectious)
    $I_{ps}$      Infected and paucisymptomatic
    $I_{ms}$      Infected with mild symptoms
    $I_{ss}$      Infected with severe symptoms
    $H$           Hospitalized
    $C$           Intensive Care Unit
    $R$           Removed
    $D$           Dead
    Table 1: Description of the compartments in Model SEI5CHRD
    Parameter           Description
    $\beta_{p}$         Relative infectiousness of $I_{p}$
    $\beta_{a}$         Relative infectiousness of $I_{a}$
    $\beta_{ps}$        Relative infectiousness of $I_{ps}$
    $\beta_{ms}$        Relative infectiousness of $I_{ms}$
    $\beta_{ss}$        Relative infectiousness of $I_{ss}$
    $\beta_{H}$         Relative infectiousness of $H$
    $\beta_{C}$         Relative infectiousness of $C$
    $\varepsilon^{-1}$  Latency period
    $\mu_{p}^{-1}$      Duration of the prodromal phase
    $p_{a}$             Probability of being asymptomatic
    $\mu^{-1}$          Infectious period of $I_{a}$, $I_{ps}$, $I_{ms}$, $I_{ss}$
    $p_{ps}$            If symptomatic, probability of being paucisymptomatic
    $p_{ms}$            If symptomatic, probability of developing mild symptoms
    $p_{ss}$            If symptomatic, probability of developing severe symptoms (note that $p_{ps}+p_{ms}+p_{ss}=1$)
    $p_{C}$             If severe symptoms, probability of entering $C$
    $\lambda_{C,R}$     If in $C$, daily rate of entering $R$
    $\lambda_{C,D}$     If in $C$, daily rate of entering $D$
    $\lambda_{H,R}$     If hospitalized, daily rate of entering $R$
    $\lambda_{H,D}$     If hospitalized, daily rate of entering $D$
    Table 2: Description of the parameters involved in Model SEI5CHRD

    We denote by

    {\textbf{u}} = \texttt{SEI5CHRD}({\textbf{u}}_{0}, \beta_{p}, \beta_{a}, \beta_{ps}, \beta_{ms}, \beta_{ss}, \beta_{H}, \beta_{C}, \varepsilon, \mu_{p}, p_{a}, \mu, p_{ps}, p_{ms}, p_{ss}, p_{C}, \lambda_{C,R}, \lambda_{C,D}, \lambda_{H,R}, \lambda_{H,D}, [0,T])   (1)

    the parameter-to-solution map, where ${\textbf{u}} = ({\textbf{S}},{\textbf{E}},{\textbf{I}}_{p},{\textbf{I}}_{a},{\textbf{I}}_{ps},{\textbf{I}}_{ms},{\textbf{I}}_{ss},{\textbf{C}},{\textbf{H}},{\textbf{R}},{\textbf{D}})$.

  • Second model, SE2IUR: This model is a variant of the one proposed in [22]. It involves 6 different compartments and a set of 6 parameters. The dynamics of the model is illustrated in Figure 2 and the set of equations is

    \frac{dS}{dt}(t) = -\frac{1}{N}\beta S(t)\,(E_{2}(t) + U(t) + I(t))
    \frac{dE_{1}}{dt}(t) = \frac{1}{N}\beta S(t)\,(E_{2}(t) + U(t) + I(t)) - \delta E_{1}(t)
    \frac{dE_{2}}{dt}(t) = \delta E_{1}(t) - \sigma E_{2}(t)
    \frac{dI}{dt}(t) = \nu\sigma E_{2}(t) - \gamma_{1} I(t)
    \frac{dU}{dt}(t) = (1-\nu)\sigma E_{2}(t) - \gamma_{2} U(t)
    \frac{dR}{dt}(t) = \gamma_{1} I(t) + \gamma_{2} U(t)

    Figure 2: SE2IUR model (diagram of the compartments and transitions)

    We denote by

    {\textbf{u}} = \texttt{SE2IUR}({\textbf{u}}_{0}, \beta, \delta, \sigma, \nu, \gamma_{1}, \gamma_{2}, [0,T])

    the parameter-to-solution map, where ${\textbf{u}} = ({\textbf{S}},{\textbf{E}}_{1},{\textbf{E}}_{2},{\textbf{I}},{\textbf{U}},{\textbf{R}})$. The different parameters involved in the model are described in Table 4. (A runnable sketch of this model is given after this list.)

    Compartment   Description
    $S$           Susceptible
    $E_{1}$       Exposed (non infectious)
    $E_{2}$       Infected and pre-symptomatic (already infectious)
    $I$           Infected and symptomatic
    $U$           Un-noticed
    $R$           Dead and removed
    Table 3: Description of the compartments in Model SE2IUR
    Parameter      Description
    $\beta$        Relative infectiousness of $I$, $U$, $E_{2}$
    $\delta^{-1}$  Latency period
    $\sigma^{-1}$  Duration of the prodromal phase
    $\nu$          Proportion of $I$ among $I+U$
    $\gamma_{1}$   If in $I$, daily rate of entering $R$
    $\gamma_{2}$   If in $U$, daily rate of entering $R$
    Table 4: Description of the parameters involved in Model SE2IUR
  • Generalization: In the following, we abstract the procedure as follows. For any Detailed_Model with $d$ compartments involving a vector $\mu\in\mathbb{R}^{p}$ of $p$ parameters, we denote by

    {\textbf{u}}(\mu) = \texttt{Detailed\_Model}(\mu,[0,\widetilde{T}]), \quad {\textbf{u}}(\mu)\in L^{\infty}([0,\widetilde{T}],\mathbb{R}^{d})

    the parameter-to-solution map, where $\widetilde{T}$ is some given simulation time that can be taken as large as desired since this is a virtual scenario. Note that, because the state of the epidemic at time 0 is not known, we include the initial condition ${\textbf{u}}_{0}$ in the parameter set.
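As announced above, here is a minimal runnable sketch (in Python with SciPy; the function name and the parameter values are ours and serve only as placeholders, they are not calibrated values from the paper) of the SE2IUR parameter-to-solution map.

    import numpy as np
    from scipy.integrate import solve_ivp

    def se2iur(u0, beta, delta, sigma, nu, gamma1, gamma2, N, T, num_points=200):
        """Integrate the SE2IUR model on [0, T]; u0 = (S, E1, E2, I, U, R) at time 0."""
        def rhs(t, u):
            S, E1, E2, I, U, R = u
            force = beta * S * (E2 + U + I) / N   # force of infection
            return [-force,
                    force - delta * E1,
                    delta * E1 - sigma * E2,
                    nu * sigma * E2 - gamma1 * I,
                    (1.0 - nu) * sigma * E2 - gamma2 * U,
                    gamma1 * I + gamma2 * U]

        t_eval = np.linspace(0.0, T, num_points)
        sol = solve_ivp(rhs, (0.0, T), list(u0), t_eval=t_eval)
        return sol.t, sol.y

    # Placeholder parameter values, for illustration only.
    N = 1e6
    t, u = se2iur(u0=(N - 20, 10, 10, 0, 0, 0), beta=0.4, delta=0.2, sigma=0.3,
                  nu=0.6, gamma1=0.1, gamma2=0.1, N=N, T=150)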

2.2 Forecasting based on model reduction of detailed models

We assume that we are given health data on a time window $[0,T]$, where $T>0$ is the present time. The observed data are the series of infected people, denoted $I_{\text{obs}}$, and of removed people, denoted $R_{\text{obs}}$. They are usually given at a national or regional scale and on a daily basis. For our discussion, it will be useful to work with time-continuous functions, and $t\mapsto I_{\text{obs}}(t)$ will denote the piecewise constant approximation on $[0,T]$ built from the given data (and similarly for $R_{\text{obs}}(t)$). Our goal is to give short-term forecasts of the series on a time window of size $\tau>0$, which will be about two weeks. We denote by $I(t)$ and $R(t)$ the approximations to the series $I_{\text{obs}}(t)$ and $R_{\text{obs}}(t)$ at any time $t\in[0,T]$.

As already brought up, we propose to fit the data and provide forecasts with SIR models with time-dependent parameters ${\bm{\beta}}$ and ${\bm{\gamma}}$. The main motivation for using such a simple family is that it possesses optimal fitting and forecasting properties for our purposes, in the sense that we explain next. Defining the cost function

\mathcal{J}({\bm{\beta}},{\bm{\gamma}}\,|\,I_{\text{obs}}(t),R_{\text{obs}}(t),[0,T]) \coloneqq \int_{0}^{T}\left(|I(t)-I_{\text{obs}}(t)|^{2}+|R(t)-R_{\text{obs}}(t)|^{2}\right)\mathrm{d}t   (4)

such that

({\textbf{S}},{\textbf{I}},{\textbf{R}}) = \texttt{SIR}({\bm{\beta}},{\bm{\gamma}},[0,T]),

the fitting problem can be expressed at the continuous level as the optimal control problem of finding

J^{*} = \inf_{({\bm{\beta}},{\bm{\gamma}})\in L^{\infty}([0,T])\times L^{\infty}([0,T])} \mathcal{J}({\bm{\beta}},{\bm{\gamma}}\,|\,{\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}},[0,T]).   (5)

The following result ensures the existence of a unique minimizer under very mild constraints.

Proposition 2.1.

Let $N\in\mathbb{N}^{*}$ and $T>0$. For any real-valued functions $S_{\text{obs}}, I_{\text{obs}}, R_{\text{obs}}$ of class $\mathcal{C}^{1}$, defined on $[0,T]$ and satisfying

  (i) $S_{\text{obs}}(t)+I_{\text{obs}}(t)+R_{\text{obs}}(t)=N$ for every $t\in[0,T]$,

  (ii) $S_{\text{obs}}$ is nonincreasing on $[0,T]$,

  (iii) $R_{\text{obs}}$ is nondecreasing on $[0,T]$,

there exists a unique minimizer $({\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}})$ to problem (5).

Proof.

One can set

\beta^{*}_{\text{obs}}(t) \coloneqq -\dfrac{N}{I_{\text{obs}}(t)\,S_{\text{obs}}(t)}\,\dfrac{dS_{\text{obs}}}{dt}(t), \qquad \gamma^{*}_{\text{obs}}(t) \coloneqq \dfrac{1}{I_{\text{obs}}(t)}\,\dfrac{dR_{\text{obs}}}{dt}(t),   (6)

so that

({\textbf{S}}_{\text{obs}},{\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}}) = \texttt{SIR}({\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}},[0,T])

and

\mathcal{J}({\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}}\,|\,{\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}},[0,T]) = 0,

which implies that $J^{*}=0$. Uniqueness follows since any minimizer achieves $\mathcal{J}=0$, hence reproduces $I_{\text{obs}}$ and $R_{\text{obs}}$ (and thus $S_{\text{obs}}$) on $[0,T]$, and the SIR equations then determine ${\bm{\beta}}$ and ${\bm{\gamma}}$ almost everywhere through (6), provided $I_{\text{obs}}$ and $S_{\text{obs}}$ do not vanish. ∎

Note that one can equivalently set

\beta^{*}_{\text{obs}}(t) \coloneqq -\dfrac{N}{I_{\text{obs}}(t)\,S_{\text{obs}}(t)}\,\dfrac{dS_{\text{obs}}}{dt}(t), \qquad \gamma^{*}_{\text{obs}}(t) \coloneqq \dfrac{1}{I_{\text{obs}}(t)}\left[\dfrac{\beta^{*}_{\text{obs}}(t)\,I_{\text{obs}}(t)\,S_{\text{obs}}(t)}{N} - \dfrac{dI_{\text{obs}}}{dt}(t)\right]   (7)

or again

\gamma^{*}_{\text{obs}}(t) \coloneqq \dfrac{1}{I_{\text{obs}}(t)}\,\dfrac{dR_{\text{obs}}}{dt}(t), \qquad \beta^{*}_{\text{obs}}(t) \coloneqq \dfrac{N}{I_{\text{obs}}(t)\,S_{\text{obs}}(t)}\left[\dfrac{dI_{\text{obs}}}{dt}(t) + \gamma^{*}_{\text{obs}}(t)\,I_{\text{obs}}(t)\right]   (8)

This simple observation means that there exists a time-dependent SIR model which can perfectly fit the data of any (epidemiological) evolution satisfying properties (i), (ii) and (iii). In particular, we can perfectly fit the series of sick people with a time-dependent SIR model (modulo a smoothing of the local oscillations due to noise). Since the health data are usually given on a daily basis, we can approximate ${\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}}$ by approximating the derivatives in equation (6) by classical finite differences.
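As a concrete illustration of this last point, the following sketch (Python/NumPy; the function name is ours, a forward finite difference with a daily time step is assumed, and any smoothing of the raw data is supposed to be done beforehand) computes discrete versions of ${\bm{\beta}}^{*}_{\text{obs}}$ and ${\bm{\gamma}}^{*}_{\text{obs}}$ from formula (6).

    import numpy as np

    def beta_gamma_from_data(I_obs, R_obs, N):
        """Discrete analogue of formula (6): recover beta*(t) and gamma*(t) from
        daily series I_obs, R_obs using forward finite differences."""
        S_obs = N - I_obs - R_obs
        dS = np.diff(S_obs)                    # S(t+1) - S(t), daily step
        dR = np.diff(R_obs)
        beta_star = -N * dS / (I_obs[:-1] * S_obs[:-1])
        gamma_star = dR / I_obs[:-1]
        return beta_star, gamma_star

    # Synthetic daily counts, for illustration only.
    N = 1e6
    I_obs = np.array([100., 130., 168., 215., 270., 332.])
    R_obs = np.array([0., 10., 23., 40., 62., 89.])
    beta_star, gamma_star = beta_gamma_from_data(I_obs, R_obs, N)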

The fact that we can build ${\bm{\beta}}^{*}_{\text{obs}}$ and ${\bm{\gamma}}^{*}_{\text{obs}}$ such that $\mathcal{J}({\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}})=J^{*}=0$ implies that the family of time-dependent SIR models is rich enough not only to fit the evolution of any epidemiological series, but also to deliver perfect predictions of the health data. However, this great approximation power comes at the cost of defining the parameters ${\bm{\beta}}$ and ${\bm{\gamma}}$ in $L^{\infty}([0,T])$, a space that is too large to allow for any feasible prediction strategy.

In order to pin down a smaller manifold where these parameters may vary without sacrificing much on the fitting and forecasting power, our strategy is the following:

  1. Learning phase: The fundamental hypothesis of our approach is the confidence that specialists in epidemiology have correctly understood the mechanisms of infection transmission. The second hypothesis is that these accurate models suffer from the difficulty of properly determining the parameters they involve. We thus propose to generate a large number of virtual epidemics, following these mechanistic models with parameters chosen in a vicinity of the suggested parameter values, including different initial conditions.

    (a) Generate virtual scenarios using detailed models with constant coefficients:

      • Define the notion of Detailed_Model which is most appropriate for the epidemiological study. Several models could be considered. For our numerical application, the detailed models were defined in Section 2.1.

      • Define an interval range $\mathcal{P}\subset\mathbb{R}^{p}$ where the parameters $\mu$ of Detailed_Model will vary. We call solution manifold $\mathcal{U}$ the set of virtual dynamics over $[0,T+\tau]$, namely

        \mathcal{U} \coloneqq \{{\textbf{u}}(\mu) = \texttt{Detailed\_Model}(\mu,[0,T+\tau]) \;:\; \mu\in\mathcal{P}\}.   (9)

      • Draw a finite training set

        \mathcal{P}_{\text{tr}} = \{\mu_{1},\dots,\mu_{K}\} \subseteq \mathcal{P}   (10)

        of $K\gg 1$ parameter instances and compute ${\textbf{u}}(\mu) = \texttt{Detailed\_Model}(\mu,[0,T+\tau])$ for every $\mu\in\mathcal{P}_{\text{tr}}$. Each ${\textbf{u}}(\mu)$ is a virtual epidemiological scenario. An important detail for our prediction purposes is that the simulations are carried out on $[0,T+\tau]$, that is, not only on the fitting time interval but also on the prediction time interval. We call

        \mathcal{U}_{\text{tr}} = \{{\textbf{u}}(\mu) \;:\; \mu\in\mathcal{P}_{\text{tr}}\}

        the training set of all virtual scenarios.

    (b) Collapse: Collapse every detailed scenario ${\textbf{u}}(\mu)\in\mathcal{U}_{\text{tr}}$ into a SIR model following the ideas explained in Section 2.3. For every ${\textbf{u}}(\mu)$, the procedure gives time-dependent parameters ${\bm{\beta}}(\mu)$ and ${\bm{\gamma}}(\mu)$ and associated SIR solutions $({\textbf{S}},{\textbf{I}},{\textbf{R}})(\mu)$ on $[0,T+\tau]$. This yields the sets

      \mathcal{B}_{\text{tr}} \coloneqq \{{\bm{\beta}}(\mu) \;:\; \mu\in\mathcal{P}_{\text{tr}}\} \quad\text{and}\quad \mathcal{G}_{\text{tr}} \coloneqq \{{\bm{\gamma}}(\mu) \;:\; \mu\in\mathcal{P}_{\text{tr}}\}.   (11)
    (c) Compute reduced models: We apply model reduction techniques using $\mathcal{B}_{\text{tr}}$ and $\mathcal{G}_{\text{tr}}$ as training sets in order to build two bases

      \mathrm{B}_{n} = \mathrm{span}\{b_{1},\dots,b_{n}\}, \quad \mathrm{G}_{n} = \mathrm{span}\{g_{1},\dots,g_{n}\} \subset L^{\infty}([0,T+\tau],\mathbb{R}),

      which are defined over $[0,T+\tau]$. The space $\mathrm{B}_{n}$ is built so that it approximates well all functions ${\bm{\beta}}(\mu)\in\mathcal{B}_{\text{tr}}$ (resp. all ${\bm{\gamma}}(\mu)\in\mathcal{G}_{\text{tr}}$ are well approximated by elements of $\mathrm{G}_{n}$). We outline in Section 2.4 the methods we have explored in our numerical tests.

  2. Fitting on the reduced spaces: We next solve the fitting problem (5) on the interval $[0,T]$ by searching ${\bm{\beta}}$ (resp. ${\bm{\gamma}}$) in $\mathrm{B}_{n}$ (resp. in $\mathrm{G}_{n}$) instead of in $L^{\infty}([0,T])$, that is,

    J^{*}_{(\mathrm{B}_{n},\mathrm{G}_{n})} = \min_{({\bm{\beta}},{\bm{\gamma}})\in\mathrm{B}_{n}\times\mathrm{G}_{n}} \mathcal{J}({\bm{\beta}},{\bm{\gamma}}\;|\;{\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}},[0,T]).   (12)

    Note that the respective dimensions of $\mathrm{B}_{n}$ and $\mathrm{G}_{n}$ could be different; for simplicity we take them equal in the following. Obviously, since $\mathrm{B}_{n}$ and $\mathrm{G}_{n}\subset L^{\infty}([0,T])$, we have

    J^{*} \leq J^{*}_{(\mathrm{B}_{n},\mathrm{G}_{n})},

    but we numerically observe that the function $n\mapsto J^{*}_{(\mathrm{B}_{n},\mathrm{G}_{n})}$ decreases very rapidly as $n$ increases, which indirectly confirms that the manifold generated by the two above models accommodates the COVID-19 epidemic well. (A sketch of this fitting step is given after this list.)

    The solution of problem (12) gives coefficients $(c^{*}_{i})_{i=1}^{n}$ and $(\tilde{c}^{*}_{i})_{i=1}^{n}\in\mathbb{R}^{n}$ such that the time-dependent parameters

    \beta^{*}_{n}(t) = \sum_{i=1}^{n} c^{*}_{i} b_{i}(t), \qquad \gamma^{*}_{n}(t) = \sum_{i=1}^{n} \tilde{c}^{*}_{i} g_{i}(t), \quad \forall t\in[0,T+\tau],

    achieve the minimum in (12).

  3. Forecast: For a given dimension $n$ of the reduced spaces, propagate on $[0,T+\tau]$ the associated SIR model

    ({\textbf{S}}_{n}^{*},{\textbf{I}}_{n}^{*},{\textbf{R}}_{n}^{*}) = \texttt{SIR}({\bm{\beta}}^{*}_{n},{\bm{\gamma}}^{*}_{n},[0,T+\tau]).

    The values $I_{n}^{*}(t)$ and $R_{n}^{*}(t)$ for $t\in[0,T[$ are by construction close to the observed data ${\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}}$ (up to some numerical optimization error). The values $I_{n}^{*}(t)$ and $R_{n}^{*}(t)$ for $t\in[T,T+\tau]$ are then used for prediction.

  4. Forecast combination/aggregation of experts (optional step): By varying the dimension $n$ and using different model reduction approaches, we can easily produce a collection of different forecasts, and the question of how to select the best predictive model arises. Alternatively, we can also resort to forecast combination techniques [26]: denoting $(I_{1},R_{1}),\dots,(I_{P},R_{P})$ the different forecasts, the idea is to search for an appropriate linear combination

    I^{\text{FC}}(t) = \sum_{p=1}^{P} w_{p} I_{p}(t),

    and similarly for $R$. Note that these combinations do not need to involve forecasts from our methodology only: other approaches, such as time series forecasts, could also be included. One simple forecast combination is the average, in which all alternative forecasts are given the same weight $w_{p}=1/P$, $p=1,\dots,P$. More elaborate approaches consist in estimating the weights that minimize a loss function involving the forecast error.
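As announced in step 2, the following sketch (Python with NumPy/SciPy; the Euler time stepping, the choice of optimizer and the clipping of the parameters to nonnegative values are our simplifications, not the paper's implementation) illustrates the fitting problem (12): the coefficients of ${\bm{\beta}}$ and ${\bm{\gamma}}$ in the reduced bases are optimized so that the resulting SIR trajectory matches the observed series on $[0,T]$.

    import numpy as np
    from scipy.optimize import minimize

    def fit_on_reduced_spaces(B_basis, G_basis, I_obs, R_obs, N, S0, I0, R0):
        """Sketch of the fitting step (12). B_basis, G_basis: arrays of shape (n, Q)
        holding the reduced basis functions on a daily grid over [0, T + tau];
        I_obs, R_obs: observed daily series of length P <= Q."""
        n, Q = B_basis.shape
        P = len(I_obs)

        def simulate(beta, gamma):
            # Explicit Euler integration of the SIR system with a daily time step.
            S, I, R = S0, I0, R0
            I_traj, R_traj = np.empty(Q), np.empty(Q)
            for t in range(Q):
                I_traj[t], R_traj[t] = I, R
                new_inf = beta[t] * I * S / N
                new_rem = gamma[t] * I
                S, I, R = S - new_inf, I + new_inf - new_rem, R + new_rem
            return I_traj, R_traj

        def cost(x):
            beta = np.clip(x[:n] @ B_basis, 0.0, None)    # keep the parameters nonnegative
            gamma = np.clip(x[n:] @ G_basis, 0.0, None)
            I_traj, R_traj = simulate(beta, gamma)
            return np.sum((I_traj[:P] - I_obs) ** 2 + (R_traj[:P] - R_obs) ** 2)

        res = minimize(cost, x0=np.full(2 * n, 1.0 / n), method="Nelder-Mead")
        c, c_tilde = res.x[:n], res.x[n:]
        return c @ B_basis, c_tilde @ G_basis   # beta_n*, gamma_n* on the whole grid [0, T + tau]

Propagating the returned $\beta^{*}_{n}$, $\gamma^{*}_{n}$ over $[0,T+\tau]$ then yields the forecast of step 3.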

Before going into the details of some of the steps, three remarks are in order:

  1. To bring out the essential mechanisms, we have idealized some elements in the above discussion by omitting certain unavoidable discretization aspects. To start with, the ODE solutions cannot be computed exactly but only up to some accuracy given by a numerical integration scheme. In addition, the optimal control problems (5) and (12) are non-convex, so in practice we can only find a local minimum. Note however that modern solvers find solutions which are very satisfactory for all practical purposes. In addition, solving the control problem in a reduced space as in (12) can be interpreted as introducing a regularizing effect with respect to the control problem (5) in the full $L^{\infty}([0,T])$ space. It is to be expected that the search for global minimizers is facilitated in the reduced landscape.

  2. routine-IR and routine-${\bm{\beta}}{\bm{\gamma}}$: A variant of the fitting problem (12) which we have studied in our numerical experiments consists in replacing the cost function $\mathcal{J}({\bm{\beta}},{\bm{\gamma}}\;|\;{\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}},[0,T])$ by the cost function

    \widetilde{\mathcal{J}}({\bm{\beta}},{\bm{\gamma}}\,|\,{\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}},[0,T]) \coloneqq \int_{0}^{T}\left(|{\bm{\beta}}-{\bm{\beta}}^{*}_{\text{obs}}|^{2}+|{\bm{\gamma}}-{\bm{\gamma}}^{*}_{\text{obs}}|^{2}\right)\mathrm{d}t.   (13)

    In other words, we use the variables ${\bm{\beta}}^{*}_{\text{obs}}$ and ${\bm{\gamma}}^{*}_{\text{obs}}$ from (6) as observed data instead of working with the observed health series ${\textbf{I}}_{\text{obs}}$, ${\textbf{R}}_{\text{obs}}$. In Section 4, we refer to the standard fitting method as routine-IR and to this variant as routine-${\bm{\beta}}{\bm{\gamma}}$.

  3. The fitting procedure minimizes the loss function both over the coefficients in the reduced basis and over the initial time of the epidemic but, for simplicity, this second optimization is not reported in the notation.

2.3 Details on Step 1-(b): Collapsing the detailed models into SIR dynamics

Let

{\textbf{u}}(\mu) = \texttt{Detailed\_Model}(\mu,[0,T+\tau]) \in L^{\infty}([0,T+\tau],\mathbb{R}^{d})

be the solution on $[0,T+\tau]$ of a detailed model for a given vector of parameters $\mu\in\mathbb{R}^{p}$. Here $d$ is possibly large ($d=11$ in the case of the SEI5CHRD model and $d=6$ in the case of the SE2IUR model). The goal of this section is to explain how to collapse the detailed dynamics ${\textbf{u}}(\mu)$ into a SIR dynamics with time-dependent parameters. The procedure can be understood as a projection of a detailed dynamics onto the family of dynamics given by SIR models with time-dependent parameters.

For the SEI5CHRD model, we collapse the variables by setting

{\textbf{S}}^{\text{col}} = {\textbf{S}} + {\textbf{E}}
{\textbf{I}}^{\text{col}} = {\textbf{I}}_{p} + {\textbf{I}}_{a} + {\textbf{I}}_{ps} + {\textbf{I}}_{ms} + {\textbf{I}}_{ss} + {\textbf{C}} + {\textbf{H}}
{\textbf{R}}^{\text{col}} = {\textbf{R}} + {\textbf{D}}.

Similarly, for the SE2IUR model, we set

{\textbf{S}}^{\text{col}} = {\textbf{S}} + {\textbf{E}}_{1}
{\textbf{I}}^{\text{col}} = {\textbf{E}}_{2} + {\textbf{I}} + {\textbf{U}}
{\textbf{R}}^{\text{col}} = {\textbf{R}}.

Note that ${\textbf{S}}^{\text{col}}$, ${\textbf{I}}^{\text{col}}$ and ${\textbf{R}}^{\text{col}}$ depend on $\mu$, but we omit this dependence to lighten the notation.

Once the collapsed variables are obtained, we identify the time-dependent parameters ${\bm{\beta}}$ and ${\bm{\gamma}}$ of the SIR model by solving the fitting problem

({\bm{\beta}},{\bm{\gamma}}) \in \operatorname*{arg\,inf}_{({\bm{\beta}},{\bm{\gamma}})\in L^{\infty}([0,T+\tau],\mathbb{R})\times L^{\infty}([0,T+\tau],\mathbb{R})} \mathcal{J}({\bm{\beta}},{\bm{\gamma}}\,|\,{\textbf{I}}^{\text{col}},{\textbf{R}}^{\text{col}},[0,T+\tau])   (14)

where

({\textbf{S}},{\textbf{I}},{\textbf{R}}) = \texttt{SIR}({\bm{\beta}},{\bm{\gamma}},[0,T+\tau]).

Note that problem (14) has the same structure as problem (5), the difference being that the collapsed variables ${\textbf{I}}^{\text{col}}$, ${\textbf{R}}^{\text{col}}$ in (14) play the role of the health data ${\textbf{I}}_{\text{obs}}$, ${\textbf{R}}_{\text{obs}}$ in (5). Therefore, it follows from Proposition 2.1 that problem (14) has a very simple solution: it suffices to apply formula (6). Note that the exact derivatives of ${\textbf{S}}^{\text{col}}$, ${\textbf{I}}^{\text{col}}$ and ${\textbf{R}}^{\text{col}}$ can be obtained from the Detailed_Model.

Since the solution $({\bm{\beta}},{\bm{\gamma}})$ to (14) depends on the parameter $\mu$ of the detailed model, repeating the above procedure for every detailed scenario ${\textbf{u}}(\mu)$, $\mu\in\mathcal{P}_{\text{tr}}$, yields the two families of time-dependent functions $\mathcal{B}_{\text{tr}}=\{{\bm{\beta}}(\mu)\;:\;\mu\in\mathcal{P}_{\text{tr}}\}$ and $\mathcal{G}_{\text{tr}}=\{{\bm{\gamma}}(\mu)\;:\;\mu\in\mathcal{P}_{\text{tr}}\}$, defined on the interval $[0,T+\tau]$, which were introduced in (11).
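As an illustration, the following sketch (Python/NumPy; the array layout is ours, and we use finite differences for the derivatives although, as noted above, exact derivatives are available from the detailed model) collapses a simulated SEI5CHRD trajectory and recovers the associated time-dependent coefficients through the discrete analogue of formula (6).

    import numpy as np

    def collapse_sei5chrd(u, N):
        """u: array of shape (11, Q) whose rows are the SEI5CHRD trajectories ordered as
        (S, E, Ip, Ia, Ips, Ims, Iss, C, H, R, D) on a daily grid over [0, T + tau].
        Returns the collapsed SIR variables and the associated beta(t), gamma(t)."""
        S_col = u[0] + u[1]                 # S + E
        I_col = u[2:9].sum(axis=0)          # Ip + Ia + Ips + Ims + Iss + C + H
        R_col = u[9] + u[10]                # R + D

        # Discrete analogue of formula (6), forward differences with a daily step.
        beta = -N * np.diff(S_col) / (I_col[:-1] * S_col[:-1])
        gamma = np.diff(R_col) / I_col[:-1]
        return S_col, I_col, R_col, beta, gamma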

2.4 Details on Model Order Reduction

Model Order Reduction is a family of methods aiming at approximating a set of solutions of parametrized PDEs or ODEs (or related quantities) with linear spaces, which are called reduced models or reduced spaces. In our case, the sets to approximate are

\mathcal{B} = \{{\bm{\beta}}(\mu)\,:\,\mu\in\mathcal{P}\} \quad\text{and}\quad \mathcal{G} = \{{\bm{\gamma}}(\mu)\,:\,\mu\in\mathcal{P}\},

where each $\mu$ is the vector of parameters of the detailed model, which takes values in $\mathcal{P}$, and ${\bm{\beta}}(\mu)$ and ${\bm{\gamma}}(\mu)$ are the associated time-dependent coefficients of the collapsed SIR evolution. In the following, we view $\mathcal{B}$ and $\mathcal{G}$ as subsets of $L^{2}([0,T])$, and we denote by $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ the corresponding norm and inner product. Indeed, in view of Proposition 2.1, $\mathcal{B}$ and $\mathcal{G}\subset L^{\infty}([0,T])\subset L^{2}([0,T])$.

Let us carry out the discussion for $\mathcal{B}$ (the same will hold for $\mathcal{G}$). If we measure performance in terms of the worst error in the set $\mathcal{B}$, the best possible performance that reduced models of dimension $n$ can achieve is given by the Kolmogorov $n$-width

d_{n}(\mathcal{B})_{L^{2}([0,T])} \coloneqq \inf_{\substack{Y\subset L^{2}([0,T]) \\ \dim(Y)\leq n}} \; \max_{u\in\mathcal{B}} \|u - P_{Y}u\|,

where $P_{Y}$ is the orthogonal projection onto the subspace $Y$. When errors are measured in an average sense, the benchmark is given by

\delta^{2}_{n}(\mathcal{B},\nu)_{L^{2}([0,T])} \coloneqq \inf_{\substack{Y\subset L^{2}([0,T]) \\ \dim(Y)=n}} \int_{\mathcal{P}} \|u(y) - P_{Y}u(y)\|^{2}\,\mathrm{d}\nu(y),

where $\nu$ is some given measure on $\mathcal{P}$.

In practice, building spaces that meet these benchmarks is generally not possible. However, it is possible to build sequences of spaces whose error decay is comparable to that of $\left(d_{n}(\mathcal{B})_{L^{2}([0,T])}\right)_{n}$ or $\left(\delta_{n}(\mathcal{B},\nu)_{L^{2}([0,T])}\right)_{n}$. As a result, when the Kolmogorov width decays fast, the constructed reduced spaces deliver a very good approximation of the set $\mathcal{B}$ with few modes (see [5, 9, 8, 20]).

We next present the reduced models that we have used in our numerical experiments. Other methods could of course be considered and we refer to [27, 4, 15, 21] for general references on model reduction. We carry the discussion in a fully discrete setting in order to simplify the presentation and keep it as close to the practical implementation as possible. All the claims below could be written in a fully continuous sense at the expense of introducing additional mathematical objects such as certain Hilbert-Schmidt operators to define the continuous version of the Singular Value Decomposition (SVD).

We build the reduced models using the two discrete training sets of functions $\mathcal{B}_{\text{tr}}=\{{\bm{\beta}}_{i}\}_{i=1}^{K}$ and $\mathcal{G}_{\text{tr}}=\{{\bm{\gamma}}_{i}\}_{i=1}^{K}$ from $\mathcal{B}$ and $\mathcal{G}$, where $K$ denotes the number of virtual scenarios considered. These sets have been generated in step 1-(b) of our general pipeline (see Section 2.2).

We consider a discretization of the time interval $[0,T+\tau]$ into a set of $Q\in\mathbb{N}^{*}$ points $\{t_{1}=0,\dots,t_{P}=T,\dots,t_{Q}=T+\tau\}$ with $P<Q$. Thus, we can represent each function ${\bm{\beta}}_{i}$ as a vector of $Q$ values

{\bm{\beta}}_{i} = (\beta_{i}(t_{1}),\dots,\beta_{i}(t_{Q}))^{T} \in \mathbb{R}_{+}^{Q},

and hence assemble all the functions of the family $\{{\bm{\beta}}_{i}\}_{i=1}^{K}$ into a matrix $M_{\mathcal{B}}\in\mathbb{R}_{+}^{Q\times K}$. The same applies to the family $\{{\bm{\gamma}}_{i}\}_{i=1}^{K}$, which gives a matrix $M_{\mathcal{G}}\in\mathbb{R}_{+}^{Q\times K}$.

  1. SVD: The eigenvalue decomposition of the correlation matrix $M_{\mathcal{B}}^{T}M_{\mathcal{B}}\in\mathbb{R}^{K\times K}$ writes

    M_{\mathcal{B}}^{T}M_{\mathcal{B}} = V\Lambda V^{T},

    where $V=(v_{i,j})\in\mathbb{R}^{K\times K}$ is an orthogonal matrix and $\Lambda\in\mathbb{R}^{K\times K}$ is a diagonal matrix with non-negative entries, which we denote by $\lambda_{i}$ and order decreasingly. The $\ell^{2}(\mathbb{R}^{Q})$-orthogonal basis functions $\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{K}\}$ are then given by the linear combinations

    {\textbf{b}}_{i} = \sum_{j=1}^{K} v_{j,i}\,{\bm{\beta}}_{j}, \quad 1\leq i\leq K.

    For $n\leq K$, the space

    \mathrm{B}_{n} = \mathrm{span}\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{n}\}

    is the best $n$-dimensional space for approximating the set $\{{\bm{\beta}}_{i}\}_{i=1}^{K}$ in the average sense. We have

    \delta_{n}(\{{\bm{\beta}}_{i}\}_{i=1}^{K})_{\ell^{2}(\mathbb{R}^{Q})} = \left(\frac{1}{K}\sum_{i=1}^{K}\|{\bm{\beta}}_{i}-P_{\mathrm{B}_{n}}{\bm{\beta}}_{i}\|_{\ell^{2}(\mathbb{R}^{Q})}^{2}\right)^{1/2} = \left(\sum_{i>n}^{K}\lambda_{i}\right)^{1/2},

    so the average approximation error is given by the tail of the eigenvalues.

    The SVD method is therefore particularly efficient if the eigenvalues decay fast, meaning that the set $\mathcal{B}_{\text{tr}}=\{{\bm{\beta}}_{i}\}_{i=1}^{K}$ can be approximated with only a few modes. However, note that by construction this method does not ensure positivity, in the sense that $P_{\mathrm{B}_{n}}\beta_{i}(t)$ may become negative for some $t\in[0,T]$ even though the original function satisfies $\beta_{i}(t)\geq 0$ for all $t\in[0,T]$. This is due to the fact that the vectors ${\textbf{b}}_{i}$ are not necessarily nonnegative. As we will see later, ensuring positivity, especially for extrapolation (i.e., forecasting), is particularly important in our study, and this motivates the next methods. (A sketch combining the SVD and NMF constructions is given after Algorithm 1.)

  2. Nonnegative Matrix Factorization (NMF, see [25, 14]): NMF is a variant of the SVD involving nonnegative modes and expansion coefficients. In this approach, we build a family of non-negative functions $\{{\textbf{b}}_{i}\}_{i=1}^{n}$ and we approximate each ${\bm{\beta}}_{i}$ with a linear combination

    {\bm{\beta}}^{\texttt{NMF}}_{i} = \sum_{j=1}^{n} a_{i,j}\,{\textbf{b}}_{j}, \quad 1\leq i\leq K,   (15)

    where for every $1\leq i\leq K$ and $1\leq j\leq n$, the coefficients satisfy $a_{i,j}\geq 0$ and the basis functions satisfy ${\textbf{b}}_{j}\geq 0$. In other words, we solve the constrained optimization problem

    (W^{*},B^{*}) \in \operatorname*{arg\,min}_{(W,B)\in\mathbb{R}_{+}^{K\times n}\times\mathbb{R}_{+}^{n\times Q}} \|M_{\mathcal{B}}^{T} - WB\|_{F}^{2}.

    We refer to [14] for further details on the NMF and its numerical aspects.

  3. Enlarged Nonnegative Greedy (ENG): greedy algorithm with projection on an extended cone of nonnegative functions. We now present our novel model reduction method, which is of interest in itself since it allows one to build reduced models that preserve positivity and even other types of bounds. The method stems from the observation that NMF approximates functions in the cone of nonnegative combinations of $\{{\textbf{b}}_{i}\geq 0\}_{i=1}^{n}$, since it imposes $a_{i,j}\geq 0$ in the linear combination (15). However, the positivity of a linear combination is not equivalent to the positivity of its coefficients: there obviously exist combinations involving small negative coefficients $a_{i,j}<0$ for some $j$ which still deliver a nonnegative function $\sum_{j=1}^{n}a_{i,j}{\textbf{b}}_{j}$. We can thus widen the cone of linear combinations yielding nonnegative values by carefully allowing such negative coefficients. One possible strategy is proposed in Algorithm 1, which describes a routine that we call Enlarge_Cone. The routine

    \{{\bm{\psi}}_{1},\dots,{\bm{\psi}}_{n}\} = \texttt{Enlarge\_Cone}[\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{n}\},\varepsilon]

    takes a set of nonnegative functions $\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{n}\}$ as input, and modifies each function ${\textbf{b}}_{i}$ by iteratively subtracting nonnegative multiples of the other basis functions ${\textbf{b}}_{j}$, $j\neq i$ (see the inner while loop of Algorithm 1). The coefficients are chosen in an optimal way so as to maintain the positivity of the final linear combination while minimizing the $L^{\infty}$-norm. The algorithm returns a set of functions

    {\bm{\psi}}_{i} = {\textbf{b}}_{i} - \sum_{j\neq i}\sigma^{i}_{j}{\textbf{b}}_{j}, \quad \forall i\in\{1,\dots,n\},

    with $\sigma^{i}_{j}\geq 0$. Note that the algorithm requires a tolerance parameter $\varepsilon>0$ for the computation of the $\sigma^{i}_{j}$.

    Once we have run Enlarge_Cone, the approximation of any function ${\bm{\beta}}$ is sought as

    {\bm{\beta}}^{(EC)} = \operatorname*{arg\,min}_{c_{1},\dots,c_{n}\geq 0} \|{\bm{\beta}} - \sum_{j=1}^{n} c_{j}{\bm{\psi}}_{j}\|^{2}_{L^{2}([0,T+\tau])}.   (16)

    We emphasize that the routine is valid for any set of nonnegative input functions. We can thus apply Enlarge_Cone to the functions $\{{\textbf{b}}_{i}\geq 0\}_{i=1}^{n}$ from NMF, but also to the functions selected by a greedy algorithm such as the following:

    • For $n=1$, find

      {\textbf{b}}_{1} = \operatorname*{arg\,max}_{{\bm{\beta}}\in\{{\bm{\beta}}_{i}\}_{i=1}^{K}} \|{\bm{\beta}}\|^{2}_{L^{2}([0,T+\tau])}.

    • At step $n>1$, we have selected the set of functions $\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{n-1}\}$. We next find

      {\textbf{b}}_{n} = \operatorname*{arg\,max}_{{\bm{\beta}}\in\{{\bm{\beta}}_{i}\}_{i=1}^{K}} \min_{c_{1},\dots,c_{n-1}\geq 0} \|{\bm{\beta}} - \sum_{j=1}^{n-1} c_{j}{\textbf{b}}_{j}\|^{2}_{L^{2}([0,T+\tau])}.

    In our numerical tests, we call Enlarged Nonnegative Greedy (ENG) the routine combining the above greedy algorithm with our routine Enlarge_Cone.

    Input: a set of nonnegative functions $\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{n}\}$ and a tolerance $\varepsilon>0$.
    for $i\in\{1,\dots,n\}$ do
       Set $\sigma^{i}_{j}=0$ for all $j\neq i$.
       for $\ell\in\{1,\dots,n\}$ do
          $\alpha_{\ell}^{i,*}=\operatorname*{arg\,max}\{\alpha\geq 0\;:\;\bigl[{\textbf{b}}_{i}-\sum_{j\neq i}\sigma^{i}_{j}{\textbf{b}}_{j}-\alpha{\textbf{b}}_{\ell}\bigr](t)>0,\ \forall t\in[0,T+\tau]\}$
          $\sigma^{i}_{\ell}=\sigma^{i}_{\ell}+\alpha_{\ell}^{i,*}/2$
          while $\alpha_{\ell}^{i,*}\geq\varepsilon$ do
             $\alpha_{\ell}^{i,*}=\operatorname*{arg\,max}\{\alpha\geq 0\;:\;\bigl[{\textbf{b}}_{i}-\sum_{j\neq i}\sigma^{i}_{j}{\textbf{b}}_{j}-\alpha{\textbf{b}}_{\ell}\bigr](t)>0,\ \forall t\in[0,T+\tau]\}$
             $\sigma^{i}_{\ell}=\sigma^{i}_{\ell}+\alpha_{\ell}^{i,*}/2$
          end while
       end for
       ${\bm{\psi}}_{i}={\textbf{b}}_{i}-\sum_{j\neq i}\sigma^{i}_{j}{\textbf{b}}_{j}$
    end for
    Output: $\{{\bm{\psi}}_{1},\dots,{\bm{\psi}}_{n}\}$
    Algorithm 1: $\texttt{Enlarge\_Cone}[\{{\textbf{b}}_{1},\dots,{\textbf{b}}_{n}\},\varepsilon]\to\{{\bm{\psi}}_{1},\dots,{\bm{\psi}}_{n}\}$

    We remark that if we work with nonnegative functions that are bounded above by a constant $L>0$, we can also ensure that the approximation, denoted $\Psi$ and written as a linear combination of basis functions, stays between the bounds $0$ and $L$. To do so we define, on the one hand and as we have just done, a cone of nonnegative functions generated by the family $\{{\bm{\psi}}_{i}\}_{i}$ and, on the other hand, a second basis built from the functions $L-\varphi$, where $\varphi$ runs over the greedy elements of the reduced basis, to which we also apply the cone enlargement. We then impose that the approximation $\Psi$ is written as a nonnegative combination in the first basis and that $L-\Psi$ is written as a nonnegative combination in the second basis.

    In this setting, the approximation takes the form of a least squares problem with $2n$ linear constraints on the $n$ coefficients, expressing the fact that the coefficients in the two transformed bases above are nonnegative.

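To fix ideas, the following sketch (Python, assuming NumPy and scikit-learn are available; names are ours and the NMF solver options are illustrative) extracts the SVD and NMF reduced bases of items 1 and 2 above from the training matrix $M_{\mathcal{B}}$.

    import numpy as np
    from sklearn.decomposition import NMF

    def svd_basis(M_B, n):
        """M_B: array of shape (Q, K), one column per training function beta_i.
        Returns the first n SVD modes (columns of U) and the singular values."""
        U, svals, Vt = np.linalg.svd(M_B, full_matrices=False)
        return U[:, :n], svals

    def nmf_basis(M_B, n):
        """Nonnegative factorization M_B^T ~ W B with W >= 0 and B >= 0.
        Rows of B are the n nonnegative basis functions; rows of W are the coefficients a_{i,j}."""
        model = NMF(n_components=n, init="nndsvda", max_iter=1000)
        W = model.fit_transform(M_B.T)   # shape (K, n)
        B = model.components_            # shape (n, Q)
        return B, W

The decay of the singular values indicates how many modes are needed, while the nonnegative rows of $B$ can be fed to the routine Enlarge_Cone of item 3.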
In addition to the previous basis functions, it is possible to include more generic basis functions, such as polynomial, radial, or wavelet functions, in order to guarantee simple forecasting trends. For instance, one can add affine functions in order to add the possibility of predicting by simple linear extrapolation to the range of forecasts offered by the reduced model. Given the overall exponential behavior of the health data, we have added an exponential function of the form $b_{0}(t)=\xi\exp(-\xi^{\prime}t)$ (respectively $g_{0}(t)=\psi\exp(-\psi^{\prime}t)$) to the reduced basis functions $\{b_{1},\dots,b_{n}\}$ (respectively $\{g_{1},\dots,g_{n}\}$), where the nonnegative real parameters $\xi,\xi^{\prime},\psi,\psi^{\prime}$ are obtained through a standard exponential regression from ${\bm{\beta}}_{\text{obs}}^{*}$ (respectively ${\bm{\gamma}}_{\text{obs}}^{*}$) associated to the targeted profiles of infectious people, that is, the profiles defined on the time interval $[0,T]$ that one wants to extrapolate to $]T,T+\tau]$. In other words, the final reduced models that we use for forecasting are

\mathrm{B}_{n+1} = \mathrm{span}\{b_{0},b_{1},\dots,b_{n}\}, \quad \mathrm{G}_{n+1} = \mathrm{span}\{g_{0},g_{1},\dots,g_{n}\} \subset L^{\infty}([0,T+\tau],\mathbb{R}).

Indeed, including the exponential functions in the reduced models gives cheap access to the overall trend of the parameters $\beta$ and $\gamma$; the remaining basis functions generated from the training sets capture higher-order corrections and thus improve the extrapolation.

Remark 2.2.

Reduced models on $\mathcal{I}=\{{\textbf{I}}(\mu)\,:\,\mu\in\mathcal{P}\}$ and $\mathcal{R}=\{{\textbf{R}}(\mu)\,:\,\mu\in\mathcal{P}\}$. Instead of applying model reduction to the sets $\mathcal{B}$ and $\mathcal{G}$ as we do in our approach, we could apply the same techniques directly to the sets of solutions $\mathcal{I}$ and $\mathcal{R}$ of the SIR models with time-dependent coefficients in $\mathcal{B}$ and $\mathcal{G}$. In this case, however, the resulting approximation would not follow a SIR dynamics.

3 Methodology for multiple regions including population mobility data

The forecasting method of Section 2.2 for one single region can be extended to the treatment of multiple regions involving population mobility. The prediction scheme is now based on a multiregional SIR model with time-dependent coefficients. Compared to other more detailed models, its main advantage is that it drastically reduces the number of parameters to be estimated. Indeed, detailed multiregional models, such as multiregional extensions of the SEI5CHRD and SE2IUR models from Section 2.1, require a number of parameters which quickly grows with the number $P$ of regions involved. Their calibration thus requires large amounts of data which, in addition, may be unknown, very uncertain, or simply unavailable. In a forthcoming paper, we will apply the fully multi-regional setting to the post-lockdown period.

The structure of this section is the same as that of the previous one for the case of a single region. We start by introducing in Section 3.1 the multi-regional SIR model with time-dependent coefficients and the associated detailed models. As for any multiregional model, mobility data are required as input, and the nature and level of detail of the available data motivate certain choices in the modelling of the multiregional SIR model (as well as of the other detailed models). We next present in Section 3.2 the general pipeline, in which we emphasize the high modularity of the approach.

3.1 Multi-regional compartmental models

In the spirit of fluid flow modeling, there are essentially two ways of describing mobility between regions:

  • In an Eulerian description, we take the regions as fixed references for which we record incoming and outgoing travels.

  • In a Lagrangian description, we follow the motion of people domiciled in a certain region and record their travels across the territory. We can expect this modeling to be more informative on the geographical spread of the disease, but it comes at the cost of requiring additional details on people's region of domicile.

Note that both descriptions hold at any coarse or fine geographical level in the sense that what we call the regions could be taken to be full countries, departments within a country, or very small geographical partitions of a territory. We next describe the multi-regional SIR models with the Eulerian and Lagrangian description of population fluxes which will be the output of our methodology.

3.1.1 Multi-regional SIR models with time-dependent parameters

Eulerian description of population flux: Assume that we have $P$ regions and that the number of people in region $i$ is $N_{i}$ for $i=1,\dots,P$. Due to mobility, the population in each region varies, so $N_{i}$ depends on $t$. However, the total population is assumed to be constant and equal to $N$, that is,

N = \sum_{i=1}^{P} N_{i}(t), \quad \forall t\geq 0.

For any $t\geq 0$, let $\lambda_{i\to j}(t)\in[0,1]$ be the probability that people from region $i$ travel to region $j$ at time $t$. In other words, $\lambda_{i\to j}(t)N_{i}(t)\delta t$ is the number of people from region $i$ that have travelled to region $j$ between time $t$ and $t+\delta t$. Note that we have

\sum_{j=1}^{P} \lambda_{i\to j}(t) = 1, \quad \forall t\geq 0.

Since, for any $\delta t\geq 0$,

N_{i}(t+\delta t) = N_{i}(t) - \sum_{j\neq i}\lambda_{i\to j}(t)N_{i}(t)\delta t + \sum_{j\neq i}\lambda_{j\to i}(t)N_{j}(t)\delta t,

dividing by $\delta t$ and taking the limit $\delta t\to 0$ yields

\frac{\mathrm{d}N_{i}}{\mathrm{d}t}(t) = -\sum_{j\neq i}\lambda_{i\to j}(t)N_{i}(t) + \sum_{j\neq i}\lambda_{j\to i}(t)N_{j}(t).

Note that we have

i=1PdNidt(t)=0,t0.\sum_{i=1}^{P}\frac{\mathrm{d}N_{i}}{\mathrm{d}t}(t)=0,\quad\forall t\geq 0.

Thus \sum_{i}N_{i}(t)=\sum_{i}N_{i}(0)=N, which is consistent with our assumption that the total population is constant.

The time evolution of the N_{i} is thus known once we are given the \lambda_{i\to j}(t) from Eulerian mobility data. In addition to these mobility data, we also observe the evolution of infected and removed people, and our goal is to fit a multi-regional SIR model that is in accordance with these observations. We propose the following model.

Denoting by S_{i}, I_{i} and R_{i} the number of susceptible, infectious and removed people in region i at time t, we first have the relation

N_{i}(t)=S_{i}(t)+I_{i}(t)+R_{i}(t)\quad\Leftrightarrow\quad 1=\frac{S_{i}(t)}{N_{i}(t)}+\frac{I_{i}(t)}{N_{i}(t)}+\frac{R_{i}(t)}{N_{i}(t)}.

Note that from the second relation, it follows that

0=\frac{\mathrm{d}}{\mathrm{d}t}\frac{S_{i}}{N_{i}}+\frac{\mathrm{d}}{\mathrm{d}t}\frac{I_{i}}{N_{i}}+\frac{\mathrm{d}}{\mathrm{d}t}\frac{R_{i}}{N_{i}}. (17)

To model the evolution between compartments, one possibility is the following SIR model

\frac{\mathrm{d}}{\mathrm{d}t}\frac{S_{i}}{N_{i}} = -\left(\beta_{i}\lambda_{i\to i}\frac{I_{i}}{N_{i}}+\sum_{j\neq i}\beta_{j}\lambda_{j\to i}\frac{I_{j}}{N_{j}}\right)\frac{S_{i}}{N_{i}} (18)
\frac{\mathrm{d}}{\mathrm{d}t}\frac{I_{i}}{N_{i}} = -\frac{\mathrm{d}}{\mathrm{d}t}\frac{S_{i}}{N_{i}}-\gamma_{i}\frac{I_{i}}{N_{i}} (19)
\frac{\mathrm{d}}{\mathrm{d}t}\frac{R_{i}}{N_{i}} = \gamma_{i}\frac{I_{i}}{N_{i}}. (20)

The parameters \beta_{i}, \gamma_{i}, N_{i} depend on t but we have omitted the dependence to ease the reading. Introducing the compartmental densities

s_{i}=\frac{S_{i}}{N_{i}},\quad i_{i}=\frac{I_{i}}{N_{i}},\quad r_{i}=\frac{R_{i}}{N_{i}}, (21)

the system equivalently reads

\frac{\mathrm{d}}{\mathrm{d}t}s_{i} = -\left(\beta_{i}\lambda_{i\to i}i_{i}+\sum_{j\neq i}\beta_{j}\lambda_{j\to i}i_{j}\right)s_{i} (22)
\frac{\mathrm{d}}{\mathrm{d}t}i_{i} = -\frac{\mathrm{d}}{\mathrm{d}t}s_{i}-\gamma_{i}i_{i} (23)
\frac{\mathrm{d}}{\mathrm{d}t}r_{i} = \gamma_{i}i_{i}. (24)

Before going further, some comments are in order:

  • The model is consistent in the sense that it satisfies (17), and when P=1 we recover the traditional SIR model.

  • Under lockdown measures, \lambda_{i\to j}\approx\delta_{i,j} and the population N_{i}(t) remains practically constant. As a result, the evolution of each region is decoupled from the others and each region can be addressed with the mono-regional approach.

  • The use of \beta_{j} in equation (22) is debatable. When people from region j arrive in region i, it may be reasonable to assume that the contact rate is \beta_{i}.

  • The use of \lambda_{j\to i} in equation (22) is also very debatable. The probability \lambda_{j\to i} was originally defined to account for the mobility of people from region j to region i without specifying the compartment. However, in equation (22), we need the probability of mobility of infectious people from region j to region i, which we denote by \mu_{j\to i} in the following. It seems reasonable to think that \mu_{j\to i} may be smaller than \lambda_{j\to i} because people who become symptomatic and suspect their illness will probably not travel. Two possible options would be:

    • We could try to guess \mu_{j\to i}. If the symptoms arise, say, 2 days after infection and if recovery takes 15 days on average, then we could set \mu_{j\to i}=2/15\,\lambda_{j\to i}.

    • Since the above is rather empirical, another option is to use \lambda_{j\to i} and absorb the uncertainty in the fitted values of the \beta_{j}.

  • We choose not to add mobility in the R compartment as this does not modify the dynamics of the epidemic spread; only adjustments of the population sizes are needed.
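To make the structure of the Eulerian model (22)-(24) concrete, the following minimal sketch (in Python, not part of the original work) integrates the system with a generic ODE solver; the two-region mobility matrix, the constant toy values of \beta_{i}, \gamma_{i} and the initial densities are purely illustrative assumptions.

import numpy as np
from scipy.integrate import solve_ivp

def eulerian_sir_rhs(t, y, beta, gamma, mobility, P):
    # State y = [s_1..s_P, i_1..i_P, r_1..r_P] (compartmental densities).
    s, i = y[:P], y[P:2 * P]
    lam = mobility(t)                       # lam[a, b] = lambda_{a -> b}(t)
    b, g = beta(t), gamma(t)
    # Force of infection in region i: beta_i*lam_{i->i}*i_i + sum_{j!=i} beta_j*lam_{j->i}*i_j
    force = lam.T @ (b * i)
    ds = -force * s                         # equation (22)
    di = -ds - g * i                        # equation (23)
    dr = g * i                              # equation (24)
    return np.concatenate([ds, di, dr])

# Toy example with P = 2 regions, constant coefficients and 5% cross-mobility.
P = 2
Lambda = np.array([[0.95, 0.05],
                   [0.05, 0.95]])
beta = lambda t: np.array([0.3, 0.25])
gamma = lambda t: np.array([0.1, 0.1])
y0 = np.array([0.99, 0.999, 0.01, 0.001, 0.0, 0.0])
sol = solve_ivp(eulerian_sir_rhs, (0.0, 120.0), y0,
                args=(beta, gamma, lambda t: Lambda, P), max_step=1.0)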

Lagrangian description of population flux: We call the above description Eulerian because the regions serve as a fixed frame of reference. Another point of view is to follow the trajectories of the inhabitants of each region, in the same spirit as when we follow the trajectories of fluid particles.

Let now S_{i}, I_{i}, R_{i} be the number of susceptible, infectious and removed people who are domiciled in region i, for i=1,\dots,P. It is reasonable to assume that S_{i}(t)+I_{i}(t)+R_{i}(t) is constant in time. However, the inhabitants of region i may not all be in that region at time t. Let \lambda^{(i)}_{j\to k}(t) be the probability that susceptible people domiciled in i travel from region j to region k at time t. With this notation, \lambda^{(i)}_{i\to i}(t) is the probability that susceptible people domiciled in i remain in region i at time t. Similarly, let \mu^{(i)}_{j\to k}(t) be the probability that infectious people domiciled in i travel from region j to region k at time t. Hence the total numbers of susceptible and infectious people that are in region i at time t are

\mathcal{S}_{i}(t) = \sum_{k=1}^{P}\sum_{j=1}^{P}\left(\lambda^{(k)}_{j\to i}(t)-\lambda^{(k)}_{i\to j}(t)\right)S_{k}(t) (25)
\mathcal{I}_{i}(t) = \sum_{k=1}^{P}\sum_{j=1}^{P}\left(\mu^{(k)}_{j\to i}(t)-\mu^{(k)}_{i\to j}(t)\right)I_{k}(t) (26)

We can thus write the evolution of S_{i}, I_{i}, R_{i} as

\frac{\mathrm{d}S_{i}}{\mathrm{d}t} = -\sum_{j=1}^{P}\sum_{k=1}^{P}\beta_{k}(t)\lambda^{(i)}_{j\to k}(t)S_{i}(t)\mathcal{I}_{k}(t) (27)
\frac{\mathrm{d}I_{i}}{\mathrm{d}t} = -\frac{\mathrm{d}S_{i}}{\mathrm{d}t}-\gamma_{i}(t)I_{i}(t) (28)
\frac{\mathrm{d}R_{i}}{\mathrm{d}t} = \gamma_{i}(t)I_{i}(t) (29)

Note that S_{i}(t)+I_{i}(t)+R_{i}(t) is constant, which is consistent with the fact that in our model

\frac{\mathrm{d}}{\mathrm{d}t}(S_{i}+I_{i}+R_{i})=0.

We emphasize that, to implement this model, one needs the Lagrangian mobility data \lambda^{(i)}_{j\to k} for all (i,j,k)\in\{1,\dots,P\}^{3}.
Notation: In the following, we gather the compartmental variables in vectors

\vec{{\textbf{S}}}\coloneqq(S_{i})_{i=1}^{P},\quad\vec{{\textbf{I}}}\coloneqq(I_{i})_{i=1}^{P},\quad\vec{{\textbf{R}}}\coloneqq(R_{i})_{i=1}^{P}

as well as the time-dependent coefficients

\vec{{\bm{\beta}}}=({\bm{\beta}}_{i})_{i=1}^{P},\quad\vec{{\bm{\gamma}}}=({\bm{\gamma}}_{i})_{i=1}^{P}.

For any \vec{{\bm{\beta}}} and \vec{{\bm{\gamma}}}\in\left(L^{\infty}([0,T])\right)^{P}, we denote by

(\vec{{\textbf{S}}},\vec{{\textbf{I}}},\vec{{\textbf{R}}})=\texttt{Multiregional\_SIR}(\vec{{\textbf{S}}}(0),\vec{{\textbf{I}}}(0),\vec{{\textbf{R}}}(0),\vec{{\bm{\beta}}},\vec{{\bm{\gamma}}},[0,T])

the output of any of the above multiregional SIR models. For simplicity in what follows, we will omit the initial condition in the notation.

3.1.2 Detailed multi-regional models with constant coefficients

In the spirit of the multi-regional SIR, one can formulate multi-regional versions of the more detailed models introduced in Section 2.1. We omit the details for the sake of brevity.

3.2 Forecasting for multiple regions with population mobility

Similarly to the mono-regional case, we assume that we are given health data on [0,T] in all regions. The observed data in region i are the series of infected people, denoted I_{i}^{\text{obs}}, and of removed people, denoted R_{i}^{\text{obs}}. They are usually given at a national or regional scale and on a daily basis.

We propose to fit the data and provide forecasts with SIR models with time-dependent parameters {\bm{\beta}}_{i} and {\bm{\gamma}}_{i} for each region i. As in the mono-regional case, we can prove that such a simple family possesses optimal fitting properties for our purposes. In the present case, the cost function reads

\mathcal{J}(\vec{{\bm{\beta}}},\vec{{\bm{\gamma}}}\,|\,\vec{{\textbf{I}}}_{\text{obs}},\vec{{\textbf{R}}}_{\text{obs}},[0,T])\coloneqq\sum_{i=1}^{P}\int_{0}^{T}\left(|I_{i}(t)-I_{i}^{\text{obs}}(t)|^{2}+|R_{i}(t)-R_{i}^{\text{obs}}(t)|^{2}\right)\mathrm{d}t
\quad\text{such that }\left(\vec{{\textbf{S}}},\vec{{\textbf{I}}},\vec{{\textbf{R}}}\right)=\texttt{Multiregional\_SIR}\left(\vec{{\bm{\beta}}},\vec{{\bm{\gamma}}},[0,T]\right),

and the fitting problem is the optimal control problem of finding

J^{*}=\inf_{\vec{{\bm{\beta}}},\vec{{\bm{\gamma}}}\in\left(L^{\infty}([0,T])\right)^{P}\times\left(L^{\infty}([0,T])\right)^{P}}\mathcal{J}(\vec{{\bm{\beta}}},\vec{{\bm{\gamma}}}\,|\,\vec{{\textbf{I}}}_{\text{obs}},\vec{{\textbf{R}}}_{\text{obs}},[0,T]). (30)

The following proposition ensures the existence of a unique minimizer under certain conditions. To prove it, it will be useful to remark that any of the above multi-regional SIR models (see (22), (27)) can be written in the general form

\frac{\mathrm{d}\vec{{\textbf{S}}}}{\mathrm{d}t} = \mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t))\,\vec{{\bm{\beta}}}
\frac{\mathrm{d}\vec{{\textbf{I}}}}{\mathrm{d}t} = -\frac{\mathrm{d}\vec{{\textbf{S}}}}{\mathrm{d}t}-\operatorname{diag}({\textbf{I}}(t))\,\vec{{\bm{\gamma}}}
\frac{\mathrm{d}\vec{{\textbf{R}}}}{\mathrm{d}t} = \operatorname{diag}({\textbf{I}}(t))\,\vec{{\bm{\gamma}}},

where, by a slight abuse of notation, the vectors \vec{{\textbf{S}}}, \vec{{\textbf{I}}} and \vec{{\textbf{R}}} are population densities in the case of the Eulerian approach (see equation (22)), and classical population numbers in the case of the Lagrangian approach (see equation (27)). \operatorname{diag}({\textbf{I}}(t)) is the P\times P diagonal matrix with diagonal entries given by the vector {\textbf{I}}(t). \mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t)) is a matrix of size P\times P which depends on the vectors of susceptible and infectious people \vec{{\textbf{S}}}(t), \vec{{\textbf{I}}}(t) and on the mobility matrix \Lambda. In the case of the Eulerian description, \Lambda(t)=(\lambda_{i,j}(t))_{1\leq i,j\leq P}, and in the case of the Lagrangian approach \Lambda(t) is the P\times P\times P tensor \Lambda(t)=(\lambda^{(i)}_{j,k}(t))_{1\leq i,j,k\leq P}. For example, in the case of the Eulerian model (22), the matrix \mathrm{M} reads

\mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t))=-\operatorname{diag}(\vec{{\textbf{S}}}(t))\,\Lambda^{T}(t)\operatorname{diag}(\vec{{\textbf{I}}}(t))=-(S_{i}\lambda_{j\to i}I_{j})_{1\leq i,j\leq P}. (31)
Proposition 3.1.

If \mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t)) is invertible for all t\in[0,T], then there exists a unique minimizer (\vec{{\bm{\beta}}}^{*},\vec{{\bm{\gamma}}}^{*}) to problem (30).

Proof.

Since we assume that \mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t)) is invertible for every t\in[0,T], we can set

\begin{cases}\vec{{\bm{\beta}}}^{*}(t)&\coloneqq\mathrm{M}^{-1}(t)\frac{\mathrm{d}\vec{{\textbf{S}}}}{\mathrm{d}t}\\ \vec{{\bm{\gamma}}}^{*}(t)&\coloneqq\operatorname{diag}^{-1}({\textbf{I}}(t))\frac{\mathrm{d}\vec{{\textbf{R}}}}{\mathrm{d}t}\end{cases} (32)

or equivalently

\begin{cases}\vec{{\bm{\beta}}}^{*}(t)&\coloneqq\mathrm{M}^{-1}(t)\frac{\mathrm{d}\vec{{\textbf{S}}}}{\mathrm{d}t}\\ \vec{{\bm{\gamma}}}^{*}(t)&\coloneqq-\operatorname{diag}^{-1}({\textbf{I}}(t))\left(\frac{\mathrm{d}\vec{{\textbf{I}}}}{\mathrm{d}t}+\mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t))\vec{{\bm{\beta}}}^{*}\right)\end{cases} (33)

so that

(\vec{{\textbf{S}}}_{\text{obs}},\vec{{\textbf{I}}}_{\text{obs}},\vec{{\textbf{R}}}_{\text{obs}})=\texttt{Multiregional\_SIR}\left(\vec{{\bm{\beta}}}^{*},\vec{{\bm{\gamma}}}^{*},[0,T]\right)

and

\mathcal{J}(\vec{{\bm{\beta}}}^{*},\vec{{\bm{\gamma}}}^{*}\,|\,\vec{{\textbf{I}}}_{\text{obs}},\vec{{\textbf{R}}}_{\text{obs}},[0,T])=0,

which implies that J^{*}=0. ∎

Before going further, let us comment on the invertibility of \mathrm{M}(\Lambda(t),\vec{{\textbf{S}}}(t),\vec{{\textbf{I}}}(t)), which is required in Proposition 3.1. A sufficient condition is that the matrix is strictly diagonally dominant, row-wise or column-wise. This yields certain conditions on the mobility matrix \Lambda(t) with respect to the values of \vec{{\textbf{S}}}(t) and \vec{{\textbf{I}}}(t). For example, if \mathrm{M} is defined as in equation (31), the matrix is row-wise diagonally dominant if, for every 1\leq i\leq P,

\lambda_{i\to i}>\sum_{j\neq i}\lambda_{j\to i}\frac{I_{j}}{I_{i}}.

Similarly, if, for every 1\leq j\leq P,

\lambda_{j\to j}>\sum_{i\neq j}\lambda_{j\to i}\frac{S_{i}}{S_{j}},

then the matrix is column-wise diagonally dominant, which also guarantees invertibility. Note that both conditions are satisfied in situations with little or no mobility, where \lambda_{i\to j}\approx\delta_{i,j}.
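As an illustration of the construction (32), the following sketch (a Python snippet written for this discussion, not the authors' implementation) recovers time-dependent \vec{{\bm{\beta}}}^{*} and \vec{{\bm{\gamma}}}^{*} from daily regional trajectories by finite differences, using the Eulerian matrix \mathrm{M} of (31); it assumes that \mathrm{M} is invertible at every time step, for instance through the diagonal dominance conditions above, and the array shapes are illustrative choices.

import numpy as np

def fit_beta_gamma(S, I, R, Lam, dt=1.0):
    # S, I, R: arrays of shape (T+1, P) of daily compartmental densities.
    # Lam: array of shape (T+1, P, P) with Lam[t, a, b] = lambda_{a -> b}(t).
    T1, P = S.shape
    beta = np.zeros((T1 - 1, P))
    gamma = np.zeros((T1 - 1, P))
    for t in range(T1 - 1):
        dS = (S[t + 1] - S[t]) / dt          # forward finite difference
        dR = (R[t + 1] - R[t]) / dt
        # M of (31): M_{ij} = -S_i * lambda_{j -> i} * I_j
        M = -np.diag(S[t]) @ Lam[t].T @ np.diag(I[t])
        beta[t] = np.linalg.solve(M, dS)     # beta*(t) = M^{-1} dS/dt
        gamma[t] = dR / I[t]                 # gamma*(t) = diag(I)^{-1} dR/dt
    return beta, gamma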

Now that we have fully defined the setup for the multi-regional case, we can follow the same steps as in Section 2.2 to derive forecasts involving model reduction for the time-dependent variables \vec{{\bm{\beta}}} and \vec{{\bm{\gamma}}}.

4 Numerical results

In this section we apply our forecasting method to the COVID-19 pandemic in France, which started approximately in February 2020. Particular emphasis is put on the first pandemic wave, for which we consider the period going from March 19 to May 20, 2020. Due to the lockdown imposed between March 17 and May 11, inter-regional population mobility was drastically reduced in that period. Studies using anonymised Facebook data have estimated the reduction at about 80% (see [3]). As a result, it is reasonable to treat each region independently from the rest, and we apply the mono-regional setting of Section 2. Here, we focus on the case of the Paris region, and we report the forecasting errors obtained using the methods described in Section 2. Some forecasts are also shown for the second wave in the Paris region, between September 24 and November 25.

The numerical results are presented as follows. Section 4.1 explains the sources of health data. Sections 4.2.1 and 4.2.2 go straight to the core of the results and present a study of the forecasting power of the methods on a two-week horizon. Section 4.2.3 displays forecasts on the second wave. Section 4.2.4 aims to illustrate the robustness of the forecasting over longer periods of time. A discussion of the fitting errors of the methods is given in Appendix A. Additional results highlighting the accuracy of the forecasts are shown in Appendix B.

4.1 Data

We use public data from Santé Publique France (https://www.data.gouv.fr/en/datasets/donnees-hospitalieres-relatives-a-lepidemie-de-covid-19/) to get the number I_{\text{obs}}(t) of infected and R_{\text{obs}}(t) of removed people. As shown in Figure 3, the raw data present some oscillations at the scale of the week, which are due to administrative delays in the official reporting of cases by hospitals. For our methodology, we have smoothed the data by applying a 7-day moving average filter. In order to account for the total number of infected people, we also multiply the data by an adjustment factor f_{\rm adj}=15 hinted from the literature. Indeed, it is said in [24] that "of the 634 confirmed cases, a total of 306 and 328 were reported to be symptomatic and asymptomatic", and [10] claims that the probability of developing severe symptoms for a symptomatic patient is 0 for children, 0.1 for adults and 0.2 for seniors. Thus, taking p=0.13 as an approximate value of these probabilities, one may estimate the adjustment factor as f_{\rm adj}=\frac{634}{328}\times\frac{1}{p}\approx 15. Obviously, this factor is uncertain and could be improved in the light of further retrospective studies of the outbreak. However, note that when S\simeq N, which is the case at the start of the epidemic, the impact of this factor on the dynamics is negligible, as can be understood from (6). In addition, since we use the same factor to scale back down and provide a forecast on the hospitalized, the influence of this choice is minor.
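A possible implementation of this preprocessing is sketched below (Python/pandas); the centered window and the hypothetical input series raw_I are assumptions, and only the 7-day averaging and the factor f_adj = 15 come from the text.

import pandas as pd

def preprocess(raw: pd.Series, f_adj: float = 15.0) -> pd.Series:
    # 7-day moving average (centered window assumed) followed by the
    # adjustment factor accounting for asymptomatic and non-hospitalized cases.
    smoothed = raw.rolling(window=7, center=True, min_periods=1).mean()
    return f_adj * smoothed

# e.g. I_obs = preprocess(raw_I), with raw_I the daily hospital counts.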

Figure 3: Data from t_{0}=19/03/2020 to T=20/05/2020. (a) Infected; (b) Removed.

4.2 Results

Using the observations I_{\text{obs}}(t) and R_{\text{obs}}(t), we apply a finite difference scheme in formula (6) to derive \beta^{*}_{\text{obs}}(t) and \gamma^{*}_{\text{obs}}(t) for t\in[0,T]. Figure 4 shows the values of these parameters as well as the basic reproduction number R^{*}_{0,\text{obs}}=\beta_{\text{obs}}^{*}/\gamma_{\text{obs}}^{*} for the first pandemic wave in Paris.
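For the single region, this finite-difference recovery can be sketched as follows (formula (6) is not reproduced here; the snippet assumes the standard SIR relations dS/dt = -\beta SI/N and dR/dt = \gamma I, with S = N - I - R):

import numpy as np

def observed_parameters(I_obs, R_obs, N, dt=1.0):
    # Recover beta*_obs, gamma*_obs and R0*_obs from the observed daily series.
    S_obs = N - I_obs - R_obs
    dS = np.gradient(S_obs, dt)
    dR = np.gradient(R_obs, dt)
    beta_obs = -N * dS / (S_obs * I_obs)
    gamma_obs = dR / I_obs
    return beta_obs, gamma_obs, beta_obs / gamma_obs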

Figure 4: \beta_{\text{obs}}^{*}, \gamma_{\text{obs}}^{*} and R^{*}_{0,\text{obs}}=\beta_{\text{obs}}^{*}/\gamma_{\text{obs}}^{*} deduced from the data from t_{0}=19/03/2020 to T=20/05/2020. (a) \beta_{\rm obs}^{*}; (b) \gamma_{\rm obs}^{*}; (c) R^{*}_{0,\text{obs}}.

We next follow the steps of Section 2.2 to obtain the forecasts. In the learning phase (step 1), we use two parametric detailed models of type SE2IUR and SEI5CHRD to generate training sets \mathcal{B}_{\text{tr}} and \mathcal{G}_{\text{tr}} composed of K=2618 training functions {\bm{\beta}}(\mu) and {\bm{\gamma}}(\mu), where the parameters \mu are uniformly sampled in the set \mathcal{P}_{\text{tr}} in the vicinity of the parameter values suggested in the literature [10, 22]. Based on these training sets, we finish step 1 by building three types of reduced models: SVD, NMF and ENG (see Section 2.4).
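For the SVD reduced model, step 1 essentially amounts to a truncated singular value decomposition of the snapshot matrix of training functions; a minimal sketch is given below (the matrix beta_training, with one time-discretized \beta(\mu) per column, is a hypothetical input).

import numpy as np

def svd_reduced_basis(snapshots: np.ndarray, n: int) -> np.ndarray:
    # snapshots: shape (n_time_steps, K), one training function per column.
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n]          # the first n left singular vectors span B_n

# e.g. B_n = svd_reduced_basis(beta_training, n=10)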

Given the reduced bases \mathrm{B}_{n} and \mathrm{G}_{n}, we next search for the optimal {\bm{\beta}}^{*}_{n}\in\mathrm{B}_{n} and {\bm{\gamma}}^{*}_{n}\in\mathrm{G}_{n} that best fit the observations (step 2 of our procedure). For this fitting step we consider two loss functions:

  1. routine-IR: loss function \mathcal{J}({\bm{\beta}},{\bm{\gamma}}\;|\;{\textbf{I}}_{\text{obs}},{\textbf{R}}_{\text{obs}},[0,T]) from (4),

  2. routine-{\bm{\beta}}{\bm{\gamma}}: loss function \widetilde{\mathcal{J}}({\bm{\beta}},{\bm{\gamma}}\,|\,{\bm{\beta}}^{*}_{\text{obs}},{\bm{\gamma}}^{*}_{\text{obs}},[0,T]) from (13).

We study the performance of each of the three reduced models and the impact of the dimension n of the reduced model on the fitting error. The presentation of these results is deferred to Appendix A in order not to overload the main discussion. The main conclusion is that the fitting strategy using SVD reduced bases provides smaller errors than NMF and ENG, especially when we increase the number of modes n. This is illustrated in Figure 5, where we show the fittings obtained with routine-{\bm{\beta}}{\bm{\gamma}} and n=10 for the first wave (from t_{0}=19/03/2020 to T=20/05/2020). We observe that SVD is the best at fitting \beta^{*}_{\text{obs}} and \gamma^{*}_{\text{obs}}, while ENG produces a smoother fitting of the data. Although the smoother fitting with ENG yields larger fitting errors than SVD, we will see in the next section that it yields better forecasts.
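In the simplest case of a linear reduced space such as SVD, routine-{\bm{\beta}}{\bm{\gamma}} reduces to a least-squares projection of \beta^{*}_{\text{obs}} onto the reduced basis, as in the sketch below; the constrained fits used for NMF and ENG are not shown here.

import numpy as np

def fit_in_reduced_basis(B_n: np.ndarray, beta_obs: np.ndarray) -> np.ndarray:
    # B_n: shape (n_time_steps, n); beta_obs: shape (n_time_steps,).
    coeffs, *_ = np.linalg.lstsq(B_n, beta_obs, rcond=None)
    return B_n @ coeffs      # fitted time series beta*_n(t)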

Figure 5: Fitting from t_{0}=19/03/2020 to T=20/05/2020. (a) \beta; (b) \gamma.

4.2.1 Forecasting for the first pandemic wave on a 14 day horizon

In this section we illustrate the short-term forecasting behavior of our method. We consider a forecasting window of \tau=14 days and we examine several different starting days in the course of the first pandemic wave. The results are shown in Figures 6 to 14. Recall that the forecasting uses the coefficients of the reduced basis obtained by the fitting procedure, but also the optimal initial condition of the forecast that minimizes the error on the three days prior to the start of the forecast. For each given fitting strategy (routine-IR, routine-{\bm{\beta}}{\bm{\gamma}}) and each given type of reduced model (SVD, NMF, ENG), we have chosen to plot an average forecast computed with predictions using reduced dimensions n\in\{5,6,7,8,9,10\}. This choice is a simple type of forecast combination, but of course other more elaborate aggregation options could be considered (a minimal sketch of this averaging is given after the list below). The labels of the plots correspond to the following:

  • I_{SVD}, I_{NMF}, I_{ENG}, R_{SVD}, R_{NMF}, R_{ENG} are the average forecasts obtained using routine-{\bm{\beta}}{\bm{\gamma}}.

  • I^{*}_{SVD}, I^{*}_{NMF}, I^{*}_{ENG}, R^{*}_{SVD}, R^{*}_{NMF}, R^{*}_{ENG} are the average forecasts obtained using routine-IR.
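A minimal sketch of this forecast combination is given below; forecast_I is a hypothetical helper returning the 14-day-ahead infected curve obtained with a reduced model of dimension n starting at date T.

import numpy as np

def averaged_forecast(forecast_I, T, tau=14, dims=range(5, 11)):
    # Simple forecast combination: average of the predictions for n = 5,...,10.
    return np.mean([forecast_I(n, T, tau) for n in dims], axis=0)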

The main observation from Figures 6 to 14 is that the ENG reduced model is the most robust and accurate forecasting method. Fitting ENG with routine-IR or routine-{\bm{\beta}}{\bm{\gamma}} does not seem to lead to large differences in the quality of the forecasts, but routine-{\bm{\beta}}{\bm{\gamma}} seems to provide slightly better results. This claim is further confirmed by the study of the numerical forecasting errors of the different methods given in Appendix B.

Figures 6 to 14 also show that the SVD reduced model is very unstable and provides forecasts that blow up. This behavior illustrates the dangers of overfitting, namely that a method with high fitting power may present poor forecasting power due to instabilities. In the case of SVD, the instabilities stem from the fact that approximations are allowed to take negative values. This is the reason why NMF, which incorporates the nonnegativity constraint, performs better than SVD. One of the reasons why ENG outperforms NMF lies in the enlargement of the cone of nonnegative functions (see Section 2.4). It is important to note that, with ENG, the reduced bases are directly related to well-chosen virtual scenarios, whereas SVD and NMF rely on matrix factorization techniques that provide purely artificial bases. This makes forecasts from ENG more realistic and therefore more reliable.

Figure 6: 14 day forecasts starting from T=01/04. (a) Infected; (b) Removed.
Figure 7: 14 day forecasts starting from T=03/04. (a) Infected; (b) Removed.
Figure 8: 14 day forecasts starting from T=05/04. (a) Infected; (b) Removed.
Figure 9: 14 day forecasts starting from T=07/04. (a) Infected; (b) Removed.
Figure 10: 14 day forecasts starting from T=09/04. (a) Infected; (b) Removed.
Figure 11: 14 day forecasts starting from T=11/04. (a) Infected; (b) Removed.
Figure 12: 14 day forecasts starting from T=15/04. (a) Infected; (b) Removed.
Figure 13: 14 day forecasts starting from T=21/04. (a) Infected; (b) Removed.
Figure 14: 14 day forecasts starting from T=05/05. (a) Infected; (b) Removed.

4.2.2 Focus on the forecasting with ENG

For our best forecasting method (routine-{\bm{\beta}}{\bm{\gamma}} using ENG), we plot in Figures 15 to 23 the forecasts for each dimension n=5 to 10. The plots give the forecasts on a 14 day ahead window for \beta, \gamma and the resulting evolution of the infected I and removed R. We see that the method performs reasonably well for all values of n, which shows that the averaged forecasts of the previous section are not compensating for spurious effects that could occur for particular values of n. We intentionally chose to display the inaccurate forecasts from 03/04, 07/04 and 11/04, as they are among the worst predictions obtained with this method. It is however important to mention that, despite their lack of accuracy, these forecasts remain plausible epidemic behaviors, with evolutions of \beta and \gamma that differ from the actual ones but remain realistic. Note that the method was able to predict the peak of the epidemic several days in advance (see Figure 15). We also observe that the prediction of \gamma is difficult at all times due to the oscillatory behavior of \gamma^{*}_{\text{obs}}. Despite this difficulty, the resulting forecasts for I and R are in general very satisfactory.

Figure 15: ENG forecast from T=01/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 16: ENG forecast from T=03/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 17: ENG forecast from T=05/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 18: ENG forecast from T=07/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 19: ENG forecast from T=09/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 20: ENG forecast from T=11/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 21: ENG forecast from T=15/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 22: ENG forecast from T=21/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 23: ENG forecast from T=05/05. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.

4.2.3 Forecasting of the second wave with ENG

The review of the present paper took place during the month of November 2020, while the second COVID-19 pandemic wave hit France. We have taken this opportunity to enlarge the body of numerical results, and we provide some example forecasts with ENG for this wave in Figures 24 to 26. As the figures illustrate, the method provides very satisfactory forecasts on a 14 day ahead window. We again observe a satisfactory prediction of the second peak (Figures 24, 25 and 26) and the same difficulty in forecasting \gamma due to the oscillations in \gamma_{\text{obs}}, but this does not greatly impact the quality of the forecasts for I and R.

Figure 24: ENG forecast from T=28/10. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 25: ENG forecast from T=03/11. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 26: ENG forecast from T=09/11. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.

4.2.4 Forecasts with ENG on a 28-day-ahead window

To conclude this results section, we extend the forecasting window to 28 days instead of 14 and study whether the introduced ENG method still provides satisfactory forecasts. As shown in Figures 27 to 32, the results of the method are quite stable for large windows. This shows that, in contrast to standard extrapolation methods using classical linear or affine regressions, the reduced basis captures the dynamics of {\bm{\beta}} and {\bm{\gamma}} not only locally but also over extended time intervals.

Figure 27: ENG forecast from T=01/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 28: ENG forecast from T=05/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 29: ENG forecast from T=05/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 30: ENG forecast from T=15/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 31: ENG forecast from T=21/04. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.
Figure 32: ENG forecast from T=28/10. (a) \beta; (b) \gamma; (c) Infected; (d) Removed.

5 Conclusion

We have developed an epidemiological forecasting method based on reduced modeling techniques. Among the different strategies that have been explored, the one that outperforms the rest in terms of robustness and forecasting power involves reduced models built with an Enlarged Nonnegative Greedy (ENG) strategy. This method is novel and of interest in itself since it makes it possible to build reduced models that preserve positivity and even other types of bounds. Although ENG does not have optimal fitting properties (i.e., interpolation properties), it is well suited for forecasting since, thanks to the preservation of the natural constraints on the coefficients, it generates realistic dynamics with few modes. The results have been presented for a mono-regional test case, and we plan to extend the present methodology to a multi-regional setting using mobility data.

Last but not least, we would like to emphasize that the developed strategy is very general, and could be applied to the forecasting of other types of dynamics. The specificity of each problem may however require adjustments in the reduced models. This is exemplified in the present problem through the construction of reduced models that preserve positivity.

Acknowledgments and disclosure of funding

We would like to thank our colleagues from the “Face au virus” initiative in PSL University: Jamal Atif, Laurent Massoulié, Olivier Cappé, and Akin Kazakcy. We also thank Gabriel Turinici for his feedback on the multiregional models. Part of this research has been supported by the Emergences project grant “Models and Measures” of the Paris city council, by the generous donation made available by Alkan together with the complementary funding given by the Sorbonne University Foundation. This research is done in the frame of the project Pandemia/Covidia and also the GIS Obepine, both projects aiming at a better understanding of the Covid-19 pandemic.

Appendix A Analysis of model error and observation noise

In this section we study the impact of observation noise and model error on the quality and behavior of the fitting step. The elements that impact the accuracy of our procedure are the following:

  1. Observation noise: The observed health data {\textbf{I}}_{\text{obs}} and {\textbf{R}}_{\text{obs}} present several sources of noise: there may be some inaccuracies in the reporting of cases, and the data are then post-processed. This eventually yields noisy time series {\textbf{I}}_{\text{obs}} and {\textbf{R}}_{\text{obs}} for which it is difficult to estimate the uncertainty. These noisy data are then used to produce (through finite differences) \beta^{*}_{\text{obs}} and \gamma^{*}_{\text{obs}}, which are in turn also noisy.

  2. Model errors: Two types of model errors are identified:

    (a) Intrinsic model error on \mathcal{B} and \mathcal{G}: The families of detailed models that we use are rich, but they may not cover all possible evolutions of {\textbf{I}}_{\text{obs}} and {\textbf{R}}_{\text{obs}}. In other words, our manifolds \mathcal{B} and \mathcal{G} may not perfectly cover the real epidemiological evolution. This error motivated the introduction of the exponential functions b_{0} and g_{0} described in Section 2.4.

    (b) Sampling error of \mathcal{B} and \mathcal{G}: The size of the training sets \mathcal{B}_{\text{tr}} and \mathcal{G}_{\text{tr}} and the dimension n of the reduced models \mathrm{B}_{n} and \mathrm{G}_{n} also limit the approximation capabilities of the continuous target sets \mathcal{B} and \mathcal{G}.

In this section, we aim at disentangling these effects in order to give better insight into the behavior of the fitting error on the real data.

A.1 Study of the impact of the sampling error: Fitting a virtual scenario

Our set of virtual epidemiological dynamics is \mathcal{U}. After the collapsing step, the manifolds for {\bm{\beta}} and {\bm{\gamma}} are \mathcal{B} and \mathcal{G}. These sets are potentially infinite, and we have used finite training subsets \mathcal{B}_{\text{tr}}\subseteq\mathcal{B} and \mathcal{G}_{\text{tr}}\subseteq\mathcal{G} to build the reduced models \mathrm{B}_{n} and \mathrm{G}_{n}.

First we look at the eigenvalues for \beta and \gamma when performing an SVD of the virtual scenarios. Figure 33 shows a rapid decay of the eigenvalues, which indicates that we can obtain a very good approximation of the elements of \mathcal{B}_{\text{tr}}\subseteq\mathcal{B} and \mathcal{G}_{\text{tr}}\subseteq\mathcal{G} with only a few modes.

Figure 33: Training step: decay of the SVD eigenvalues. (a) Normalised eigenvalues for \beta vs n; (b) normalised eigenvalues for \gamma vs n.

Among the sources of noise and model error given at the beginning of Appendix A, we can study the impact of the sampling error (point 2-b) as follows. We start by examining the fitting error on an interval of 45 days for two functions {\bm{\beta}} and {\bm{\gamma}} which belong to the manifolds \mathcal{B} and \mathcal{G} and are in the training sets \mathcal{B}_{\text{tr}} and \mathcal{G}_{\text{tr}}. This error will serve as a benchmark against which we will compare the fitting errors of functions that are not in the training sets.

Figure 34 shows the relative fitting errors \|\beta^{*}_{n}-\beta^{*}_{\text{obs}}\|_{L^{2}([t_{0},T])}/\|\bar{\beta}^{*}_{\text{obs}}\|_{L^{2}([t_{0},T])} and \|\gamma^{*}_{n}-\gamma^{*}_{\text{obs}}\|_{L^{2}([t_{0},T])}/\|\bar{\gamma}^{*}_{\text{obs}}\|_{L^{2}([t_{0},T])} using SVD, NMF and ENG reduced bases with n=2,\dots,20, where the notation \bar{x} denotes the mean of the quantity x over [0,T]. We observe that for SVD the fitting errors behave similarly to the error decay of the training step (Figure 33).
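For reference, this relative error can be computed as sketched below (a Python snippet under the stated normalization, where the denominator is the L^{2} norm of the constant function equal to the mean of the observed signal over the fitting window).

import numpy as np

def relative_l2_error(fit, obs, dt=1.0):
    # || fit - obs ||_{L2} / || mean(obs) ||_{L2} on a uniform time grid.
    num = np.sqrt(dt * np.sum((fit - obs) ** 2))
    den = abs(obs.mean()) * np.sqrt(dt * obs.size)
    return num / den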

Figure 34: Study of sampling errors: fitting errors of functions {\bm{\beta}} and {\bm{\gamma}} from \mathcal{B} and \mathcal{G} that belong to the training sets \mathcal{B}_{\text{tr}} and \mathcal{G}_{\text{tr}}. (a) L^{2} relative error of \beta vs n; (b) L^{2} relative error of \gamma vs n.

Figure 35 shows the L^{1} and L^{\infty} errors obtained after the propagation of the fitted {\bm{\beta}} and {\bm{\gamma}}. In both metrics, the error for I and R obtained using SVD decreases below 10^{-12}. When NMF and ENG are used, the error decreases and stagnates at 10^{-2} for both I and R.

Figure 35: Study of sampling errors: errors on I and R obtained by a SIR model using the fitted {\bm{\beta}} and {\bm{\gamma}}. (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.

Now we consider the fitting error for two functions {\bm{\beta}} and {\bm{\gamma}} which belong to the manifolds \mathcal{B} and \mathcal{G} but which are not in the training sets \mathcal{B}_{\text{tr}} and \mathcal{G}_{\text{tr}}. Figure 36 shows the fitting error for this virtual scenario using SVD, NMF and ENG reduced bases for n=2,\dots,20. We note that the quality of the fittings by each method is very similar to that of Figure 34, where the functions were taken in the training sets. This illustrates that the sampling error does not play a major role in our experiments.

Figure 36: Study of sampling errors: fitting errors of functions {\bm{\beta}} and {\bm{\gamma}} from \mathcal{B} and \mathcal{G} which do not belong to the training sets \mathcal{B}_{\text{tr}} and \mathcal{G}_{\text{tr}}. (a) L^{2} relative error of \beta vs n; (b) L^{2} relative error of \gamma vs n.

Figure 37 shows the L^{1} and L^{\infty} errors obtained after the propagation of the fitted {\bm{\beta}} and {\bm{\gamma}}. In both metrics, the error for I and R obtained using SVD decreases to reach 10^{-14}. When NMF and ENG are used, the error decreases and stagnates at 10^{-3} and 10^{-4} respectively, for both I and R.

Figure 37: Study of sampling errors: errors on I and R obtained by a SIR model using the fitted {\bm{\beta}} and {\bm{\gamma}}. (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.

A.2 Study of the impact of noisy data and intrinsic model error

To investigate the impact of noise in the observed data, we now add noise to the two previously chosen functions {\bm{\beta}}\in\mathcal{B} and {\bm{\gamma}}\in\mathcal{G}, and we study the fitting error on these noisy data. The noise level has been chosen to match the one estimated for the real dynamics. To estimate the noise, we fitted the real data using SVD reduced bases and defined the noise as the difference between this fit and the real data. This level of noise is then added to the virtual scenario considered here. Figure 38 shows the fitting error on {\bm{\beta}} and {\bm{\gamma}} using SVD, NMF and ENG reduced bases for n=2,\dots,20.

Figure 38: Fitting errors of {\bm{\beta}} and {\bm{\gamma}} on the noisy virtual scenario. (a) L^{2} relative error of {\bm{\beta}} vs n; (b) L^{2} relative error of {\bm{\gamma}} vs n.

We observe that the noise strongly deteriorates the fitting error obtained using NMF: the error becomes oscillatory and very unstable. For ENG, the error remains low and consistently around 10^{-2} for {\bm{\beta}}. We observe the same behaviour for {\bm{\gamma}}, with instabilities arising for n>10. For SVD, the error is lower than in the ENG case and slowly decreases as n increases. Figure 39 shows the L^{1} and L^{\infty} errors obtained after the propagation of the fitted {\bm{\beta}} and {\bm{\gamma}}. In line with the observations from Figure 38, the error obtained using NMF is very unstable. Using ENG, we observe a decay from n=2 to n=7 and oscillations that remain around 10^{-2} for I and 10^{-3} for R. The decay observed for SVD is slow and steady; the error nearly reaches 10^{-4} for I and 10^{-5} for R when n=20.

Figure 39: Fitting errors of I and R on noisy data. (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.

Finally, it remains to add the intrinsic model error on top of the previous sampling error and observation noise. If we do so, the main change is that the fitting error plots of Figure 38 keep essentially the same behavior, but the error values increase depending on the degree of model inaccuracy. We have therefore disentangled the effects of model error and noisy data, and the observations of this section give a better insight into the fitting on the real data.

Figure 40 summarizes the fitting results for an example fitting period taken from t_{0}=19/03 to T=03/05. Figures 40(a) and 40(b) show the fitting errors on {\bm{\beta}} and {\bm{\gamma}} using SVD, NMF and ENG reduced bases for n=2,\dots,20. Figures 40(c) and 40(d) show the L^{1} and L^{\infty} relative errors on {\textbf{I}}_{n} and {\textbf{R}}_{n} after the propagation of the fitted \beta^{*}_{n} and \gamma^{*}_{n}.

Figure 40: Fitting errors from t_{0}=19/03/2020 to T=03/05/2020. (a) L^{2} relative error of {\bm{\beta}} vs n; (b) L^{2} relative error of {\bm{\gamma}} vs n; (c) L^{1} relative error of I vs n; (d) L^{\infty} relative error of R vs n.

From Figures 40(a) and 40(b), we observe that the fitting error with SVD decreases at a moderate rate as the dimension n of the reduced basis is increased. The error with NMF and ENG does not decrease and oscillates around a constant value of order 10^{-1}. Note that this value is small and yields small errors in the approximation of I and R, as Figures 40(c) and 40(d) illustrate.

Appendix B Study of forecasting errors

In this section we give a detailed study of the forecasting errors for each fitting strategy (routine-IR, routine-{\bm{\beta}}{\bm{\gamma}}), each reduced model (SVD, NMF, ENG) and each starting date T. We anticipate the main conclusion, already announced in Section 4.2.1: ENG fitted with routine-{\bm{\beta}}{\bm{\gamma}} outperforms the other reduced models and is the most robust and accurate reduced model to use in a forecasting strategy. Figure 41 shows the relative errors of a 14 day forecast from T=01/04 for each forecasting method and each reduced basis. Similarly, Figures 42 to 49 show the relative forecast errors from other starting dates.

Figure 41: Forecasting errors of I and R (from T=01/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 42: Forecasting errors of I and R (from T=03/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 43: Forecasting errors of I and R (from T=05/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 44: Forecasting errors of I and R (from T=07/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 45: Forecasting errors of I and R (from T=09/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 46: Forecasting errors of I and R (from T=11/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 47: Forecasting errors of I and R (from T=15/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 48: Forecasting errors of I and R (from T=21/04). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.
Figure 49: Forecasting errors of I and R (from T=05/05). (a) L^{1} relative error of I vs n; (b) L^{\infty} relative error of R vs n.

We observe that the quality of the forecast depends on the reduced basis, but also strongly on the starting day T from which the forecast is made. The forecasts using routine-{\bm{\beta}}{\bm{\gamma}} with SVD and NMF are not accurate and most of the time explode. When routine-IR is used with SVD and NMF, the forecasts are more robust, as no explosion of the error is observed. Reduced bases from ENG consistently lead to the best forecasts, whether routine-{\bm{\beta}}{\bm{\gamma}} or routine-IR is used; by inspecting the errors in Figures 41 to 49 and the averaged forecasts obtained in Section 4.2.1, we conclude that routine-{\bm{\beta}}{\bm{\gamma}} with ENG reduced bases provides slightly better forecasts.

References

  • [1] Anastassopoulou, C., Russo, L., Tsakris, A. and Siettos, C. Data-based analysis, modelling and forecasting of the COVID-19 outbreak. PloS one, 15(3), p. e0230405, 2020.
  • [2] Armocida, B., Formenti, B., Ussai, S., Palestra, F. and Missoni, E. The Italian health system and the COVID-19 challenge. The Lancet Public Health, 5(5), p. e253, 2020.
  • [3] Atif, J., Cappé, O., Kazakçi, A., Léo, Y., Massoulié, L. and Mula, O. Initiative face au virus Observations sur la mobilité pendant l’épidémie de Covid-19. Technical report, PSL University, May 2020. URL https://hal.archives-ouvertes.fr/hal-02921194.
  • [4] Benner, P., Cohen, A., Ohlberger, M. and Willcox, K. Model Reduction and Approximation: Theory and Algorithms, volume 15. SIAM, 2017.
  • [5] Binev, P., Cohen, A., Dahmen, W., DeVore, R., Petrova, G. and Wojtaszczyk, P. Convergence Rates for Greedy Algorithms in Reduced Basis Methods. SIAM Journal on Mathematical Analysis, 43(3), pp. 1457–1472, 2011. URL http://dx.doi.org/10.1137/100795772.
  • [6] Brauer, F. Mathematical epidemiology: Past, present, and future. Infectious Disease Modelling, 2(2), pp. 113–127, 2017.
  • [7] Ceylan, Z. Estimation of COVID-19 prevalence in Italy, Spain, and France. Science of The Total Environment, p. 138817, 2020.
  • [8] Cohen, A. and DeVore, R. Kolmogorov widths under holomorphic mappings. IMA Journal of Numerical Analysis, 36(1), pp. 1–12, 2016. URL http://dx.doi.org/10.1093/imanum/dru066.
  • [9] DeVore, R., Petrova, G. and Wojtaszczyk, P. Greedy algorithms for reduced bases in Banach spaces. Constructive Approximation, 37(3), pp. 455–466, 2013.
  • [10] Di Domenico, L., Pullano, G., Sabbatini, C. E., Boëlle, P.-Y. and Colizza, V. Expected impact of lockdown in Île-de-France and possible exit strategies. medRxiv, 2020.
  • [11] Fang, X., Liu, W., Ai, J., He, M., Wu, Y., Shi, Y., Shen, W. and Bao, C. Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province, China. BMC Infectious Diseases, 20(1), pp. 1–8, 2020.
  • [12] Ferguson, N., Laydon, D., Nedjati Gilani, G., Imai, N., Ainslie, K., Baguelin, M., Bhatia, S., Boonyasiri, A., Cucunuba Perez, Z., Cuomo-Dannenburg, G. et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. 2020. URL https://www.if.ufrgs.br/if/wp-content/uploads/Imperial-College-COVID19-NPI-modelling-16-03-2020.pdf.
  • [13] Flaxman, S., Mishra, S., Gandy, A., Unwin, H. J. T., Mellan, T. A., Coupland, H., Whittaker, C., Zhu, H., Berah, T., Eaton, J. W. et al. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature, 584(7820), pp. 257–261, 2020.
  • [14] Gillis, N. The why and how of nonnegative matrix factorization. Regularization, optimization, kernels, and support vector machines, 12(257), pp. 257–291, 2014.
  • [15] Hesthaven, J. S., Rozza, G. and Stamm, B. Certified reduced basis methods for parametrized partial differential equations, volume 590. Springer, 2016.
  • [16] Keeling, M. J. and Rohani, P. Modeling infectious diseases in humans and animals. Princeton University Press, 2011.
  • [17] Kermack, W. O., McKendrick, A. G. and Walker, G. T. A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 115(772), pp. 700–721, 1927. URL http://dx.doi.org/10.1098/rspa.1927.0118.
  • [18] Liu, Z., Magal, P., Seydi, O. and Webb, G. A COVID-19 epidemic model with latency period. Infectious Disease Modelling, 5, pp. 323–337, 2020.
  • [19] Liu, Z., Magal, P., Seydi, O. and Webb, G. Predicting the cumulative number of cases for the COVID-19 epidemic in China from early data. arXiv preprint arXiv:2002.12298, 2020.
  • [20] Maday, Y., Mula, O. and Turinici, G. Convergence analysis of the Generalized Empirical Interpolation Method. SIAM Journal on Numerical Analysis, 54(3), pp. 1713–1731, 2016. URL http://dx.doi.org/10.1137/140978843.
  • [21] Maday, Y. and Patera, A. Reduced basis methods. In P. Benner, S. Grivet-Talocia, A. Quarteroni, G. Rozza, W. Schilders and L. Silveira (editors), Model Order Reduction, chapter 4, pp. 139–179. De Gruyter, Oxford, 16 Dec. 2020.
  • [22] Magal, P. and Webb, G. Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany. Medrxiv, 2020. URL https://doi.org/10.1101/2020.03.21.20040154.
  • [23] Martcheva, M. An introduction to mathematical epidemiology, volume 61. Springer, 2015.
  • [24] Mizumoto, K., Kagaya, K., Zarebski, A. and Chowell, G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance, 25(10), p. 2000180, 2020.
  • [25] Paatero, P. and Tapper, U. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2), pp. 111–126, 1994. URL https://doi.org/10.1002/env.3170050203.
  • [26] Poncela, P., Rodríguez, J., Sánchez-Mangas, R. and Senra, E. Forecast combination through dimension reduction techniques. International Journal of Forecasting, 27(2), pp. 224–237, April 2011. URL https://ideas.repec.org/a/eee/intfor/v27yi2p224-237.html.
  • [27] Quarteroni, A., Manzoni, A. and Negri, F. Reduced basis methods for partial differential equations: an introduction, volume 92. Springer, 2015.
  • [28] Roda, W. C., Varughese, M. B., Han, D. and Li, M. Y. Why is it difficult to accurately predict the COVID-19 epidemic? Infectious Disease Modelling, 2020.
  • [29] Roques, L., Klein, E. K., Papaïx, J., Sar, A. and Soubeyrand, S. Impact of Lockdown on the Epidemic Dynamics of COVID-19 in France. Frontiers in Medicine, 7, p. 274, 2020.