Time-continuous and time-discrete SIR models revisited: theory and applications

Since Kermack and McKendrick have introduced their famous epidemiological SIR model in 1927, mathematical epidemiology has grown as an interdisciplinary research discipline including knowledge from biology, computer science, or mathematics. Due to current threatening epidemics such as COVID-19, this interest is continuously rising. As our main goal, we establish an implicit time-discrete SIR (susceptible people–infectious people–recovered people) model. For this purpose, we first introduce its continuous variant with time-varying transmission and recovery rates and, as our first contribution, discuss thoroughly its properties. With respect to these results, we develop different possible time-discrete SIR models, we derive our implicit time-discrete SIR model in contrast to many other works which mainly investigate explicit time-discrete schemes and, as our main contribution, show unique solvability and further desirable properties compared to its continuous version. We thoroughly show that many of the desired properties of the time-continuous case are still valid in the time-discrete implicit case. Especially, we prove an upper error bound for our time-discrete implicit numerical scheme. Finally, we apply our proposed time-discrete SIR model to currently available data regarding the spread of COVID-19 in Germany and Iran.


Motivation
Since its outbreak in Wuhan (China) in December 2019, the quick spread of COVID-19 has threatened people worldwide. Politicians around the globe have to balance different interests and need to make tremendous decisions which impact our daily life. For these reasons, governments around the world heavily rely on scientific advice in the present situation. Thus, John Hopkins University has been collecting epidemiological data regarding COVID-19 from many countries during the last months [1,2]. Additionally, many biological and medical studies regarding different aspects of this new coronavirus have been rapidly appearing in scientific journals [3][4][5][6][7][8][9]. However, to estimate the impact of COVID-19, governments need forecasts from as many models as possible.
Kermack and McKendrick introduced the now well-known SIR model in one of mathematical epidemiology's most groundbreaking works in 1927 [10]. They assumed a fixed population size and divided this population into three different homogeneous groups of people, namely susceptible people, infectious people, and recovered people, excluding births, deaths, and deaths by disease from their model. Due to its success and simplicity, their work was reprinted in 1991 [11][12][13]. In upcoming decades, epidemiologists and mathematicians have developed many variants and extensions of this basic model by, for example, adding age or spatial structures [14][15][16][17][18][19][20].
After the outbreak of COVID-19, many scientists have recently published articles with emphasis on epidemic forecasts which strongly relate to mathematical models. Many approaches, mainly focusing on stochastic arguments, with respect to predicting forecasts of the total number of infected people have appeared during the last weeks [21][22][23][24][25][26] or in the past [27,28]. Recently, neural networks have been applied to forecasting [29] and, additionally, different authors such as Atangana, Baleanu, or Khan have used fractional differential equations in extended SIR-type models to investigate the spread of COVID-19 or mathematical biology in general [30][31][32][33][34].

Contributions
There are works with respect to SIR models [35][36][37] and their time-discrete versions [38]. However, one finds mainly explicit schemes with respect to time-discrete SIR models in the aforementioned works and references therein. For this reason, we summarize and extend some results on properties of the time-continuous classical SIR model, and we propose an implicit time-discrete version of this classical SIR model and prove that this timediscrete variant maintains many of time-continuous version's properties. Hence, the aim of our study is to propose a nonautonomous SIR model, analyze the properties of its timecontinuous formulation, and develop an implicit numerical solution algorithm where the main properties of the time-continuous variant are conserved. More precisely, our main contributions can be summarized as follows: 1) At first, we propose a modified time-continuous SIR model with time-varying transmission and recovery rates; 2) Secondly, we conclude well-posedness of our time-continuous problem formulation.
This includes global existence in time, global uniqueness in time, based on an inductive application of Banach's fixed point theorem, and continuous dependence on initial conditions and time-varying rates; 3) In case of the time-discrete implicit model, we provide unique solvability, monotonicity properties, and an upper error bound between the solution of the implicit time-discrete problem formulation and the solution of the time-continuous problem formulation; 4) Finally, we compare our model predictions with real-world data regarding the spread of COVID-19 from different countries. Data have been collected by John Hopkins University. Conclusively, we summarize our results and give a brief outlook on possible extensions.

Time-continuous SIR model
In this section, we portray the time-continuous SIR model and its properties.

Mathematical background material
Here, we recall Lipschitz continuity of a function on Euclidean spaces. Definition 1 ([39,Sect. 3.2]) Let d 1 , d 2 ∈ N. If S ⊂ R d 1 , a defined function F : S − → R d 2 is called Lipschitz continuous on S if there exists a nonnegative constant L ≥ 0 such that holds for all x, y ∈ S. Here, · denotes a suitable norm on the corresponding Euclidean space.
We shall call F locally Lipschitz continuous if for every point x 0 ∈ U there exists a neighborhood V of x 0 such that the restriction of F to V is Lipschitz continuous on V .
In a more general framework, we consider a nonlinear initial value problem ⎧ ⎨ ⎩ z (t) = G(t, z(t)), where we define our solution vector z(t) = (x 1 (t), . . . , x n (t)) T , our vectorial function G(t, z(t)) = (g 1 (t, z(t)), . . . , g n (t, z(t))) T , and our initial condition z 0 ∈ R n . To conclude global existence in time, we can apply the following theorem that is a direct consequence of Grönwall's lemma.
holds for all z(t) ∈ R n , then the solution of the initial value problem (2) exists for all time t ∈ R and, moreover, it holds for all t ∈ R.
To prove global uniqueness in time, we need Banach's fixed point theorem since fixed point theorems have been successfully applied as a versatile tool in different areas of mathematics [40,41].
Finally, since we want to provide continuous dependence on initial conditions and other data, we need the following inequality named after Gronwall and Bellman.
holds for all t ∈ I, then we obtain for all t ∈ I.

Continuous problem formulation
At first, let us state the model's assumptions [16,17]: 1) The population's size N is fixed over time, i.e., N(t) = N for all t ∈ [0, ∞); 2) We divide a population into three homogeneous subgroups, namely susceptible people (S), infectious people (I), and recovered people (R). We can clearly assign every individual to exactly one subgroup. Hence, we obtain for all t ∈ [0, ∞); 3) Additionally, no births and deaths occur; We briefly comment on our choice of time-varying transmission rates. The choice of timedependent transmission rates is possible because countermeasures such as lock-downs, social distancing, or other political actions such as curfews reduce possible contacts between susceptible and infectious people. In addition to that, time-varying recovery rates might also be an interesting choice due to different medical treatments over the course of new epidemics such as COVID-19.
Furthermore, we exclude age structures or spatial relationships from our time-continuous model [16,19]. For abbreviation, we write f (t) := df (t) dt . Our equations of the timecontinuous SIR model read as follows: with initial conditions S(0) = S 1 > 0, I(0) = I 1 > 0, and R(0) = R 1 ≥ 0. We portray a chart of the flow between the different three subgroups in Fig. 1. Obviously, the equation is valid, and hence the first assumption is fulfilled.

Lemma 1 Each solution function of (8) is bounded below by zero.
Proof Here, we split the proof into three parts. 1) We first consider S (t) = -α(t) · I(t)·S(t) N . Separation of variables leads to Integration yields and this implies Hence, it holds S(t) ≥ 0 for all t ≥ 0.
2) Let us continue with I (t) = α(t) · I(t)·S(t) N β(t) · I(t). Separation of variables gives us and, applying the same argument as in our first step, we conclude This yields I(t) ≥ 0 for all t ≥ 0.
3) Finally, since R (t) = β(t) · I(t) holds, we clearly obtain and R(t) ≥ 0 for all t ≥ 0 is valid. This completes our proof.
Since N = S(t) + I(t) + R(t) holds by our first assumption, we can finally state our boundedness theorem.

Theorem 4
For all solution functions of (8), we have:

Global existence in time
In contrast to many other works, we formulate a theorem regarding global existence in time of (8) based on Theorem 1. For abbreviation, we introduce the supremum norm on an arbitrary time interval [a, b] for a continuous function f : [a, b] − → R. A similar definition holds for vector-valued functions. In our case, we consider [0, ∞) in general. Now, we show that a possible solution to (8) exists for all times t ≥ 0.

Theorem 5
The nonlinear first order ordinary differential equation system (8) has at least one solution which exists for all t ≥ 0.
Proof By defining z(t) = (S(t), I(t), R(t)) T , we can set Clearly, G is Lipschitz continuous. By considering the supremum norm on our Euclidean space and applying the triangle inequality, we get from (9) by the boundedness of our solution functions and the boundedness of our timevarying transmission and recovery rates. Thus, all our assumptions of Theorem 1 are fulfilled and our proof is complete.

Global uniqueness in time
We present our proof of global uniqueness in time based on an inductive application of Banach's fixed point theorem, i.e., that the initial value problem (8) is uniquely solvable for all times t ≥ 0.

Theorem 6
The nonlinear first order ordinary differential equation system (8) has a unique solution that exists for all t ≥ 0.
Proof 1) Let us first consider the time interval [0, τ ] where we have to choose τ accordingly such that Banach's fixed point theorem is applicable.
2) We need one brief lemma. Let x 1 , x 2 , y 1 , y 2 ∈ R be arbitrary. By zero addition and application of the triangle inequality, we obtain by our inequality of the second step and application of the triangle inequality. 5) Furthermore, we conclude that holds. 6) Summarizing the previous steps, we obtain If we choose τ := 1 2·(2·α max +β max ) , this implies and hence we conclude the uniqueness of solution on the time interval [0, τ ]. 7) Inductively, we see that we can derive this contraction property on all time intervals [k · τ , (k + 1) · τ ] for all k ∈ N 0 by choosing k · τ as our starting point and for our initial conditions. Henceforth, our proof of global uniqueness in time is complete.

Continuous dependence on initial conditions and time-varying rates
Here, we consider the perturbed initial value problems with initial conditions S a (0) = S a,1 > 0, I a (0) = I a,1 > 0, R a (0) = R a,1 ≥ 0 and are different time-varying transmission and recovery rates. Now, we prove that small perturbations in initial conditions or small differences in time-varying transmission or recovery rates lead to small differences in the solutions on short time intervals [0, T]. This fact is summarized in the following theorem.
) T be the solutions of (10) and (11). Define the function and the constant We see that holds for arbitrary t ∈ [0, T] with given T ≥ 0.
Proof 1) Let us first mention that we often use the inequality for arbitrary x 1 , x 2 , y 1 , y 2 ∈ R as proven in Theorem 6. Additionally, we see that 2) At first, we obtain the inequality for arbitrary t ∈ [0, T] by application of the triangle inequality, boundedness of our all functions, and the inequality of our first step.
3) Secondly, we have to estimate |I a (t) -I b (t)|. We see that holds for arbitrary t ∈ [0, T]. The summand I can be estimated in the previous step. This For the third summand II, we observe that is valid for arbitrary t. Summarizing these results, we obtain

5) Finally, we obtain
Since all the assumptions of Theorem 3 are fulfilled, we see that holds for arbitrary t ∈ [0, T], which finishes our proof.

Monotonicity properties and long-time behavior
We now investigate the long-time behavior of solution, some monotonicity properties and summarize our results in the following theorem.

Theorem 8
We get: 1) S is monotonically decreasing, and there exists a number S ≥ 0 such that lim t→∞ S(t) = S . It even holds S > 0; 2) R is monotonically increasing, and there exists a number R ≥ 0 such that lim t→∞ R(t) = R ; 3) I is Lebesgue-integrable on [0, ∞) and lim t→∞ I(t) = 0 for all solution functions of (8).
is monotonically decreasing and bounded below by zero. This implies the existence of S ≥ 0 such that lim t→∞ S(t) = S . Additionally, by considering S (t) R (t) for t > 0, we obtain and separation of variables implies Integration yields is monotonically increasing and bounded above by N . This yields the are bounded and nonnegative. Therefore, we obtain that I is Lebesgue-integrable on [0, ∞) and lim t→∞ I(t) = 0. This finishes our proof.

Calculation of the time-continuous basic reproduction number
In our nonautonomous SIR model, the time-dependent basic reproduction number can be defined by which is similar to the constant case [17,45,46]. (13) is well defined.

Lemma 2 Equation
Proof We observe that is valid for all t ≥ 0. This proves our claim.

Time-discrete implicit SIR model
In this section, we examine time-discrete versions of the given time-continuous SIR model (8).

Discussion of formulations
Here, we only state a fully explicit scheme (14) and a fully implicit scheme of the time-continuous SIR model (8) for all j ∈ {1, . . . , M -1}. However, the fully explicit scheme (14) simply reduces to a linear system, while the fully implicit scheme (15) maintains the nonlinear structure of the continuous problem formulation (8). For this reason, we investigate this fully implicit scheme in the following.

Monotonicity properties and long-time behavior
We show that many of the continuous properties from Theorems 4 and 8 even translate to the time-discrete implicit scheme (17). (17), we have:

Theorem 10 For our time-discrete implicit solution scheme
Proof 1) It holds I j ≥ 0 due to (22) and I j ≤ N due to (16) for all j ∈ {1, . . . , M}.
2) By our first property and due to (16), we have the inequality 0 ≤ S j ≤ N for all j ∈ {1, . . . , M}. Again by our first property, we obtain 3) By our first property and due to (16), we obtain the inequality 0 ≤ R j ≤ N for all j ∈ {1, . . . , M}. Again by our first property, we conclude 4) Since {R j } j∈N is monotonically increasing and bounded above by the total population size N , there exists a nonnegative constant R such that lim j→∞ R j = R . Furthermore, it holds R j+1 -R j = β j+1 · j+1 · I j+1 , which yields This implies lim j→∞ I j = 0 and completes our assertion's proof.

Error analysis
Now, we want to provide an upper bound for error propagation. Before proving this statement, we need to formulate some assumptions for our convergence analysis. We summarize them in the following list: nonnegative constants α min , α max , β min , β max such that 0 < α min ≤ α(t) ≤ α max < 1 and 0 < β min ≤ β(t) ≤ β max < 1 hold for all t ∈ [0, T]; 6) Choose p < min{ 1 4·(α max +β max ) , 1} ≤ 1 for all p ∈ N and set := max p∈N p . Under these conditions, we obtain the following theorem where we adapt ideas from the error analysis of an explicit-implicit solution algorithm as presented in [20].

Theorem 11 If the aforementioned assumptions are fulfilled, the difference between the solution of the time-continuous problem formulation (8) and the solution of the time-discrete problem formulation (17) fulfills
Proof We briefly describe our strategy first because this proof is technical. We begin with an estimation of local errors between time-continuous and time-discrete solutions. Afterwards, we consider error propagation in time. Conclusively, we investigate the cumulation of these errors. Time-discrete solutions are written as S p at time t p and time-continuous solutions as S(t p ) at the same time. 1) For examination of local errors, we assume that Here, we consider solely one time step and denote corresponding time-discrete solutions by S p+1 , I p+1 , and R p+1 .

1.1) It first holds
, and this implies :=II S,1 by application of the triangle inequality. We estimate these terms separately. For I S,1 , we obtain and the mean value theorem of calculus implies the existence of ξ S,1 ∈ (t p , t p+1 ) such that holds. This yields For II S,1 , we see that :=III S,1 is valid by the definition of S (t), boundedness of our solution functions, and application of the triangle inequality. By plugging into III S,1 , we obtain by the boundedness of our solution functions and application of the triangle inequality. By the mean value theorem, there exists ξ α,1 ∈ [t p , t p+1 ] such that holds. Hence, an additional application of the triangle inequality and boundedness of the solution functions yields Plugging (25) into II S,1 , we conclude that holds. Combining (24) and (26), we infer that

1.2) We observe that
is valid. Hence, it follows by application of the triangle inequality and with similar arguments as provided in the previous step. As holds, we further obtain by the boundedness of the solution functions and application of the triangle inequality. By we obtain by the boundedness of the solution functions, the mean value theorem of calculus, and application of the triangle inequality. Additionally, it holds by application of the mean value theorem of calculus. Combining (28) and (29) and plugging these results into Plugging this inequality into

1.3) We see that
holds. This implies By applying R (t p ) = β p · I(t p ) and Plugging this inequality into 1.4) Define C loc := max{C loc,S ; C loc,I ; C loc,R }. It holds for local errors on time intervals [t p , t p+1 ].
2) In reality, (t p , S p ) T , (t p , I p ) T , and (t p , R p ) T do not exactly lie on the graph of the timecontinuous solution. Therefore, we must examine how procedural errors such as S p -S(t p ), I p -I(t p ) or R p -R(t p ) propagate to the (p + 1)th time step. These investigations are carried out in step 2) and in step 3). By (9), we see that G(t p+1 , z p+1 ) -G t p+1 , z(t p+1 ) holds, and this implies 1 1 -2 · (α max + β max ) · j = C loc · 2 · ( 1 1-2·(α max +β max )· ) p -1 ( which finishes our proof of (23).

Calculation of the time-discrete basic reproduction number
In our nonautonomous time-discrete SIR model, the time-dependent basic reproduction number can be defined by for arbitrary k ∈ {1, . . . , M}, which is similar to the case of constant transmission and recovery rates [17,45].

Lemma 3 Equation (35) is well defined.
Proof This proof is identical to Lemma 2.

Numerical algorithm
We are now able to give a brief description of our numerical algorithm to solve the timediscrete implicit solution scheme (17). Here, we summarize our inputs, our computational steps, and our algorithmic outputs. We sketch the resulting algorithm in Table 1.

Numerical examples with discussion
We apply our time-discrete implicit SIR solution scheme (22) from Table 1 to available data regarding the spread of COVID-19 in Germany and Iran from John Hopkins University [1,2]. These countries are chosen because they update confirmed, dead, and estimated recovered cases on a regular basis. In Table 2, we summarize projected population sizes for 2019 from the United Nations [47].  Step 1: Step 2: -ComputeI j+1 by (22), (21) and (20)   At first, we consider the example of Germany in detail. We thoroughly describe our approach to check our model's validity. We only give some simple parameter estimation techniques and vary user-chosen time-dependent parameter functions. Since inverse problems are an active field of research, we refer the readers to works by Bock and Schittkowski for more sophisticated parameter estimation techniques in dynamical systems [48,49]. Our work also focuses on usefulness in possibly describing real-world data. Afterwards, we state some computational results for data from Iran.

Description of our approach by the example of Germany
for all j ∈ {1, . . . , M}. The unprocessed and processed data for Germany are depicted in Figs. 2 and 3. On the one hand, it can be clearly seen in Fig. 2 that both sequences of cumulative infected and cumulative recovered people are monotonically increasing for our unprocessed data. On the other hand, we notice in Fig. 3 that the behavior changes for our processed data.

Calculation of time-varying transmission and recovery rates from real-world data
Here, we present an algorithm to calculate our time-varying transmission and recovery rates based on our numerical algorithm (15) for all j ∈ {1, . . . , M -1}. We rely on the first Summarizing our results, we obtain and and a short algorithmic summary can be found in Table 3.
The time-varying transmission rate from real-world data for Germany is presented in Fig. 4. Clearly, the transmission rate decreases due to countermeasures such as local lock- Step 1: Step 2: -F orallj ∈ {2, . . . , M}, compute α j and β j according to (37) and (38)  The time-varying recovery rate from real-world data for Germany is depicted in Fig. 5. At the early stage of an epidemic, there are possible just few recoveries, thus this rate is relatively small. After some time, this situation changes as more people defeat the disease and recover. The rate seems to be constant with heavy variations due to the test capacity. Additionally, there are unknown cases because these people might have a mild disease course.

Calculation of time-dependent basic reproduction number from real-world data
Now, the time-dependent basic reproduction number R 0 (t j ) is readily computed by (35). Our computational results from this approach are portrayed in Fig. 6. Since there are only few recovered people at the beginning of disease, our computations provide high numerical basic reproduction number. We observe that the computational basic reproduction number is monotonically decreasing in spring due to political countermeasures and so- cial distancing. However, the graph shows that the computation basic reproduction number rises in summer because contacts between people rose. Variations are seen for similar reasons as mentioned for our time-dependent transmission and recovery rates.

Parameter estimation-a simple least-squares approach
We only briefly sketch the parameter identification problem because it is an inverse problem [50,51]. A deep discussion is beyond the scope of this paper and would be a topic of Step 1: -Compute β by (43) Step 2: -Compute γ 2 by (45) Step 3: -Compute γ 1 by (44) Step 4: -Compute α 1 and α 2 according to transformation (41) Outputs: -Parameters α 1 , α 2 , and β for our parametric rates (39) and (40) By the Hölder inequality, we obtain with equality only in the case that all t j are equal. Since we have a strictly increasing time sequence, the second determinant of the upper-left sub-matrices is also positive. Finally, the determinant of the full matrix is positive as well. Hence, our cost function J is strictly convex. Conclusively, it possesses a unique local minimizer by [52, Theorem 2.4].
2) By strict convexity, we infer that the unique local minimizer is also the unique global minimizer of our cost function J by [52, Theorem 2.5].
We summarize our algorithmic approach for parameter estimation of our time-varying transmission and recovery rates in Table 4.

Results for our parameter estimation approach using German data for short-term
predictions In Fig. 7, we see that the assumption of exponentially decaying time-dependent transmission rates is acceptable at the beginning of spreading disease with respect to German data. Due to short-term prediction, we notice that the constant recovery rate is underestimated in Fig. 8. Conclusively, both assumptions seem to be acceptable at the first weeks of a spreading disease. Computational results for two models on the time interval [25,62] are depicted in Figs. 9-12. Figures 9-12 indicate that sensitivity of parameters is really an issue in epidemiological models. This is in accordance with Theorem 7. These results also imply that an exponentially decaying transmission rate is an acceptable choice at the beginning of spreading disease. Figures 4 and 5 indicate that constant transmission and recovery rates are reasonable assumption at later stages. Computational results can be found in Figs. 13 and 14. We also notice that the number of infected people rises in summer, possibly due to more social contacts. This could eventually be regarded as the beginning of a 'second wave' .

Results for our parametric approach using German data from May to September
For future investigations, it might be interesting to use more complex transmission rates or piecewise defined functions with switching points, see e.g. [49].  Recovery rates from real-world data and from parameter estimation for short-term prediction of German data with t 1 = 0 (1 March 2020) and t M = 62 (2 May 2020). The first estimated recovery rate reads β ≈ 0.04403 for the mean value on the full interval. The second estimated recovery rate reads β ≈ 0.063 as the mean value on the time interval [25,62] because the fluctuations in β arise while there is no recovery rate estimation possible for the first days

Computational short-time results for data from Iran
Real-world data from Iran and short-term computational results for COVID-19 data from Iran can be found in Figs. 15-23. Figure 17 again supports the assumption of an exponen-  These results indicate that time-dependent transmission rates are a necessary addition to the classical SIR model. Alternatively, models with fractional derivatives could be considered [53,54].

Conclusion and outlook
We established certain properties such as well-posedness of the solution of our timecontinuous SIR model in Sect. 2. Fortunately, we were able to transfer many properties of the time-continuous model to our time-discrete implicit SIR model in Sect. 3. These include unique solvability and monotonicity properties. In contrast to many other works  Computational results for our model of recovered people in Germany for user-chosen α 1 ≈ 0.040, α 2 ≈ 0.0, β ≈ 0.075 for short-time simulation on t 1 = 61 (1 May 2020) and t M = 184 (1 September 2020) with real-world data as initial conditions mentioned in Sect. 1, we avoid an explicit forward model, but we could transform our implicit scheme to an easily solvable scheme. Thus, this makes our proposed scheme an attractive first prediction choice. In addition to that, we showed that our numerical scheme possesses an upper error bound. Regarding our computational results, we see that our parametrization is an appropriate fit for first forecasts considering the first wave of a spreading virus. Since these transmission rates are monotonically decreasing, we, however, remark that we will need to use another parametrization if we want to model diseases with seasonal behavior [55]. Regarding our chosen examples, we get reasonable results. Additionally, we observe that our theoretical findings regarding monotonicity of recovered people from The-  As depicted in Sect. 4, the inverse problem definitely needs further investigation. This is a topic of its own interest [56][57][58][59] since we need tools from different mathematical disciplines. As future research directions, extensions to further epidemiological forward models should be considered as we surely need more tools to predict the impact of upcoming  epidemics. One can also consider delayed-differential or stochastic variants of our SIR model or modifications and extensions [23,27] because, from a biological point of view, we often have to face integration of incubation times in epidemic models.