Crunching the Numbers on COVID-19

Every mathematical model of the COVID-19 epidemic is wrong. There is so much uncertainty about the disease, how it spreads, and how effective various preventative measures are that any forecast should be taken with a boulder of salt.

Mathematical models are not crystal balls, but they can be useful tools for thinking about epidemics and for devising strategies to fight them. They are also pretty fun to play with and provide a welcome distraction from the looming spectre of despair.

With these modest goals in mind, I present below a few simple epidemic models.

I am very grateful to my colleague Dr. Amy Hurford, an expert on such models, whose feedback on this article has been enormously helpful.

The Exponential Model

The simplest model for thinking about epidemics goes as follows. If today there are I many people who are infected (capital I stands for infected), and on average each of those people infects b many people over the course of the day, then tomorrow there will be b x I many newly infected people.

Now suppose each infected person infects b people tomorrow too, and the next day, and the next day, and so on. This process describes “exponential growth”. Exponential growth has the characteristic property of doubling at a fixed rate. This doubling behaviour is nicely illustrated here.

For example, in many countries, in the absence of preventative measures, the number of confirmed COVID-19 cases seems to double every 3 days or so (which implies that b = 0.26, see footnote*). NL had 141 confirmed cases yesterday, so exponential growth at this rate would mean

2 x 141 = 282 cases on April 2nd,

2 x 282 = 564 cases on April 5th,

2 x 564 = 1128 cases on April 8th,

and so on (to be clear: this is not a prediction, only an illustration of the model). This rapid doubling rate means that the epidemic can get out of hand very quickly, and is a big reason why this virus is so dangerous.
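The arithmetic above can be reproduced with a short sketch. The function name `project_cases` is made up for this illustration. The starting count of 141 and the three-day doubling time come from the example above, and like that example this is not a prediction.

```python
# Daily growth rate implied by a doubling time of three days (see footnote).
doubling_time_days = 3
b = 2 ** (1 / doubling_time_days) - 1  # about 0.26


def project_cases(initial_cases, b, days):
    """Return the case count for each day from 0 to `days` under the exponential model."""
    cases = [float(initial_cases)]
    for _ in range(days):
        # Each day, every infected person adds b newly infected people on average.
        cases.append(cases[-1] * (1 + b))
    return cases


projection = project_cases(141, b, 9)
for day in (0, 3, 6, 9):
    print(f"day {day}: {projection[day]:.0f} cases")
```

Days 3, 6 and 9 correspond to the three dates listed above, and the counts double at each step.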

Exponential growth is considered a good model for epidemics in the early stages, before many people are infected and before preventative measures have been taken, but it doesn’t make sense in the long run because you eventually run out of people to infect. To understand how epidemics behave in the long run, you need a more sophisticated model.

Here is a handy app that measures the growth rate of infections for various countries and does short term forecasts using the exponential model.

Another property of exponential growth is that it looks like a straight line when plotted on a log scale graph like this one. The growth rate (or doubling time) can be read off from the slope of the line. If the slope of the curve declines over time, that means the growth rate is declining, which is what we are all hoping for. A nice introduction to log scale graphs can be found here.
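Reading the growth rate off the slope can be sketched in a few lines. This is only an illustration: the case counts below are made up so that they double every three days, the function name is hypothetical, and the fit uses the natural logarithm (a graph drawn with base-10 logs has a slope that differs only by a constant factor).

```python
import math


def doubling_time_from_counts(counts):
    """Fit a straight line to log(counts) by least squares.

    Returns (daily growth rate, doubling time in days).
    Assumes one count per day and counts that are growing.
    """
    days = list(range(len(counts)))
    logs = [math.log(c) for c in counts]
    n = len(counts)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs)) / sum(
        (x - mean_x) ** 2 for x in days
    )
    growth_rate = math.exp(slope) - 1      # the b of the exponential model
    doubling_time = math.log(2) / slope    # days needed for the count to double
    return growth_rate, doubling_time


# Made-up counts that double every three days, for illustration only.
counts = [100 * 2 ** (day / 3) for day in range(10)]
rate, doubling_time = doubling_time_from_counts(counts)
print(f"growth rate per day: {rate:.2f}, doubling time: {doubling_time:.1f} days")
```

With real data the points will not sit exactly on a line, so the fitted slope is only an average over the period chosen.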

Farr’s Law

Historically, one of the earliest models of epidemic disease was Farr’s Law introduced in the mid-19th century. Farr’s Law is designed so that the number of new infections per day rises and falls following a bell curve.
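As a rough sketch of what a bell curve of new infections looks like, the snippet below uses a Gaussian shape. The peak day, peak size and width are invented numbers chosen only to show the symmetric rise and fall. They are not estimates for any real outbreak, and the function name is hypothetical.

```python
import math


def bell_curve_new_cases(day, peak_day, peak_cases, width_days):
    """New infections on a given day, following a Gaussian bell curve."""
    return peak_cases * math.exp(-((day - peak_day) ** 2) / (2 * width_days ** 2))


# Hypothetical outbreak: peaks on day 30 with 500 new cases per day.
daily = [bell_curve_new_cases(day, 30, 500, 10) for day in range(61)]
for day in (10, 20, 30, 40, 50):
    print(f"day {day}: {daily[day]:.0f} new cases")
```

The curve is symmetric about its peak, so in this sketch day 20 and day 40 show the same number of new cases.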

Farr’s Law is not often used by epidemiologists today because it is simplistic and is considered unreliable for making long term predictions. However, it can be a useful rule of thumb for modelling outbreaks of new diseases that are not yet well understood. Here is an example of such a projection for the COVID-19 epidemic in Ontario.

The SIR Model

This model tracks three different subsets of the population:

S is the number of susceptible people,
I is the number of infected people, and
R is the number of recovered (or dead) people.

Adding these numbers together gives the total population N:

N = S+I+R.

N remains constant in this model.

Yesterday in NL for example, we know R was at least 7, I was at least 141, and S was probably close to the total population of N=520,000 (though it is possible these values are way off the mark because most people haven’t been tested).

The SIR model was first developed in the 1920s and remains popular as a basic epidemic model today. In this model the number of new infections per day is proportional to I (the number of infected people) times the fraction S/N (which is the share of the population that is susceptible). This makes sense because each new infection requires “contact” between an infected person and a susceptible person, and we assume the share of an infected person’s contacts who are susceptible is the same as the share of the general population that is susceptible.

More precisely, the number of new infections per day should equal b x I x (S/N), where b is the average number of people a virus carrier would infect in a day if everyone they interacted with were susceptible to infection. The goal of many public health interventions is to make b smaller. This can be done by reducing the number of contacts between infected and susceptible people (social distancing, isolation) or by reducing the likelihood that such contact will transmit the disease (hand washing, not touching your face). There is also the possibility of reducing the share of susceptible people, S/N, say with a vaccine (which researchers are working on) or by spreading the infection in a controlled way to produce herd immunity (which is guiding action in Sweden).

In the early stages of an epidemic almost the entire population is susceptible, which means S/N is nearly one. In this case the number of new infections per day, b x I x (S/N), is pretty close to b x I. This becomes the rate of growth used in the exponential growth model, which explains why infections tend to grow exponentially in the early stages of an epidemic.

The SIR model (unlike the exponential model) also assumes that each day some fixed percentage of infected people recover (or die). For COVID-19, you might expect one in eleven infected people to recover each day, since the average duration of the disease is believed to be 11 days. Lastly, the model assumes that people who recover from the disease are immune thereafter (something not yet proven for COVID-19).
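The rules above can be put together in a minimal day-by-day sketch. It steps forward one day at a time rather than solving differential equations, which is a simplification. The population and starting counts are the NL figures quoted earlier, the recovery rate is the one in eleven mentioned above, and b = 0.26 is borrowed from the exponential example purely for illustration. The function name is made up, and none of this is a forecast.

```python
def simulate_sir(population, infected, recovered, b, recovery_rate, days):
    """Step the SIR model forward one day at a time.

    Returns a list of (S, I, R) tuples, one per day, starting with day 0.
    """
    s = float(population - infected - recovered)
    i = float(infected)
    r = float(recovered)
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = b * i * (s / population)  # b x I x (S/N)
        new_recoveries = recovery_rate * i         # fixed share of I recovers (or dies)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative values only: N, I and R from the NL figures in the text,
# b assumed equal to the exponential example, recovery rate of one in eleven.
history = simulate_sir(520_000, 141, 7, b=0.26, recovery_rate=1 / 11, days=365)
peak_day, (_, peak_infected, _) = max(enumerate(history), key=lambda item: item[1][1])
print(f"peak of about {peak_infected:,.0f} infected on day {peak_day}")
```

Because people only move between compartments, S + I + R stays equal to N on every day. Rerunning the sketch with a smaller b shows how the peak shrinks and moves later.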

The SIR model is most conveniently described using differential equations, and you can find a more detailed description here. Researchers at MUN created an online calculator you can play with that applies the SIR model to the COVID-19 outbreak in NL.

Compartmental Models

The basic idea behind the SIR model is to compartmentalize the population into three compartments—susceptible, infectious, recovered—and then introduce simple rules describing the rate at which people transition from one compartment to another. One can build more sophisticated models by introducing more compartments and different transition rules.

For example, we might divide infectious people into symptomatic and asymptomatic people, since the evidence is that symptomatic people infect others at a greater rate. We might also break compartments up by age group since the disease affects old people differently than young people. Other models might include a compartment keeping track of how many people have died from the disease, and take into account limited hospital resources. Once you have the basic idea for how these models work, the possibilities are endless.
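To give a feel for how an extra compartment changes the bookkeeping, here is a sketch that splits infectious people into symptomatic and asymptomatic groups with different infection rates. Every number in it (the two rates, the share of new infections that become symptomatic, the starting counts) is a hypothetical placeholder, and the function name is made up. It is not taken from any of the models mentioned in this article.

```python
def simulate_split_infectious(population, symptomatic, asymptomatic, b_sym, b_asym,
                              share_symptomatic, recovery_rate, days):
    """Daily-step model with two infectious compartments.

    Returns a list of (S, I_sym, I_asym, R) tuples, one per day.
    """
    s = float(population - symptomatic - asymptomatic)
    i_sym = float(symptomatic)
    i_asym = float(asymptomatic)
    r = 0.0
    history = [(s, i_sym, i_asym, r)]
    for _ in range(days):
        # Both groups infect susceptible people, but at different rates.
        new_infections = (b_sym * i_sym + b_asym * i_asym) * (s / population)
        recovered_sym = recovery_rate * i_sym
        recovered_asym = recovery_rate * i_asym
        s -= new_infections
        # New infections are divided between the two infectious compartments.
        i_sym += share_symptomatic * new_infections - recovered_sym
        i_asym += (1 - share_symptomatic) * new_infections - recovered_asym
        r += recovered_sym + recovered_asym
        history.append((s, i_sym, i_asym, r))
    return history


# Hypothetical parameters, for illustration only.
split_history = simulate_split_infectious(
    520_000, symptomatic=100, asymptomatic=41,
    b_sym=0.3, b_asym=0.15, share_symptomatic=0.6,
    recovery_rate=1 / 11, days=200,
)
s, i_sym, i_asym, r = split_history[-1]
print(f"after 200 days: {r:,.0f} recovered, {s:,.0f} still susceptible")
```

Age groups, deaths or hospital capacity could be added the same way: one more variable per compartment and one more rule per transition.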

Here are some compartmental models of the COVID-19 epidemic that I found interesting:

This model created by a group at the University of Basel in Switzerland is nice because it includes demographic and hospital data from Canadian provinces, so you do not have to input that data by hand.

Here is another model created and maintained by a group at Harvard and the University of Pennsylvania.

What I find most striking about these models is how sensitive outcomes are to small changes in assumptions. For example, in the Swiss model, the difference between “strong mitigation” and “moderate mitigation” could be the difference between a handful of lives lost and thousands of lives lost in NL. The calculators also give a sense of how quickly things can go awry if we aren’t careful.

It seems to me that if we consider how sensitive these models are to changes in assumptions, how uncertain we are about the disease, and the calamitous consequences of doing too little, then we really ought to err on the side of caution in dealing with this epidemic, at least until we are confident that we know what works. As Dr. Theresa Tam said a few days ago, we should have a better idea how well current measures are working later this week.

Stay safe!

*In the exponential model, the number of infected people is multiplied by (1 + b) each day. If the doubling period is three days, then (1 + b)^3 = 2, which gives the infection growth rate b = 2^(1/3) - 1 ≈ 0.26. This means that an infected person will infect another person roughly every four days on average. But since the number of infected people grows each day, the doubling period is three days rather than four days.

Chart by Phoenix7777.
