Interpreting a SIR Graph

Using Math to Predict the Spread of Infectious Diseases

Many diseases spread from person to person. Some, such as the common cold, are a seasonal nuisance, but others, such as plague, flu, smallpox, typhus and Ebola, have killed thousands or even millions of people. Yes, millions. During the 1918-19 flu pandemic more than 50 million people died. Smallpox killed nearly half a billion people in the centuries before it was eradicated in 1979, and the plagues that raced through medieval Europe killed an estimated 25 million people.

There has been no major pandemic since 1918-19, but scientists are busy studying how an infectious disease spreads so that they will be prepared to contain the next big outbreak. Some diseases (such as malaria) spread by mosquito bites or other animal-to-human contacts, but Ebola and other potentially catastrophic diseases spread through transfer of germs from one person to another. Some contacts – and hence exchanges of germs – occur by touching bodily fluids such as sweat, tears, saliva, blood, semen and urine. A more efficient spread occurs by coughing and sneezing, which propel germs short distances through the air. One-at-a-time, person-to-person contacts might seem a slow way to propagate a disease, but with all the contacts that occur in a large city in a single day the spread can be surprisingly rapid.

The way to slow and ultimately stop an infectious disease outbreak is to get people out of contact with each other. How can that happen?

We already do it routinely when we get a bad cold. We stay at home, which allows rest to speed up our recovery and reduces the number of other people who are exposed to us. In trying to reduce the spread of the 2014 enterovirus outbreak, some schools closed so that students would not be packed in classrooms where the disease could spread easily. Indeed, the 1918 flu killed millions of soldiers, partly because they were near each other in barracks and cramped hospital wards.

Traveling in any public transportation vehicle such as a bus, subway, train or airplane concentrates people into small spaces, making airborne transmission of disease easier, especially if the air is re-circulated.

An important way to minimize the spread of a disease within a town is to have people stay home. With the Internet, students and adults can work at home so that learning and productivity don’t have to stop because of the fear of getting sick. But having people work at home also reduces the number of customers at cafes and shops, so any disease outbreak is bad for business as well as for personal health.

The spread of disease from city to city and country to country depends on transportation. Marco Polo and other traders took months to carry diseases across Europe to Asia and bring new ones back, and the explorers of the early 1500s travelled for weeks to months to bring smallpox to the New World, where it is estimated to have killed off as many as 90% of all Native Americans. Today, the spread of disease across the globe takes only hours or days because it is largely tied to airplane travel. You may remember that in 2014 some countries refused to accept airline passengers who came from parts of Africa where Ebola was rampant.

In addition to these demographic factors, the virulence of the disease itself has to be included in models of disease spread. If a disease makes a lot of people sick, but very few die, as is common with mild winter flu, then the main concern is how the flu will affect work and school – how the economy will be impacted by missed days. But for a flu that is deadly, the stakes are much higher, and modeling the disease spread may predict how many hospital beds will be needed, how long the epidemic will last, how many people will become sick, and how many will die. The models can also evaluate the effectiveness of actions that can be taken to interrupt the spread – for example, vaccinations, school closings, and transportation restrictions.

Interpreting a SIR Model Graph

The SIR model is a widely used simple mathematical analysis that provides great insight into an infectious disease outbreak. Basically, each person included in the model falls in one of three categories, S = susceptible, I = infected, or R = recovered. Over time every person moves from S to I to R, with some going to the additional category of D = death. Three parameters determine how fast and how many people move from category to category: infection rate, recovery rate, and death rate.
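The bookkeeping described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming one update per day and a simple rule in which new infections are proportional to contacts between susceptible and infected people. The function name simulate_sir and the three rate values are hypothetical choices for illustration; they are not the parameters behind the figure discussed below.

```python
def simulate_sir(population=1000, initial_infected=2, infection_rate=0.5,
                 recovery_rate=0.1, death_rate=0.002, days=100):
    """Return a list of (S, I, R, D) tuples, one per day, using daily steps.

    All rate values are illustrative assumptions, not fitted to any disease.
    """
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # infected
    r = 0.0                                   # recovered
    d = 0.0                                   # died (cumulative)
    history = [(s, i, r, d)]
    for _ in range(days):
        # New infections depend on how many S-I contacts are possible.
        new_infections = infection_rate * s * i / population
        # A fixed fraction of infected people recover or die each day.
        new_recoveries = recovery_rate * i
        new_deaths = death_rate * i
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        r += new_recoveries
        d += new_deaths
        history.append((s, i, r, d))
    return history


history = simulate_sir()
for day in (0, 10, 20, 40, 100):
    s, i, r, d = history[day]
    print(f"day {day:3d}: S={s:7.1f} I={i:7.1f} R={r:7.1f} D={d:5.1f}")
```

Because people only move between categories, the four numbers on any day should always add up to the starting population, which is a quick sanity check on a model like this.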

This figure is a graphical output of a typical model. Here is how to interpret it.

The X axis plots time, specifically the number of days since the beginning of the outbreak. The Y axis plots the number of people in each of four categories for each day. The model starts with 1000 total people, two of whom are already infected.

The Blue Line

The rapid decline of the blue line – the number of people who have not yet been infected – indicates that the disease is very contagious, with pretty much every susceptible person being infected by day 26. The line for a less infectious disease would slope more gently to the right.
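To see how the infection rate changes the shape of the susceptible curve, the sketch below runs a simplified daily-step model twice and reports the first day on which fewer than half of the people are still susceptible. It assumes the same kind of update rule as a basic SIR model and ignores deaths for brevity. The function names and the two rates (0.6 and 0.3) are hypothetical values chosen only for comparison.

```python
def susceptible_curve(infection_rate, recovery_rate=0.1, population=1000,
                      initial_infected=2, days=100):
    """Return the number of susceptible people for each day (deaths ignored)."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    curve = [s]
    for _ in range(days):
        new_infections = infection_rate * s * i / population
        s -= new_infections
        i += new_infections - recovery_rate * i
        curve.append(s)
    return curve


def first_day_below(curve, threshold):
    """Return the first day on which the curve is below threshold, or None."""
    for day, value in enumerate(curve):
        if value < threshold:
            return day
    return None


fast = susceptible_curve(infection_rate=0.6)  # assumed: very contagious
slow = susceptible_curve(infection_rate=0.3)  # assumed: less contagious
print("fast: susceptible below 500 on day", first_day_below(fast, 500))
print("slow: susceptible below 500 on day", first_day_below(slow, 500))
```

With the lower rate the susceptible count takes longer to fall, which is the gentler slope described above.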

The Red Line

The red line – the daily number of infected people – is essentially the epi-curve for the disease. It rises rapidly to a maximum of about 700 people on day 26, and then falls more slowly until about day 101, when nearly everyone has recovered. The reason that the number of infected people stops increasing at day 26 and then falls is that by then almost every susceptible person has been infected. Where the blue and red lines cross – at about day 15 – is the first day when more people are in the infected category than the susceptible one.

The slope of the right side of the red line reflects the recovery rate – how long it takes to become cured. The first person recovers on day 4, showing that the average duration of sickness is about 4 days. (On the graph at this scale, it looks like the first recovery occurs on about day 10, but the spreadsheet of actual data shows that the first recovery is on day 4.)
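The peak and the crossing point described above can also be read off the numbers directly instead of from the graph. The following sketch assumes a simplified daily-step model (deaths ignored) with hypothetical rates, so the days it prints will not match the figure; it only shows how such landmarks can be located in a table of daily values.

```python
def run(infection_rate=0.5, recovery_rate=0.1, population=1000,
        initial_infected=2, days=100):
    """Return daily lists of susceptible and infected counts (illustrative rates)."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    susceptible, infected = [s], [i]
    for _ in range(days):
        new_infections = infection_rate * s * i / population
        s -= new_infections
        i += new_infections - recovery_rate * i
        susceptible.append(s)
        infected.append(i)
    return susceptible, infected


susceptible, infected = run()

# Day on which the infected count is largest (top of the red line).
peak_day = max(range(len(infected)), key=lambda day: infected[day])

# First day on which infected people outnumber susceptible ones
# (where the red line crosses above the blue line), or None if it never happens.
crossing_day = next(
    (day for day in range(len(infected)) if infected[day] > susceptible[day]),
    None,
)

print("peak of infections on day", peak_day)
print("infected first outnumber susceptible on day", crossing_day)
```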

The Green Line

The number of people who have been removed from the simulation, typically by recovering, steadily increases, leveling off near day 101 because there are essentially no more infected people who need to recover. The day when recovered people first outnumber susceptible ones is where the blue and green lines cross, at about day 24. Where the green and red lines cross – at about day 30 – is when recovered people outnumber infected ones. The final value of the green line is 980, 20 fewer than the initial population of 1000. What happened to these 20 people?

The Purple Line

The 20 people died. Removed means either recovered or died. A closeup (below) of just the death curve shows that the first death occurred on about day 13, about two weeks after the first two infected people started the outbreak. The purple line, Cumm Died, differs from the others in being the cumulative number of people who have died, not just the number who died on each day.
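The difference between a daily count and a cumulative count can be shown with a running total. In the sketch below, the daily deaths come from a simple daily-step model with hypothetical rates (they are assumptions, not the values behind the figure), and itertools.accumulate from the Python standard library turns them into the cumulative series.

```python
from itertools import accumulate


def daily_deaths(infection_rate=0.5, recovery_rate=0.1, death_rate=0.002,
                 population=1000, initial_infected=2, days=100):
    """Return the number of deaths during each daily step (illustrative rates)."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    deaths = []
    for _ in range(days):
        new_infections = infection_rate * s * i / population
        new_recoveries = recovery_rate * i
        new_deaths = death_rate * i
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        deaths.append(new_deaths)
    return deaths


per_day = daily_deaths()
cumulative = list(accumulate(per_day))  # running total, never decreases

# Entry k of per_day covers the step ending on day k + 1, so count days from 1.
first_death_day = next(
    (day for day, total in enumerate(cumulative, start=1) if total >= 1),
    None,
)

print("cumulative deaths first reach 1 on day", first_death_day)
print("total deaths at the end:", round(cumulative[-1], 1))
```

The daily series rises and falls with the number of infected people, while the cumulative series can only rise and then level off, which is why the purple line has a different shape from the red one.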


A useful description of this SIR model:

An online program that allows easy calculation of the SIR model:

This basic program can be modified to address the effects of vaccines, latency, and herd immunity:


What happens if initial I = 0?

What does it mean that the red line increases so rapidly?

What does it mean that the green line also rises rapidly, but not as rapidly?

What does it mean that the green line reaches nearly to 1,000?