
Mathematicians forecast COVID-19 spread

Researchers from Case Western Reserve, University of Akron, develop algorithm used by Cuyahoga County to predict fluctuations in number of COVID-19 cases

A highly accurate forecast tool created by mathematicians from Case Western Reserve University and The University of Akron is being used by Cuyahoga County health officials to gauge the Northeast Ohio spread of the virus that causes COVID-19.

“Our predictions of the number of new daily infections are quite accurate for two to three weeks ahead, and have been for a few months,” said Daniela Calvetti, the James Wood Williamson Professor of Mathematics, Applied Mathematics and Statistics at Case Western Reserve. “This level of accuracy has been able to help institutions plan for the rise—and lately the downturn—in cases instead of reacting to them as they happen.”

Calvetti is one of the data scientists in an informal group of researchers from the two institutions who developed the model. Their early findings were first made public in mid-June at a news conference with the Cuyahoga County Board of Health, which continues to use the data for its forecasts, said group member Johnie Rose, another Case Western Reserve scientist who has worked with the Board of Health for several years on other data projects.

An example of the graphs presented by the group to Cuyahoga County.
An example of how the county-by-county projections are presented.

For now, the group provides regular forecasts to the Board of Health and other institutions like hospitals, but Calvetti and Rose said they plan to eventually post them online for public view. Calvetti has posted some slides illustrating the research on her own site.

“People are hungry for these numbers right now to inform their staffing decisions at hospitals or in other critical services,” said Rose, a preventive medicine and public health physician and epidemiologist from the Center for Community Health Integration at the Case Western Reserve School of Medicine. 

“But the general public might like to see this information to help make decisions—and to illustrate the real-time, real-life importance of social distancing or wearing masks or limiting travel to slow the spread of the virus,” he said.

Data scientists collaborate

The group began collaborating on a variety of COVID-19 research projects in early April, soon after the confirmation of a pandemic scattered scientists and others to various stay-at-home locations, Calvetti said. 

Bayesian analysis

The mathematical underpinnings of the research may seem somewhat abstruse to the average, possibly math-challenged, citizen, but Daniela Calvetti said mathematics and data science fundamentals have also gained a certain popularity in recent months.

She said many people are getting a lesson in what is known as “Bayesian” analysis, referring to the mathematical philosophy and process for quantifying uncertainty in terms of probability.

This type of analysis—anchored by the call to update your prior beliefs with new observed evidence—is named after Thomas Bayes, an 18th-century Presbyterian minister and mathematician. 

“This is the reason why our predictions are better than many others,” she said. “We update our prior assumptions on the basis of the data, and we run several possible scenarios, discarding those that are not in agreement with the data.”
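For readers who want to see the idea in miniature, the following Python sketch illustrates the general approach Calvetti describes: start with a range of prior assumptions, simulate each scenario, and keep only those that agree with the observed data. It is a simplified accept-or-reject filter and not the group's actual algorithm. The exponential growth model, the function names, the tolerance and the case counts are all hypothetical, chosen only for illustration.

```python
import math

def simulate_daily_cases(growth_rate, initial_cases, days):
    """Toy scenario: daily new cases grow or shrink exponentially."""
    return [initial_cases * math.exp(growth_rate * t) for t in range(days)]

def filter_scenarios(candidate_rates, observed, tolerance):
    """Keep only the growth rates whose simulated curve stays close to the data."""
    kept = []
    for rate in candidate_rates:
        simulated = simulate_daily_cases(rate, observed[0], len(observed))
        # Mean relative error between this scenario and the observations.
        error = sum(abs(s - o) / o for s, o in zip(simulated, observed)) / len(observed)
        if error <= tolerance:
            kept.append(rate)
    return kept

# Hypothetical observations (not real county data), generated with a rate of -0.03.
observed = simulate_daily_cases(-0.03, 100.0, 14)

# Prior assumption: the daily growth rate lies somewhere between -0.10 and 0.10.
candidate_rates = [round(-0.10 + 0.01 * i, 2) for i in range(21)]

# Scenarios that disagree with the data are discarded.
plausible = filter_scenarios(candidate_rates, observed, tolerance=0.10)
print(plausible)
```

In this toy example, every scenario that survives has a negative growth rate, so the filtered set points to a downturn even though the prior range included growth.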

The group's paper, published in the journal Frontiers in Physics, details the model’s underlying mathematical theories and calculations.

Among the authors were Calvetti, Rose and Erkki Somersalo, a mathematics professor at Case Western Reserve; and Alex Hoover, an assistant professor of applied mathematics at The University of Akron.

The paper laid out a model that accounted for transmission of the virus by infected-but-asymptomatic individuals as well as commuting patterns to flesh out differences in the spread of the virus in rural and urban settings. 

Their model uses census population and mobility data (not personal information) to model commuter traffic, and “the geographic dynamics of the contagion show a role of major highways from larger cities to surrounding suburban and rural areas,” Calvetti said.

Hoover said that when the region shut down, he began to think about how metropolitan areas are connected by highways and how they functioned as a network to spread the disease. 

“Even though much of our business was shut down, essential workers were still going to work, and if they live in one county and work in another, then there was still a chance to spread the disease farther,” he said, adding that the model showed the “disease hit dense urban counties hard at first, before following the highways to smaller rural counties.”
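The commuting effect Hoover describes can be illustrated with a toy two-county simulation in Python. This is only a sketch under stated assumptions and not the model from the paper: the populations, the transmission and recovery rates, and the `commuter_fraction` parameter are all hypothetical, and the counties are reduced to one "urban" and one "rural" area linked by a single commuting flow.

```python
def simulate_two_counties(days, commuter_fraction):
    """Track infections in an urban county and a rural county linked by commuters."""
    # Hypothetical populations and rates, chosen only for illustration.
    population = {"urban": 1_000_000.0, "rural": 50_000.0}
    infected = {"urban": 100.0, "rural": 0.0}
    susceptible = {name: population[name] - infected[name] for name in population}
    beta, gamma = 0.3, 0.1  # assumed daily transmission and recovery rates

    history = []
    for _ in range(days):
        prevalence = {name: infected[name] / population[name] for name in population}
        # Rural residents who commute are exposed to urban prevalence during the day.
        exposure = {
            "urban": prevalence["urban"],
            "rural": (1 - commuter_fraction) * prevalence["rural"]
            + commuter_fraction * prevalence["urban"],
        }
        for name in population:
            new_infections = beta * susceptible[name] * exposure[name]
            recoveries = gamma * infected[name]
            susceptible[name] -= new_infections
            infected[name] += new_infections - recoveries
        history.append(dict(infected))
    return history

isolated = simulate_two_counties(days=30, commuter_fraction=0.0)
connected = simulate_two_counties(days=30, commuter_fraction=0.1)
print(isolated[-1]["rural"], connected[-1]["rural"])
```

With no commuters, the rural county in this sketch never sees a case. Once a share of its residents travel to the urban county, infections appear there as well, which mirrors the pattern of the disease following commuters outward.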

What’s next?

R0, or R-naught, is a number health organizations use to project whether an outbreak will spread. Simply put, R0 is the average number of people who will catch a new disease from a single infected person in a population that has not previously been infected.

So, if the R0 is greater than 1, the infection will probably keep spreading, and if it’s less than 1, the outbreak will likely recede. 
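The threshold at 1 can be seen with a few lines of Python. The sketch below assumes, for simplicity, that every infected person passes the disease to exactly R0 others in each generation of infection and that the supply of susceptible people never runs low. The function name and the starting count of 100 cases are hypothetical.

```python
def cases_by_generation(r0, initial_cases, generations):
    """Expected new cases per generation if each case infects r0 others on average."""
    return [initial_cases * r0 ** g for g in range(generations + 1)]

# R0 above 1: each generation of cases is larger than the one before it.
growing = cases_by_generation(1.5, 100, 5)

# R0 below 1: each generation is smaller, so the outbreak fades.
shrinking = cases_by_generation(0.8, 100, 5)

print([round(c) for c in growing])
print([round(c) for c in shrinking])
```

Real outbreaks are messier, because behavior changes and immunity builds over time, but the same threshold explains why forecasters watch whether the number sits above or below 1.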

Calvetti and Rose both cited an R0 value below 1 in Cuyahoga County as a predictor that the number of infections will continue to decrease. But that prediction should not be read as permission to relax, Calvetti said. 

“Until there is a vaccine, complacency has no place anywhere, regardless of the density of population or projections,” she said. “Because, if people move, so does the virus, and once it reaches (again) densely populated areas it may cause a second outbreak.”

For more information, contact Mike Scott at

This article was originally published Aug. 26, 2020.