Data and More Data; Good Data or Not so Good Data?

Good, reliable data is necessary for public authorities, aided by medical professionals, to formulate sound strategies for dealing with COVID-19. An understanding of the progression of the pandemic, infection rates, death rates, and other data points will drive community efforts, treatments, outcomes, and economic decisions. But do we have good data concerning COVID-19? What is the infection rate? Does it vary by region? Is it affected by warmer weather? What is the death rate? The survival rate? Do survivors have immunity? If so, how strong is the immunity? How long does the immunity last?

At this writing, the paucity of good data is manifest. Certainly, there is no consensus on the meaningful data points. Yet, communities in the United States are relaxing social distancing restrictions aimed at curbing the spread of the coronavirus, trillions of dollars have been committed by the U.S. government and the Federal Reserve to mitigate the economic impact of social distancing and the shuttering of businesses, and treatment strategies are being deployed. One major hospital system commented in a webinar this week that it had expended $75 million to acquire equipment in an effort to prepare for the influx of patients it feared was forthcoming, but, apparently as a result of social distancing, it had only 62 current coronavirus hospitalizations. So, what do we know?

A good source of current data is Our World in Data (“OWID”), an online publication featuring data on various global issues, including poverty, disease, hunger, climate change, war, existential risks, and inequality. It was founded in 2011 and is based at the University of Oxford’s Oxford Martin School. OWID, a nonprofit, has received grants from the Bill and Melinda Gates Foundation and the Nuffield Foundation. The BBC, The Washington Post, The New York Times, The Economist, and many other major publications regularly use data from OWID.

OWID data includes total confirmed COVID-19 deaths, total confirmed deaths per capita, total confirmed deaths by region, total and daily confirmed deaths, and many more data sets. Data is displayed in charts and can be downloaded in .csv format. OWID initially highlights three points. “All three points are true for all currently available international data sources on COVID-19 deaths.

  • “the actual total death toll from COVID-19 is likely to be higher than the number of confirmed deaths – this is due to limited testing and problems in the attribution of the cause of death; the difference between reported confirmed deaths and total deaths varies by country
  • “how COVID-19 deaths are recorded may differ between countries (e.g. some countries may only count hospital deaths, whilst others have started to include deaths in homes)
  • “the reported death figures on a given date does not necessarily show the number of new deaths on that day: this is due to delays in reporting.”

Testing is essential. “No country knows the total number of people infected with COVID-19. All we know is the infection status of those who have been tested. All those who have a lab-confirmed infection are counted as confirmed cases. This means that the counts of confirmed cases depend on how much a country actually tests. Without testing there is no data. Testing is our window onto the pandemic and how it is spreading. Without data on who is infected by the virus we have no way of understanding the pandemic. Without this data we can not know which countries are doing well, and which are just underreporting cases and deaths. Because testing is so very crucial to understanding the spread of the pandemic and responding appropriately [OWID has] focused [its] efforts on building a global dataset on COVID-19 testing.”

OWID notes, “Some countries present comprehensive, detailed and regularly updated data. Iceland … is one of these countries. Estonia … goes even further, showing breakdowns by age, gender and region. For many countries however, available data on testing is either incomplete or else completely unavailable. This makes it impossible for their citizens and for researchers to assess the extent and significance of their testing efforts. Our current knowledge of COVID-19 testing – and more importantly of the pandemic itself – would be greatly improved if all countries were able to report all the testing data available to them in the way shown by the best examples. … Those countries that do publish testing data often do not provide the required documentation to make it clear what the provided numbers precisely mean, and this is crucial for meaningful comparisons between countries and over time.” OWID provides a checklist for the provision of COVID-19 testing data.

Subject to these limitations, the data provided include total COVID-19 tests by region, tests per 1,000 persons, and tests per day. Again, the data may be downloaded. “No country knows the true number of people infected with COVID-19. All we know is the infection status of those who have been tested.”

Even the number of reported tests presents issues. “Countries are reporting testing data in different ways: some report the number of tests, others report the number of people tested. This distinction is important – people may be tested many times, and the number of tests a person has is likely to vary across countries.” Moreover, “A further complication with using testing coverage as an indicator of reliability, is that the number of tests needed to have an accurate picture of the spread of the virus varies over the course of an outbreak.

“At the beginning of an outbreak, where the number of people infected with the virus is low, a much smaller number of tests are needed to accurately assess the spread of the virus.

“As the virus infects more people, testing coverage also needs to expand in order to provide a reliable picture of the true number of infected people.

“In some countries the number of tests are many times higher than the number of confirmed cases. As of 11 April, in Vietnam more than 400 tests had been conducted for each confirmed case. In Taiwan and Russia there had been around a hundred tests for each confirmed case.

“But in other countries testing is very low relative to the number of confirmed cases. The US, the UK and Ecuador had performed around 5 tests or fewer for every confirmed case.”

In addition to these deficiencies, OWID reports data for only 78 countries. “For some countries we are aware of a source of data and are currently in the process of adding it to our dataset. For others we are not aware of any official source of testing data.”

So far, only about 5 million people have been tested in the U.S., less than 2 percent of the population. As of this writing, there are at least 984,000 confirmed cases and 55,681 deaths. Paul Romer[1] estimates that a random selection of 7 percent of the population should be tested each day. Currently, persons seeking a COVID-19 test are typically told that they qualify for a test only if they have three symptoms associated with the disease: high fever, cough, and shortness of breath. After looking at recommended testing levels, the Kaiser Family Foundation concluded, “There is not yet consensus over what approach to testing is required for social distancing measures to be loosened, or exactly how much capacity is needed. But, by any measure, it is clear that we are far from being able to do enough tests to enable us to move to the next phase of responding to the pandemic in states across the country.”[2] Across the U.S., governors are calling for more testing capability and support. Providing that capability is essential and should be of the highest priority. We simply cannot determine the progression of the pandemic or adopt cogent strategies to protect Americans without rigorous, extensive, and accurate testing.
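The figures in this paragraph can be checked with simple arithmetic. The Python sketch below uses the numbers cited above plus one assumption that does not appear in the text: a U.S. population of roughly 330 million. It treats the 5 million figure as a count of tests, although, as OWID cautions, tests and people tested are not the same thing. The deaths-per-confirmed-case ratio it computes is a naive figure, not the true death rate, because the count of confirmed cases depends on how much testing is done.

```python
# Figures cited in the text.
TESTS_CONDUCTED = 5_000_000
CONFIRMED_CASES = 984_000
CONFIRMED_DEATHS = 55_681
ROMER_DAILY_SHARE = 0.07  # share of the population to be tested each day

# Assumption (not from the text): approximate U.S. population.
US_POPULATION = 330_000_000


def summarize(tests, cases, deaths, population, daily_share):
    """Return the basic ratios discussed in the paragraph above."""
    return {
        "share_tested": tests / population,
        "tests_per_case": tests / cases,
        "deaths_per_case": deaths / cases,  # naive ratio, not a true death rate
        "daily_tests_needed": daily_share * population,
    }


if __name__ == "__main__":
    s = summarize(
        TESTS_CONDUCTED,
        CONFIRMED_CASES,
        CONFIRMED_DEATHS,
        US_POPULATION,
        ROMER_DAILY_SHARE,
    )
    print(f"Share of population tested: {s['share_tested']:.2%}")
    print(f"Tests per confirmed case:   {s['tests_per_case']:.1f}")
    print(f"Deaths per confirmed case:  {s['deaths_per_case']:.2%}")
    print(f"Daily tests at 7 percent:   {s['daily_tests_needed']:,.0f}")
```

Under that population assumption, the share tested comes to roughly 1.5 percent, consistent with the "less than 2 percent" figure; the ratio of about 5 tests per confirmed case matches the OWID observation quoted earlier; and testing 7 percent of the population daily would mean about 23 million tests a day.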

[1] Paul Romer is a former World Bank chief economist who shared the 2018 Nobel Prize in economics for integrating technological innovation into long-run macroeconomic analysis.

[2] “What Testing Capacity Do We Need?” by Jennifer Kates, et al.