The most up-to-date data files can be found in Data/. These are built daily (when possible). Each file is also given a sequential version number by OSF. Each zip file contains a csv by the same name.
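
Since each zip file contains a csv by the same name, a downloaded archive can be read without unpacking it by hand. The following is a minimal Python sketch, not an official loader. It assumes the archive ends in `.zip`, that the csv is UTF-8 encoded, and that its first line holds the column names; if the real files carry extra lines above the header, those would need to be skipped first. The function name `read_zipped_csv` is hypothetical.

```python
import csv
import io
import zipfile
from pathlib import Path


def read_zipped_csv(zip_path):
    """Read the csv that shares its name with the zip archive, as a list of dicts."""
    zip_path = Path(zip_path)
    # Assumed naming rule from the text above: inputDB.zip holds inputDB.csv
    member = zip_path.with_suffix(".csv").name
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # Assumption: UTF-8 text with the header on the first line
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

All values come back as strings, so numeric columns such as Value still need converting before use.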

Data dictionaries

input database (inputDB.csv)

This file is in long format: there is a single Value column, whose meaning depends on the other structuring variables (Metric, Measure, etc.). Not all rows captured in this file make it through the processing chain to the harmonized output files.

Stable url:

| Column | Type | Definition | Valid values / Examples |
|---|---|---|---|
| Country | string | Country name | "United Kingdom", "Philippines", "Gambia" |
| Region | string | Name of the region to which the data refer. It can indicate either a national or a subnational population | "NYC", ... |
| Code | string | Location short code & date | "US_NYC01.05.2020" |
| Date | string | Date on which the cumulative value was reached for cases, deaths, and tests | Date format is "01.05.2020" |
| Sex | string | Reported sex of the case, death, CFR, or test | f for female; m for male; b for both |
| Age | string | Lower integer bound of the age interval [0,...104], or "UNK" or "TOT" | Values between 0 and 104, UNK for unknown, and TOT for all ages |
| AgeInt | integer | Age group width expressed in years | Integers ranging from 1 to as high as 60. If Age is TOT or UNK then AgeInt is NA |
| Metric | string | Units used to indicate the amount of cases, deaths, and tests | Count, Fraction, Ratio |
| Measure | string | Substantive measure recorded | Cases, Deaths, Tests, ASCFR |
| Value | double | Numeric value of the count, fraction, or ratio reported | Any numeric value above 0 |
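
Because Value means different things depending on Metric and Measure, a typical first step is to pick one combination and index it by location, sex, and age. The sketch below illustrates this on a few made-up rows shaped like the dictionary above; the numbers are invented for illustration and are not real data, and the helper name `counts_by_age` is hypothetical.

```python
from collections import defaultdict

# Made-up rows for illustration only; the numbers are not real data.
FIELDS = ["Code", "Sex", "Age", "AgeInt", "Metric", "Measure", "Value"]
SAMPLE = [
    ["US_NYC01.05.2020", "b", "0", "18", "Count", "Deaths", "6"],
    ["US_NYC01.05.2020", "b", "18", "27", "Count", "Deaths", "50"],
    ["US_NYC01.05.2020", "b", "TOT", "NA", "Count", "Deaths", "56"],
    ["US_NYC01.05.2020", "b", "0", "18", "Fraction", "Cases", "0.03"],
]
rows = [dict(zip(FIELDS, values)) for values in SAMPLE]


def counts_by_age(rows, measure):
    """Keep Count rows for one Measure, keyed by (Code, Sex) and then by Age."""
    table = defaultdict(dict)
    for row in rows:
        if row["Metric"] == "Count" and row["Measure"] == measure:
            # Age stays a string because it may be "UNK" or "TOT"
            table[(row["Code"], row["Sex"])][row["Age"]] = float(row["Value"])
    return dict(table)


deaths = counts_by_age(rows, "Deaths")
```

Rows with Age equal to TOT or UNK are kept alongside the age-specific rows, so they should be handled separately before summing over ages.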

output files (Output_5.csv and Output_10.csv)

Output files always cover the same age groups and age range for each subset of data recorded (location and date). All data have been harmonized to count metrics, and the only values potentially reported are Cases, Deaths, and Tests.

Stable urls: (Output_5.csv) and (Output_10.csv)

| Column | Type | Definition | Valid values / Examples |
|---|---|---|---|
| Country | string | Country name | "United Kingdom", "Philippines", "Gambia" |
| Region | string | Region to which the data refer; it could refer to a national or a subnational population | "NYC", ... |
| Code | string | Location short code & date | "US_NYC01.05.2020" |
| Date | string | Date on which the cumulative value was reached for cases, deaths, and tests | Date format is "01.05.2020" |
| Sex | string | Reported sex of the case, death, CFR, or test | f for female; m for male; b for both |
| Age | integer | Lower integer bound of the age interval [0,5,10,15...100] | [0,5,...100] or [0,10,...100] |
| AgeInt | integer | Either 5 or 10 | 5, 10 |
| Cases | double | Cumulative count of cases, rounded to 1 decimal point | >= 0 |
| Deaths | double | Cumulative count of deaths, rounded to 1 decimal point | >= 0 |
| Tests | double | Cumulative count of tests, rounded to 1 decimal point | >= 0 |
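
Since the output files share a fixed age grid and hold counts only, totals per location, date, and sex can be obtained by summing a column over ages. The Python sketch below shows one way to do that. It rests on assumptions: the Date string is read as day.month.year, missing counts are taken to appear as an empty string or "NA", and the helper names are hypothetical.

```python
from datetime import datetime


def parse_date(text):
    """Parse a Date value, assuming day.month.year order (an assumption)."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def expected_ages(age_int):
    """Lower age bounds 0, age_int, ..., 100 for a 5- or 10-year output file."""
    return list(range(0, 101, age_int))


def total_by_subset(rows, column):
    """Sum one count column (Cases, Deaths, or Tests) over ages per (Code, Sex)."""
    totals = {}
    for row in rows:
        value = row[column]
        if value in ("", "NA"):  # assumed markers for a missing count
            continue
        key = (row["Code"], row["Sex"])
        totals[key] = totals.get(key, 0.0) + float(value)
    return totals
```

Comparing the Age values found for each Code against `expected_ages(5)` or `expected_ages(10)` is a quick check that a subset covers the full grid described in the table.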


Getting started

A small tutorial for getting started in

  • R is available here


An overview of the database contents and characteristics is given in these dashboard views.

We are still completing aspects of the metadata, such as the definitions used, so it is not yet released in web format. However, you can view a tabular representation of the metadata-gathering progress here
