The Comparative Panel File (CPF, <http://www.cpfdata.com>) is an open science project to harmonise the world's largest and longest-running household panel surveys from seven countries: Australia (HILDA), Germany (SOEP), Great Britain (BHPS and UKHLS), Korea (KLIPS), Russia (RLMS), Switzerland (SHP), and the United States (PSID). The project aims to support the social science community in the analysis of comparative life course data. The code integrates individual and household panel data from all seven surveys into a harmonised dataset that contains 2.7 million observations from 360,000 respondents, covering the period since 1968, with up to 40 panel waves per respondent (Version 1.0, released in December 2020). The project is organised as an open science platform that integrates tools for general communication (online forum), code development (GitHub code repository), and general management of scientific research (Open Science Framework, OSF). After securing access to the national panel surveys, users can run our code, which combines datasets and waves within each country, constructs harmonised variables, and merges these into one dataset covering all countries and all waves. CPF is the first open-source data harmonisation initiative of this type in the social sciences and provides an attractive alternative to institutionalised harmonisation approaches. The project has been developed by Konrad Turek, Matthijs Kalmijn and Thomas Leopold. I will present the background, design, and content of the CPF, provide an overview of the data and their research potential, and explain the open science platform. I will also share my thoughts on the development and reception of the initiative.
Other conference materials: <https://osf.io/meetings/OSCTUC2021/>