Researchers working with binned income data typically use either nonparametric (including midpoint) or parametric methods to estimate summary statistics from the income categories. While each of these methods has its advantages, they all carry assumptions that cause them to deviate in important ways from real-world income distributions. The method developed here, Random Empirical Distribution Imputation (REDI), enables researchers to impute discrete observations from binned income data as part of a continuous distribution, while also calculating summary statistics. REDI achieves this through random cold-deck imputation from a real-world reference dataset (here, the Current Population Survey Annual Social and Economic Supplement). This imputation method is particularly helpful for reconciling bins so that they are comparable between datasets or across years, and for dealing with top incomes. REDI has the advantage of computing an income distribution that is nonparametric, bin-consistent, area-preserving, continuous, and computationally fast. I provide proof of concept using two years of the American Community Survey.
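The core idea of random cold-deck imputation within bins can be illustrated with a minimal sketch. This is not the author's implementation: the function name `redi_impute`, the bin boundaries, and the synthetic lognormal reference incomes (standing in for the Current Population Survey Annual Social and Economic Supplement) are all assumptions made for illustration, and survey weights are ignored.

```python
import numpy as np

def redi_impute(bin_index, bin_lower_bounds, reference_income, rng):
    """Hypothetical helper: draw one reference income per respondent
    from the reference incomes that fall inside the respondent's bin."""
    bin_index = np.asarray(bin_index)
    reference_income = np.asarray(reference_income, dtype=float)
    # Bin k covers [bin_lower_bounds[k], bin_lower_bounds[k + 1]);
    # the last bin is open-ended at the top (assumption of this sketch).
    ref_bin = np.searchsorted(bin_lower_bounds, reference_income, side="right") - 1
    imputed = np.empty(bin_index.shape, dtype=float)
    for k in np.unique(bin_index):
        donors = reference_income[ref_bin == k]
        if donors.size == 0:
            raise ValueError(f"no reference incomes fall in bin {k}")
        mask = bin_index == k
        # Random draw with replacement from the empirical values in the bin.
        imputed[mask] = rng.choice(donors, size=mask.sum(), replace=True)
    return imputed

rng = np.random.default_rng(0)
# Synthetic stand-in for a reference dataset; not real survey data.
reference = rng.lognormal(mean=10.8, sigma=0.9, size=50_000)
lower_bounds = np.array([0, 25_000, 50_000, 100_000])  # illustrative bins
respondent_bins = np.array([0, 0, 1, 2, 3, 3])         # binned responses
imputed = redi_impute(respondent_bins, lower_bounds, reference, rng)
```

Because every imputed value is an observed reference income from the matching bin, the result stays inside each respondent's reported bin, including the open-ended top bin, without assuming a parametric form.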