US-Born and Foreign-born Life Expectancy by Race and Hispanic Origin
before and during the COVID-19 Pandemic in the United States
================
<!-- README.md is generated from README.Rmd. Please edit that file -->
This is a replication package for the paper [US-Born and Foreign-born
Life Expectancy by Race and Hispanic Origin before and during the
COVID-19 Pandemic in the United
States](https://doi.org/10.1016/j.socscimed.2025.118191) in Social
Science & Medicine.
The goal of the paper was to examine the impact of the COVID-19 pandemic
on the US migrant mortality advantage in 2020, 2021, and 2022 compared to
2017-2019, and to estimate the contributions of the foreign-born
population to US life expectancy by race and Hispanic origin.
For this paper we used individual-level restricted data from the
National Vital Statistics System (NVSS) to compute death counts and a
combination of data from the Census Bureau and the 1-year American
Community Survey (ACS) files to construct mid-year population estimates.
More details about the data cleaning and methods used in the paper can
be found in the paper and in its online supplement.
## General repository structure
The repository is organized in several directories:
- `R`: Contains the R code used to clean data from NVSS, Census, and ACS
and produce the final estimates. File names start with a numeric code,
and the files should be run in ascending order of that code. Files with
a letter (a, b, or c) after the numeric code are alternatives.
Generally, scripts starting with 0 are for data cleaning, those
starting with 1 are for analysis, and those starting with 2 are for
post-analysis checks.
- `data`: Should contain the original NVSS, Census, and ACS data needed
to produce the estimates. These data have been removed from the folder
because we are not allowed to share the restricted NVSS data directly.
Instructions on how to get access to the restricted NVSS data can be
found
[here](https://www.cdc.gov/nchs/nvss/nvss-restricted-data.htm). The
data from ACS can be obtained via [IPUMS
USA](https://usa.ipums.org/usa/) (as we did), or directly from the
[Census FTP
server](https://www.census.gov/programs-surveys/acs/microdata.html).
Census population estimates for pre-2020 years can be obtained
[here](https://www2.census.gov/programs-surveys/popest/tables/2010-2019/national/asrh/nc-est2019-asr6h.xlsx)
and those for 2020 and later
[here](https://www2.census.gov/programs-surveys/popest/tables/2020-2022/national/asrh/nc-est2022-asr6h.xlsx).
We make available the intermediate and final data products in the
`output` folder.
- `output`: Contains all the data that is generated while cleaning the
raw data and creating the estimates. You can always tell which R
script produced the file by looking at the numeric code at the start
of its name.
- `figures`: Contains figures and tables.
- `docs`: Contains additional documentation.
- `packages`: Contains R packages that we used but that are no longer
available from CRAN.
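
Because the archived packages are shipped as source tarballs, they have to be installed from the local files rather than from CRAN. The following is a minimal sketch, not part of the original scripts: the helper name `install_archived()` is hypothetical, and the install order assumes that `MortalitySmooth` depends on `svcm`, so `svcm` goes first.

```r
# Hypothetical helper: install the archived source packages shipped in
# the `packages` folder. Tarballs that are not present are skipped.
install_archived <- function(dir = "packages") {
  # Assumed order: svcm first, since MortalitySmooth is assumed to need it.
  tarballs <- file.path(dir, c("svcm_0.1.2.tar.gz",
                               "MortalitySmooth_2.3.4.tar.gz"))
  found <- tarballs[file.exists(tarballs)]
  for (tb in found) {
    # repos = NULL tells install.packages() to treat `tb` as a local file
    install.packages(tb, repos = NULL, type = "source")
  }
  invisible(found)  # paths of the tarballs that were installed
}
```

Calling `install_archived()` from the repository root should be enough in most setups. Installing from source may require build tools and any CRAN dependencies of these packages to be available first.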
## Notes
While we tried to fully automate the creation of all figures and tables,
during the revision process we were asked to combine certain tables,
which became too difficult to automate. All numbers in the tables of the
published version can be found in the tables available in this
repository, but the tables themselves may be arranged differently.
## Folder Structure
Warning: not all the files listed here are present in the clean version
of the repository on OSF, but we left them in the listing to give you a
sense of where they would go.
## ~/immigration-and-life-expectancy
## ├── R
## │ ├── 01a_clean_mort_data.R
## │ ├── 01b_clean_cause_mort_data.R
## │ ├── 02_create_prop_age_ACS.R
## │ ├── 03a_create_prop_foreign_ACS.R
## │ ├── 03b_create_prop_foreign_ACS_sensitivity.R
## │ ├── 04_clean_census_estimates.R
## │ ├── 05_create_adjusted_pop.R
## │ ├── 06a_create_final_data.R
## │ ├── 06b_create_final_data_by_cause.R
## │ ├── 11a_create_lt.R
## │ ├── 11b_create_lt_low.R
## │ ├── 11c_create_lt_high.R
## │ ├── 12_GW_decomposition.R
## │ ├── 13_GW_decomposition_by_cause.R
## │ ├── 21_ACS_calculations.R
## │ ├── 22_race_coding_change.R
## │ ├── README.Rmd
## │ ├── decomp_functions.R
## │ ├── decomp_functions_MS.R
## │ ├── decomp_functions_by_cause_MS.R
## │ ├── decomp_functions_by_cause_simplified.R
## │ ├── decomp_functions_exact.R
## │ ├── leCalcsForPres.R
## │ └── lt_abridged_plus.R
## ├── README.md
## ├── data
## │ ├── ACS
## │ │ ├── usa_00094.dat.gz
## │ │ ├── usa_00094.xml
## │ │ ├── usa_00096.dat.gz
## │ │ ├── usa_00096.xml
## │ │ ├── usa_00097.dat.gz
## │ │ ├── usa_00097.xml
## │ │ ├── usa_00098.dat.gz
## │ │ ├── usa_00098.xml
## │ │ ├── usa_00099.dat.gz
## │ │ ├── usa_00099.xml
## │ │ ├── usa_00100.dat.gz
## │ │ └── usa_00100.xml
## │ ├── ACS_for_discussion
## │ │ ├── usa_00090.dat.gz
## │ │ ├── usa_00090.xml
## │ │ ├── usa_00091.dat.gz
## │ │ └── usa_00091.xml
## │ ├── ACS_for_race_coding
## │ │ ├── usa_00092.dat.gz
## │ │ └── usa_00092.xml
## │ ├── census_pop
## │ │ ├── nc-est2019-asr6h.xlsx
## │ │ ├── nc-est2022-asr6h.xlsx
## │ │ ├── nc-est2023-asr6h.xlsx
## │ │ └── Table05.xlsx
## │ └── raceAdjFactors.csv
## ├── figures
## │ ├── 06_sample_table.docx
## │ ├── 07_e1_gap.svg
## │ ├── 07_e1_summary_table.docx
## │ ├── 07_e1_table.docx
## │ ├── 07_e1_trend_bars.svg
## │ ├── 07_e1_trend_lines.svg
## │ ├── 07_e1_trend_lines_for_pt.svg
## │ ├── 07_ratios_and_e1.svg
## │ ├── 07_ratios_plot.svg
## │ ├── 07_ratios_plot_for_pt.svg
## │ ├── 07b_e1_gap.svg
## │ ├── 07b_e1_summary_table.docx
## │ ├── 07b_e1_table.docx
## │ ├── 07b_e1_trend_bars.svg
## │ ├── 07b_e1_trend_lines.svg
## │ ├── 07b_e1_trend_lines_for_pt.svg
## │ ├── 07b_ratios_and_e1.svg
## │ ├── 07b_ratios_plot.svg
## │ ├── 07b_ratios_plot_for_pt.svg
## │ ├── 07c_e1_gap.svg
## │ ├── 07c_e1_summary_table.docx
## │ ├── 07c_e1_table.docx
## │ ├── 07c_e1_trend_bars.svg
## │ ├── 07c_e1_trend_lines.svg
## │ ├── 07c_e1_trend_lines_for_pt.svg
## │ ├── 07c_ratios_and_e1.svg
## │ ├── 07c_ratios_plot.svg
## │ ├── 07c_ratios_plot_for_pt.svg
## │ ├── 08_cum_contribs_plot.svg
## │ ├── 09_contributions_by_group_and_causes.docx
## │ ├── 09_cum_contribs.svg
## │ ├── 09_cum_contribs_by_detailed_causes_plot.svg
## │ ├── calculations_for_discussion_ST5.xlsx
## │ ├── prop_by_division.docx
## │ └── prop_by_metro.docx
## ├── immigration-and-life-expectancy.Rproj
## ├── output
## │ ├── 01_NVSS_table.csv
## │ ├── 01_NVSS_table_by_cause.csv
## │ ├── 02_below5_above85_ACS_proportions.csv
## │ ├── 03_foreign_ACS_proportions.csv
## │ ├── 03_foreign_ACS_proportions_sensitivity.csv
## │ ├── 04_census_pop.csv
## │ ├── 05_pop_adjusted.csv
## │ ├── 06_final_table.csv
## │ ├── 06_final_table_by_cause.csv
## │ ├── 11_e1_by_group.csv
## │ ├── 11_e1_summary_table.csv
## │ ├── 11_life_tables_US.csv
## │ ├── 11_life_tables_foreign.csv
## │ ├── 11_life_tables_total.csv
## │ ├── 11b_e1_by_group.csv
## │ ├── 11b_e1_summary_table.csv
## │ ├── 11b_life_tables_US.csv
## │ ├── 11b_life_tables_foreign.csv
## │ ├── 11b_life_tables_total.csv
## │ ├── 11c_e1_by_group.csv
## │ ├── 11c_e1_summary_table.csv
## │ ├── 11c_life_tables_US.csv
## │ ├── 11c_life_tables_foreign.csv
## │ ├── 11c_life_tables_total.csv
## │ ├── 12_GW_decomp_full.csv
## │ └── 13_G_decomp_by_cause_full.csv
## └── packages
##     ├── MortalitySmooth_2.3.4.tar.gz
##     └── svcm_0.1.2.tar.gz
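
The numeric codes in the listing above can be used both to put the scripts in run order and to trace an output file back to the script that produced it. The sketch below is only an illustration of that naming convention and is not part of the original code; the function names `numbered_scripts()` and `scripts_for_output()` are hypothetical.

```r
# List the numbered scripts in `dir`, sorted by their numeric code
# (01a, 01b, 02, ...). Helper files without a numeric code, such as
# decomp_functions.R, are skipped.
numbered_scripts <- function(dir = "R") {
  files <- list.files(dir, pattern = "^[0-9]{2}[a-z]?_.*\\.R$")
  sort(files, method = "radix")  # byte-wise sort, independent of locale
}

# Return the scripts whose names start with the numeric code of an
# output file. A code with a letter (e.g. "11b") matches only that
# variant; a code without a letter (e.g. "11") matches all variants.
scripts_for_output <- function(output_file, scripts) {
  b <- basename(output_file)
  code <- regmatches(b, regexpr("^[0-9]{2}[a-z]?", b))
  if (length(code) == 0) return(character(0))  # no numeric code found
  scripts[startsWith(scripts, code)]
}
```

For example, `scripts_for_output("output/11b_life_tables_US.csv", numbered_scripts("R"))` narrows the search to the scripts whose names begin with `11b`. Each script can then be run with `source(file.path("R", script_name))`, keeping in mind that the scripts need the restricted input data and that files with a letter after the code are alternatives, so you may not want to run all of them.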