The files contained in this project can be used to re-create the analyses for the article "Improving Data Access Democratizes and Diversifies Science".
The whole folder can be downloaded as a zip file.
The code folder (named 'scripts') contains the following Stata do files, R scripts, and Python scripts:
- main.do (with a corresponding set of ado files)
- landsat_dist.R
- word_analysis.R
- geoparse.py
- get_pid.py
The main.do Stata file provides the roadmap for the paper's analyses (and for reproducing them). Please refer to this do-file for the order in which to run the Stata, R, and Python programs.
***Important note***: the main.do file cannot be run all at once. The programs must be run sequentially, and there are notes in the main.do file that indicate when the R and Python scripts must be run before continuing with the following Stata programs.
The raw data needed to run all the analyses can be found in the 'rawdata' folder.
The 'filedata' and 'tables' folders are empty; they provide the folder structure where the intermediate data and the final output of the paper (i.e., tables and figures) will be saved.
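
Before starting main.do, it may help to confirm that the unzipped project has the folders and scripts described above. The following is a minimal sketch and not part of the project itself: the function name check_layout is hypothetical, and it assumes the listed scripts sit directly inside 'scripts' and that the check is run from the project's top-level folder. It does not check the ado files, since their names are not listed here.

```python
from pathlib import Path

# Folder and script names are taken from this README.
EXPECTED_FOLDERS = ["scripts", "rawdata", "filedata", "tables"]
EXPECTED_SCRIPTS = [
    "main.do",
    "landsat_dist.R",
    "word_analysis.R",
    "geoparse.py",
    "get_pid.py",
]


def check_layout(project_root):
    """Return the expected folders and scripts that are missing under project_root.

    Hypothetical helper: assumes the scripts sit directly inside 'scripts'.
    """
    root = Path(project_root)
    # Folders that should exist at the top level of the project.
    missing = [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
    # Scripts that should exist inside the 'scripts' folder.
    missing += [
        f"scripts/{name}"
        for name in EXPECTED_SCRIPTS
        if not (root / "scripts" / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    # Assumes the current working directory is the project's top-level folder.
    problems = check_layout(".")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Layout looks complete.")
```

If anything is reported as missing, re-downloading the zip file is probably the simplest fix. The check only confirms that the files exist; the order in which to run them is still given by the notes in main.do.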