This folder contains R scripts for the replication of the central findings described in "A large quantitative analysis of written language challenges the idea that all languages are equally complex".
---
* The file names correspond to figures/sections in the main text/supplementary information.
* Please set the working directory at the beginning of each script to match your machine.
* Please check for instances of parallel computation (e.g., by searching for 'mc.cores' in the scripts) and adapt the parameter 'mc.cores' to a reasonable number for your machine.
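
The two adjustments above might look roughly like the following. This is a minimal sketch, not code taken from the scripts: the path `path/to/replication_folder`, the variable names `project_dir` and `n_cores`, and the toy `mclapply` call are placeholders chosen for illustration. It assumes the scripts pass `mc.cores` to functions from the `parallel` package.

```r
library(parallel)

# Hypothetical path: replace with the folder that holds the scripts and data.
project_dir <- "path/to/replication_folder"
if (dir.exists(project_dir)) setwd(project_dir)

# Leave one core free; fall back to 1 if the core count cannot be detected.
n_detected <- detectCores()
n_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)

# Forking is not available on Windows, where mc.cores must be 1.
if (.Platform$OS.type == "windows") n_cores <- 1L

# Toy example of a call that takes mc.cores (not from the scripts).
res <- mclapply(1:4, function(i) i^2, mc.cores = n_cores)
```

Defining the core count once near the top of a script and reusing it in every `mc.cores` argument keeps the setting easy to change per machine.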
---
Provided R scripts:
-------------------
* Fig2.R: Produces Figure 2 of the manuscript (Testing the similarity of prediction complexity rankings).
* LMM_prediction_popsize.R: Final LMMs and Delta-AICs for the word level, the character level, and Crubadan, used to evaluate the association between prediction complexity and population size.
* S7.R: Produces Supplementary Figure 22 and the figures reported in its caption (Section S7: Relative standard deviations for h and r).
* S8.R: Produces Supplementary Figures 23 and 24 (Section S8: Correlations between entropy rates and learning difficulty).
* S9_SuppTab6.R: Produces Supplementary Table 6 (Section S9: Evaluating the similarity of complexity variables).
* VoxClam_convert_epitran.R: Converts the Vox Clamantis data as described in the Methods section of the manuscript (the data must first be downloaded from the URLs given in the manuscript and in the R script).