How to develop reliable instruments to measure historical sentiments?
Category: Project
Description: The development of computational social science has enhanced our ability to use databases of digitized culture to quantify the past. Concomitantly, new tools in historical econometrics have allowed socio-economic variables to be estimated further into the past. Together, these historical cultural and socio-economic data provide an unprecedented capacity to describe the relationship between culture, historical events and socio-economic dynamics. Here, we focus on the analysis of texts using bag-of-words frequencies, describe potential challenges, and propose a pipeline to improve validity and generalizability. In particular, while the gold-standard bag-of-words approach, the Linguistic Inquiry and Word Count (LIWC), has been validated with psychometric experiments on modern participants, it has two main limitations. First, it limits the number of variables that we can explore; second, because it has been validated for modern language users, it might not be valid for other historical contexts. Here we offer a complementary approach which ensures i) the historical adequacy of the search terms, ii) the internal coherence of the measurements, and iii) external validation vis-à-vis other tools. We present the pipeline, examples and scripts that might help junior researchers develop custom bags-of-words and conduct their own analyses of historical texts.
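The first two checks described above can be illustrated with a minimal Python sketch. It is not the project's own script. The function names (`attested_terms`, `term_frequencies`, `cronbach_alpha`), the simple regex tokenizer, and the choice of Cronbach's alpha as the coherence statistic are all assumptions made for illustration; the project may use different measures.

```python
import re
from statistics import pvariance


def tokenize(text):
    """Lowercase a text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def attested_terms(bag, texts, min_count=1):
    """Historical adequacy (illustrative): keep only the terms that occur
    at least `min_count` times in the period corpus."""
    counts = {term: 0 for term in bag}
    for text in texts:
        for tok in tokenize(text):
            if tok in counts:
                counts[tok] += 1
    return [term for term in bag if counts[term] >= min_count]


def term_frequencies(text, bag):
    """Relative frequency of each bag term in one text."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {term: 0.0 for term in bag}
    counts = {term: 0 for term in bag}
    for tok in tokens:
        if tok in counts:
            counts[tok] += 1
    return {term: counts[term] / total for term in bag}


def bag_score(text, bag):
    """Bag-of-words score: summed relative frequency of all bag terms."""
    return sum(term_frequencies(text, bag).values())


def cronbach_alpha(rows):
    """Internal coherence (assumed statistic): Cronbach's alpha, treating
    each term as an item and each text (or time slice) as an observation.
    `rows` is a list of per-text lists holding one frequency per term."""
    k = len(rows[0])
    item_var = sum(pvariance(col) for col in zip(*rows))
    total_var = pvariance([sum(row) for row in rows])
    if k < 2 or total_var == 0:
        return float("nan")
    return k / (k - 1) * (1 - item_var / total_var)
```

In use, one would filter a candidate bag with `attested_terms`, compute `term_frequencies` per text or per year, and pass the resulting rows to `cronbach_alpha`. External validation, the third check, would then compare `bag_score` series against the output of another tool, for example by correlation.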