This page documents the analysis for the linked paper, which uses natural language processing to analyse written first impressions of faces. Topic modelling via non-negative matrix factorisation uncovers a two-topic structure. This model is then compared with trait ratings of the same faces using Bayesian regularised regressions.
The code is written in Python and is shared in the extensive accompanying notebook. All the free-text data for Study One is included here, but relating the topic model to trait ratings will *not* be reproducible out of the box: the notebook requires both the images and the ratings from the 10k Face Database (Wilma Bainbridge et al., 2013), and this is not our data to share. The file `psychology_attributes.xlsx` is used extensively, and the full image set is used to create composites. The data can be obtained from this [link][1]. If you place the `psychology_attributes.xlsx` file in the `Misc` folder, and the faces and associated landmarks in the subfolder `Face Images`, the notebook should have no trouble fully reproducing the analyses in the manuscript.
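Before running the notebook, it may help to confirm that the externally obtained files are where they are expected. The sketch below uses a hypothetical helper, `missing_inputs`, which is not part of the repository. It also assumes that `Face Images` sits inside `Misc`; adjust the paths if the repository layout differs.

```python
from pathlib import Path


def missing_inputs(misc_dir, faces_dir):
    """Hypothetical helper: list layout problems; an empty list means all is well."""
    misc_dir, faces_dir = Path(misc_dir), Path(faces_dir)
    problems = []

    # The ratings spreadsheet named in this README.
    ratings = misc_dir / "psychology_attributes.xlsx"
    if not ratings.is_file():
        problems.append(f"missing ratings file: {ratings}")

    # The folder holding the face images and their landmarks.
    if not faces_dir.is_dir():
        problems.append(f"missing face image folder: {faces_dir}")
    elif not any(faces_dir.iterdir()):
        problems.append(f"face image folder is empty: {faces_dir}")

    return problems


# Assumption: `Face Images` is a subfolder of `Misc`, relative to the working directory.
for problem in missing_inputs("Misc", Path("Misc") / "Face Images"):
    print(problem)
```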
[1]: https://www.wilmabainbridge.com/facememorability2.html