This OSF project is associated with the following study, which is available on [PsyArXiv](https://osf.io/preprints/psyarxiv/afuh9):
- NN. *Regression and random forests: Synergies for variationist corpus research*. PsyArXiv. https://doi.org/10.31234/osf.io/afuh9
The study was presented at *ICAME 2024* in Vigo, Spain. The **presentation slides** can be found [here](https://osf.io/f3m4d).
This is the **abstract**:
- *Logistic regression and random forests are widely used modeling approaches in corpus-based work. As most studies tend to focus on one form of analysis, this paper demonstrates how their strengths may be combined in variationist corpus research. We outline a general strategy that starts with a basic regression structure and then examines random forest predictions to see whether elaborations to the initial model are needed. This dialog capitalizes on the flexibility of random forests, which are able to capture non-linear relationships and complex interaction patterns. Our case study on the choice between that- and -ing-complement clauses after the verb regret allows us to illustrate how assumptions about the causal relations between variables may account for discrepancies between the models. We also draw attention to some limitations of our complementary approach, including the fact that the models operate on different scales, which may compromise the comparability of interaction patterns.*
**Images** created for this study can be found in the folder "output/figures". They are published under a Creative Commons Attribution 4.0 licence (**CC BY 4.0**), which means that they may be shared and adapted for any purpose, provided that appropriate credit is given (see http://creativecommons.org/licenses/by/4.0).