Data-driven identification of situated meanings in corpus data using Latent Class Analysis. Supplementary materials
Category: Project
Description: Identifying the meanings of grammatical elements in context is a major challenge for corpus-linguistic studies of grammatical variation. This study proposes a novel solution to this problem. I describe the situated meanings of grammatical elements as latent constructs, i.e., social concepts that cannot be observed directly but must be inferred from the way speakers behave. I use Latent Class Analysis (LCA) to create a data-driven typology of meanings for three modal periphrases in spoken Spanish, and compare this typology to a manual classification of the data in terms of modality. My findings show that (a) the situated meanings identified by the Latent Class Analysis do not directly correspond to the modal meanings that are commonly assumed to govern the variation between the three periphrases, and (b) the data-driven typology of meanings is better at explaining the variation between these periphrases. This work will be published as a research article with the title "Data-driven identification of situated meanings in corpus data using Latent Class Analysis" in Open Linguistics (De Gruyter).