This repository contains the R code and data for the manuscript titled "Modeling nonlinearity and systematic variability in second language complexity: Task iteration in focus".
1. **data.all.csv** includes the data used in the study. It consists of four columns:
   - `writingid`: Unique ID assigned to each essay
   - `wave`: Wave of data collection (i.e., 1 for the first essay, 2 for the second essay, and so forth)
   - `id`: Unique ID assigned to each student
   - `writing_sample`: Full text of the essay
2. **R_Code.html** includes the R code used for the extraction and analysis of data.
3. **SpellingVariationToCorrect.csv** and **HyphenatedWordsToCorrect.csv** list word pairs with orthographic variation, used for calculating lexical sophistication.
4. **material.docx** includes the writing tasks administered to participants in the study.
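
As a quick orientation to the data file, the sketch below checks that a data frame has the four columns listed for **data.all.csv** and counts essays, students, and essays per wave. It is a minimal illustration, not the analysis code in **R_Code.html**: the helper `summarise_essays()` and the toy rows are hypothetical, and reading the file with `read.csv()` under default settings is an assumption about its format.

```r
# Column names as documented for data.all.csv.
expected_cols <- c("writingid", "wave", "id", "writing_sample")

# Hypothetical helper (not part of the repository): validates the columns
# and returns a few basic counts.
summarise_essays <- function(essays) {
  missing_cols <- setdiff(expected_cols, names(essays))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  list(
    n_essays = nrow(essays),
    n_students = length(unique(essays$id)),
    essays_per_wave = table(essays$wave)
  )
}

# Toy stand-in rows, invented for illustration only (not taken from the study).
toy <- data.frame(
  writingid = c("w1", "w2", "w3"),
  wave = c(1, 2, 1),
  id = c("s1", "s1", "s2"),
  writing_sample = c("First essay text.", "Second essay text.", "Another essay."),
  stringsAsFactors = FALSE
)
toy_summary <- summarise_essays(toy)

# With the real file in the working directory, the same check applies.
# Assumption: the file reads cleanly with read.csv() defaults.
if (file.exists("data.all.csv")) {
  essays <- read.csv("data.all.csv", stringsAsFactors = FALSE)
  print(summarise_essays(essays))
}
```

Running the check before the full analysis surfaces a renamed or missing column early instead of partway through the pipeline.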