Welcome to the data repository of the CopCo corpus! CopCo is a new eye movement corpus tailored to both psycholinguistics and natural language processing. The goal is to investigate reading behavior in Danish across various populations. To this end, we record the eye movements of participants reading continuous Danish texts at their own pace.

This is an active project: we are continuously adding more data and improving the feature extraction. Therefore, please make sure to check here for the latest version of the data, and feel free to contact us if you have any questions or feedback. The CopCo corpus is free for everyone to use.

# Project structure

- 'DatasetStatistics' contains one file with information about the included text materials and one file with the anonymized participant details.
- 'RawData' contains the EDF and result files saved from the EyeLink recording.
- 'FixationReports' contains the fixation and saccade events generated by the SR DataViewer software.
- 'InterestAreaReports' contains the character-level fixation information generated by the SR DataViewer software.
- 'ExtractedFeatures' contains one CSV file per subject with the computed word-level reading metrics. The descriptions of the extracted features can be found [here][1]. Note that these extracted feature files are the only ones that **do not** contain the eye-tracking data and the areas of interest for the comprehension questions.
- The linked GitHub repository contains the code used for preprocessing and feature extraction.

# Participants

The CopCo corpus contains eye movement data from Danish native speakers, from both typical readers and readers with dyslexia. Participant P00 was a pilot participant and should not be used for data analysis. P01 - P22 are typical readers, and P23 - P41 are dyslexic participants. The folder 'DatasetStatistics/' contains further details.

[1]:
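
The participant grouping described above can be applied programmatically before any analysis. The following is a minimal sketch, not part of the official CopCo code: the function names `reader_group` and `split_by_group` are hypothetical, and it assumes participant IDs are written exactly as `P` followed by two digits. Because the project is still adding data, IDs outside the documented range are treated as unknown rather than guessed.

```python
import re
from typing import Optional

# Group boundaries as stated in the Participants section.
TYPICAL_IDS = range(1, 23)    # P01 - P22
DYSLEXIC_IDS = range(23, 42)  # P23 - P41


def reader_group(participant_id: str) -> Optional[str]:
    """Return 'typical', 'dyslexic', or None for the pilot or unknown IDs."""
    # Assumption: IDs look like 'P07'; anything else is treated as unknown.
    match = re.fullmatch(r"P(\d{2})", participant_id)
    if match is None:
        return None
    number = int(match.group(1))
    if number in TYPICAL_IDS:
        return "typical"
    if number in DYSLEXIC_IDS:
        return "dyslexic"
    return None  # P00 (pilot) and IDs outside the documented range


def split_by_group(participant_ids):
    """Sort participant IDs into the two reader groups, dropping the pilot."""
    groups = {"typical": [], "dyslexic": []}
    for pid in participant_ids:
        group = reader_group(pid)
        if group is not None:
            groups[group].append(pid)
    return groups
```

For example, `split_by_group(["P00", "P05", "P30"])` keeps P05 as a typical reader and P30 as a dyslexic reader, and leaves out the pilot participant P00. For the authoritative participant details, refer to the files in 'DatasetStatistics/'.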