This OSF folder contains the data and analyses from an artificial language learning experiment comparing learning in a production-focused condition to learning in a comprehension-focused condition. More information about this project, including links to the published article, can be found in the parent folder: https://osf.io/p28uh/
Below are explanations of the folder organization, the workflow for analyses, and the data columns (codebook). If you have any questions at all, please do not hesitate to contact me at hopman@wisc.edu.
The raw data logs are all contained in the 'log all' folder, and are duplicated and sorted into the 'log usable' and 'log unusable' folders. The file 'logsmerger.py' was then run to append data from the logs by task. 'VTlogsall.csv' has prescreen threshold data for all participants and is analyzed in 'PrescreenThresholdAnalysis.R'. The other three logs only contain data from usable participants (based on prescreen threshold) and are analyzed in 'Analysis.Rmd', the results of which are shown in 'Analysis.html'.
Notes:
- for subject 148 the computer crashed before the end of the learning phase; during testing of subjects 187 and 189 the fire alarm went off before the end of the learning phase. Thus, none of these three subjects took the threshold test or any other test, and they were unusable.
- we accidentally entered subjectnr 196 twice. For the second person run under subject number 196, data was saved under 196-2. 'logsmerger.py' changes this subjectnr to 222 (the first available even subject number) in the combined csv log files (Tlogs.csv etc.) so that R knows it is a different participant from 196.
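The renumbering in the last note can be sketched as below. This is only an illustration and is not copied from 'logsmerger.py': the function name `normalize_subjectnr` and the assumption that subject IDs are read from the logs as strings are hypothetical.

```python
# Hypothetical sketch of the renumbering step; not the actual logsmerger.py code.
# Assumes subject IDs arrive as strings, e.g. "196" or "196-2".
DUPLICATE_IDS = {"196-2": "222"}  # second person run as 196 is stored as 222


def normalize_subjectnr(raw_id: str) -> str:
    """Return the subject number to use in the combined csv files."""
    cleaned = raw_id.strip()
    return DUPLICATE_IDS.get(cleaned, cleaned)
```

All other subject numbers pass through unchanged, so the first participant run as 196 keeps that number.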
About the log files generated by psychopy:
Logs start with tables for each of the elements of the language (e.g. color) to record what the mapping for this specific participant was between the visual (left column) and the auditory (right column) stimuli. The final one, suffixes, codes for which pair of suffixes (us/usu or ok/oko) was used for the nice and the scary monsters. See the folder with stimuli for explanations and records of stimulus names/numbering.
About the columns in the actual data:
during training (Tlogs):
- conditionnr: 1 = comprehension, 2 = production
- trialtype: P = Passive, AC = Active Comprehension (match/mismatch), AP = Active Production
- itemtype: msg = monster singular, mpl = monster plural, c = color, cm = colored monster, p = pattern, NP = Noun Phrase (colored, patterned monster), l = landscape, v = verb, full = full sentence
- correctanswer?: 1 = correct, 0 = incorrect, - = not applicable to this trial (during training only the responses during active comprehension trials were scored as correct/incorrect by psychopy)
- key: key the participant pressed during the trial (only applicable to AC match/mismatch trials), f = f key to indicate mismatch, l = l key to indicate match; f and l keys had stickers on them to indicate mismatch (isnot) and match (=)
- match/mismatch: mismatch = picture and sound do not match, yesmatch = picture and sound do match; again only applicable to AC trials.
- RT: time from end of the soundfile till buttonpress (in AC match/mismatch trials only; these trials were not speeded, participants were never prompted to do this as fast as possible)
- sound: location and name of the sound file played on this trial
- pic: location and name of the (correct) picture for this trial (in case of v and full items it was an animation and this is the location of the first frame of that animation)
- foilpic: location and name of the picture shown initially during an AC match/mismatch trial. During match trials, foilpic = pic.
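As a rough illustration of how the training columns fit together, the sketch below computes accuracy over the scored (active comprehension) trials. The rows are made up, only two columns are shown, and values are assumed to be strings, as they would be when read with Python's csv module; please check the exact column headers in 'Tlogs.csv' before reusing this.

```python
# Hypothetical rows in the shape of Tlogs.csv (made-up data, two columns only).
rows = [
    {"trialtype": "P", "correctanswer?": "-"},
    {"trialtype": "AC", "correctanswer?": "1"},
    {"trialtype": "AC", "correctanswer?": "0"},
    {"trialtype": "AC", "correctanswer?": "1"},
    {"trialtype": "AP", "correctanswer?": "-"},
]


def scored_accuracy(rows):
    """Proportion correct over trials scored 0/1; None if no trial was scored."""
    scored = [r["correctanswer?"] for r in rows if r["correctanswer?"] in ("0", "1")]
    if not scored:
        return None
    return sum(value == "1" for value in scored) / len(scored)
```

Trials marked '-' are skipped rather than counted as incorrect, since they were never scored by psychopy.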
during the threshold pretest (VTlogs - VT = vocabulary test), only new/different columns are explained
- trialtype: VT = vocabulary test
- itemtype: m = monster, c = color, p = pattern, l = landscape, v = verb
- key: x = x key to indicate that the word matches the picture on the left of the screen, m = m key to indicate that the word matches the picture on the right of the screen.
- locationcorrectanswer: right = the correct picture that matched the word was shown on the right of the screen, left = the correct picture that matched the word was shown on the left of the screen
- RT: time from the start of the trial until buttonpress (buttonpress could occur before the end of the auditory file and ended the trial)
- totsounddur: total sound duration, the duration of the word that was played on this trial
- pic: file location and name of the correct picture (or first slide of animation) that matched with the word that was being played auditorily
- foilpic: file location and name of the incorrect picture (or first slide of animation) that did not match with the word that was being played auditorily
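The relation between key and locationcorrectanswer can be used to cross-check the scoring of a threshold trial. The sketch below is a minimal version of that check; the function name is hypothetical and the values are assumed to be stored exactly as written above ('x', 'm', 'left', 'right').

```python
# x = word matches the picture on the left, m = word matches the picture on the right
KEY_TO_SIDE = {"x": "left", "m": "right"}


def vt_response_correct(key: str, locationcorrectanswer: str) -> bool:
    """True if the pressed key points at the side where the correct picture was shown."""
    return KEY_TO_SIDE.get(key) == locationcorrectanswer
```

Any key other than x or m is treated as incorrect here, which is an assumption of this sketch rather than something the logs guarantee.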
during the Forced Choice task (FClogs)
- trialtype: FC = Forced Choice
- itemtype: FCvocabmons = Forced Choice vocabulary in phrases monster, FCvocabcolr = Forced Choice vocabulary in phrases color, FCvocabpatt = Forced Choice vocabulary in phrases pattern, FCvocabvvid = Forced Choice vocabulary in phrases verb video, FCvocablvid = Forced Choice vocabulary in phrases landscape video (used for vocabulary in phrases test, with itemtype denoting which part of the picture was different between the correct and the foil picture and thus which word was needed to pick the correct answer); FCsemanitem = Forced Choice semantic item, FCnumbritem = Forced Choice number item (used for the suffix understanding test, with the foil picture for semantic trials containing a monster of the other (semantic) nice/scary category and the foil picture for number trials containing the same monster(s) but a different number (sg/pl) than the target picture); FCprobitems = Forced Choice probable items, FCimpprbitem = Forced Choice improbable items (used for the probabilistic co-occurrence test, not in CUNY abstract, in supplementary online materials of submitted paper)
- critword: the place in the auditory phrase of the first disambiguating word that could help a participant choose the correct picture; 1 = determiner, 2 = color, 3 = monster, 4 = pattern, 5 = preposition ot, 6 = verb, 7 = landscape; for vocabulary in phrases test items (FCvocab...) the target word was the one specified in itemtype; for suffix understanding test items (FCseman and FCnumbr) the target word was the determiner since that codes for both number and monster type; for probabilistic co-occurrence test items (FCprob and FCimpr) the target word was the pattern since that was the first reliable word to disambiguate between the two pictures.
- wordduringwhichtheypressed: the place in the auditory phrase of the word during which the participant pressed the button to end the trial, with 1-7 same as before and 8 = participant pressed after the end of the auditory phrase
- totaltime: total duration of the auditory phrase
- timew1 through timew7: the duration of words 1-7 in the auditory phrase (phrases were created by concatenating recordings of individual words, these durations are the durations of those source files with single words); since not all phrases in this task consisted of a full sentence, if a time is 0 this means that word type was not in the phrase.
- cumtimew1 through cumtimew7: cumulative time of the auditory phrase including this word, e.g. cumtimew3 = timew1 + timew2 + timew3.
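The timing columns relate to each other as sketched below. This is a reconstruction for illustration, not the code psychopy ran: the function names are hypothetical, the press time is assumed to be measured from the onset of the auditory phrase, and a press exactly on a word boundary is assigned to the following word.

```python
from itertools import accumulate


def cumulative_times(word_durations):
    """cumtimew1..cumtimew7 from timew1..timew7 (absent words have duration 0)."""
    return list(accumulate(word_durations))


def word_at_time(word_durations, press_time):
    """Position (1-7) of the word playing at press_time, or 8 if after the phrase."""
    elapsed = 0.0
    for position, duration in enumerate(word_durations, start=1):
        elapsed += duration
        # Words with duration 0 were not in the phrase, so they can never be "playing".
        if duration > 0 and press_time < elapsed:
            return position
    return 8


# Made-up durations for a three-word phrase (words 4-7 absent).
durations = [0.5, 0.25, 0.75, 0, 0, 0, 0]
cumulative = cumulative_times(durations)
pressed_during = word_at_time(durations, 0.6)
```

Comparing the result of `word_at_time` with critword shows whether a participant responded before, during, or after the first disambiguating word.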
during the Error Monitoring task:
- itemtype: EM = Error Monitoring
- trialtype: EMsw1 = Error Monitoring switch 1 error with new word order 1324567 (switch monster and color), EMsw2 = Error Monitoring switch 2 error with new word order 1273456 (landscape between color and monster), EMsw3 = Error Monitoring switch 3 error with new word order 1234657 (switch preposition ot and verb), EMsw4 = Error Monitoring switch 4 error with new word order 1345627 (color in between verb and landscape) (used for the Word Order Error Test - multiple erroneous word orders were created to provide variety in errors so that participants cannot strategize and listen to only part of the sentence); EMnadj = Error Monitoring number adjacent error with the suffix on the adjacent monster word mismatching the other suffixes in the sentence in its number (ok and oko swapped, us and usu swapped), EMnnadj = Error Monitoring number nonadjacent error with the suffix on the nonadjacent verb mismatching the other suffixes in the sentence in its number, EMsadj = Error Monitoring semantic adjacent error with the suffix on the adjacent monster word mismatching the other suffixes in the sentence in its semantic nice/scary monster category (us and ok swapped, usu and oko swapped), EMsnadj = Error Monitoring semantic nonadjacent error with the suffix on the nonadjacent verb mismatching the other suffixes in its semantic nice/scary monster category (used for the suffix agreement error test); EMprob = Error Monitoring probable correct sentence with a probable monster/pattern combination, EMimpr = Error Monitoring improbable correct sentence with an improbable monster/pattern combination (used for the probabilistic co-occurrence test, not reported in CUNY abstract, reported in supplementary online materials in submitted abstract)
- key: l = participant pressed l key to indicate grammatically correct sentence, f = participant pressed f key to indicate grammatically incorrect sentence; keys on keyboard were marked with isnot (neq) and is (=) stickers.
- wassoundcorrect?: 1 = the auditory sentence was grammatical, 0 = the auditory sentence was ungrammatical.
- critword: place of the first word in the sentence that was incorrect (either because it was in the wrong place in the sentence or because its suffix did not match the other suffixes in the sentence), 8 = correct sentence, since it is only possible to know a sentence is fully correct after having heard the entire sentence. Target words are 2 for EMsw1 and EMsw4 (monster word should not be in 2nd place), 3 for EMsw2 (landscape word should not be in 3rd place), 5 for EMsw3 (verb should not be in 5th place), 3 (monster) for EMnadj and EMsadj, 6 (verb) for EMnnadj and EMsnadj, and 8 for EMprob and EMimpr, which are fully grammatical.
- sound: normal for correct trials (EMprob and EMimpr). Error trials are generated for each participant separately by concatenating the relevant sound files when the psychopy program is started. The actual sound files for error trials are thus not included in this repository, but from the file name it is possible to deduce which single-word sound files were concatenated to create each error trial. For agreement error trials EMnadj, EMnnadj, EMsadj and EMsnadj the filenames work in exactly the same way as the normal sound files (see explanation document in stimuli folder). For word order errors EMsw1, EMsw2, EMsw3, EMsw4 the file name lists the switch type and then the elements in the order in which they occurred in the error trial. So, for example, an EMsw2 trial might have file name '2switch11111112122114212221.wav'. '2switch' indicates that a switch of type two happened, with order 1273456. Then everything up to and including the color word (2) is normal, so '111111121221', then the '1' indicates the landscape word, and then the rest after that, '4212221', indicates the monster word, pattern, preposition and verb in the same way it normally does (this description only makes sense if you first understand how normal soundfiles are named; see file in stimulus folder for that explanation).
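The critword values for the word order errors follow directly from the switched orders: critword is the first position at which the heard order deviates from the normal order 1234567. The sketch below shows this, together with reading the switch type from an error-trial file name. It is an illustration only; the function names are hypothetical, and it deliberately does not try to decode the digits after 'switch', since that requires the naming explanation in the stimuli folder.

```python
from pathlib import PureWindowsPath

CANONICAL_ORDER = "1234567"
SWITCH_ORDERS = {
    "EMsw1": "1324567",
    "EMsw2": "1273456",
    "EMsw3": "1234657",
    "EMsw4": "1345627",
}


def first_misplaced_position(order, canonical=CANONICAL_ORDER):
    """First position where the heard order differs from the normal one; 8 if none."""
    for position, (heard, expected) in enumerate(zip(order, canonical), start=1):
        if heard != expected:
            return position
    return 8


def switch_type_from_filename(sound):
    """Map e.g. '2switch....wav' to 'EMsw2'; raises ValueError for other names."""
    # PureWindowsPath splits on both '/' and '\', so any folder prefix is dropped.
    name = PureWindowsPath(sound).name
    prefix, separator, _ = name.partition("switch")
    if not separator or not prefix.isdigit():
        raise ValueError(f"not a word order error file name: {sound!r}")
    return f"EMsw{prefix}"


critwords = {name: first_misplaced_position(order) for name, order in SWITCH_ORDERS.items()}
```

The resulting positions (2, 3, 5 and 2 for EMsw1 through EMsw4) agree with the target words listed under critword.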
Logs end with the number of trials participants got correct in each of the three tests. The first score is the score on the prescreen threshold test (logged as 'total nr of correct vocabularytrials'), which was used to determine whether a participant's data was usable (15 or above) or unusable (14 or below).
Below that, for comprehension participants, the number of trials they got correct in each block of active comprehension trials is listed with the itemtype of that block. The bottom line of each file prints the total nr of trials per active comprehension block, since not all blocks had an equal number of trials.
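The usability rule for the prescreen threshold score can be written down as a small check. This sketch only restates the cutoff given above; the names and the example scores are hypothetical, and the actual sorting into the 'log usable' and 'log unusable' folders was not necessarily done this way.

```python
USABLE_MINIMUM = 15  # 15 or above = usable, 14 or below = unusable


def is_usable(threshold_score: int) -> bool:
    """True if the prescreen threshold score meets the usability cutoff."""
    return threshold_score >= USABLE_MINIMUM


# Made-up scores keyed by made-up subject numbers.
example_scores = {"901": 18, "902": 14, "903": 15}
usable_subjects = sorted(s for s, score in example_scores.items() if is_usable(score))
```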