Data were collected from children with a sex chromosome trisomy and/or autism, and their families. Families completed two questionnaires (the Children's Communication Checklist and the Pediatric Symptom Checklist) and took part in a telephone interview using the 3Di-sv autism assessment. Children completed a test battery including five novel language tasks and a nonverbal matrix reasoning task. The test battery could be completed remotely or with a researcher. Where a researcher met with the young person, a conversation sample was collected.
The data are reported in **Wilson & Bishop (2020). Does the autism phenotype differ when selecting groups by neurodevelopmental versus genetic diagnosis? A comparison between autism and sex chromosome trisomy.**
This paper considers the results for the language battery and 3Di across groups. Scripts relevant to this paper are MainAnalysis.R and MakePlots.R.
Alongside the analyses reported in this paper, we developed a new measure to assess the conversation samples collected from young people involved in the project. This new measure is called the Observational Assessment of Conversation Scale (OACS): see OACS.docx for the rating scale and OACS.csv for the associated data. This analysis has not been formally written up for publication.
Data files include:
PARENT SURVEY
**CCC.csv** Children's Communication Checklist (Bishop, 1998), a parent-report questionnaire assessing structural and pragmatic aspects of language, as well as autistic features. The file presents item-level responses to all items; then the number of unscored items for the structural, pragmatic, social and interests subscales; then total scores on each subscale. Total raw scores and prorated total scores are given: where parents left up to 20% of items unscored for a subscale, scores were prorated by the mean score for the rest of the items in that subscale; where more than 20% of items were unscored, that subscale was recorded as invalid. Item-level responses are coded as follows: 0=does not apply; 1=applies somewhat; 2=definitely applies; NA=don't know/can't judge. Note that item-level scores have been reverse coded for positively-worded items.
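The prorating rule can be sketched as follows. This is a minimal illustration, not the script used to produce CCC.csv: the function name is hypothetical, and it assumes that "prorated by the mean score" means each unscored item is replaced by the mean of the answered items in the same subscale. Whether prorated totals were rounded in the released file is not stated here, so no rounding is applied.

```python
def prorate_subscale(responses, max_missing_fraction=0.20):
    """Return a prorated subscale total, or None if the subscale is invalid.

    `responses` holds one subscale's item scores (0, 1 or 2), with None
    standing in for unscored items (NA in the CSV).
    """
    if not responses:
        raise ValueError("a subscale needs at least one item")
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    # More than 20% of items unscored: the subscale is recorded as invalid.
    if n_missing / len(responses) > max_missing_fraction:
        return None
    # Assumption: each unscored item is replaced by the mean answered score.
    mean_answered = sum(answered) / len(answered)
    return sum(answered) + n_missing * mean_answered


# Two of ten items unscored (exactly 20%), so the total is still prorated.
example = prorate_subscale([2, None, None, 1, 0, 2, 1, 1, 2, 1])
```

With no unscored items the function returns the raw total unchanged, so it can be applied to every row.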
**PSC17.csv** Pediatric Symptom Checklist, a parent-report questionnaire of general psychosocial impairment (https://www.massgeneral.org/psychiatry/treatments-and-services/pediatric-symptom-checklist). In addition to total scores, factor analysis supported the extraction of subscores for attention (5 items), internalising (5 items) and externalising difficulties (7 items; Gardner et al., 1999). For each of these subscales, the file presents item-level responses followed by two summary scores: subscale total and positive screen for impairment on that subscale. At the end of the file, total scores for the whole scale, positive screen for impairment on the whole scale, and number of missing items are shown. Where any items are left unscored for a subscale, the total score for that subscale is recorded as invalid for that participant; for the categorical variables (impaired or not impaired), scores are provided if the participant would necessarily have screened positively or negatively however the missing items were answered. Where three or fewer items are left unscored on the whole scale, total scores are calculated ignoring those unscored items; the whole scale is recorded as invalid if a participant leaves more than three items unanswered. Item-level responses are coded as follows: 0=never; 1=sometimes; 2=often; NA=unanswered.
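The missing-data rules for the PSC-17 can be illustrated with a short sketch. The function names are hypothetical, and the screening cutoff is passed in as a parameter because cutoff values are not given in this document; the value 7 used in the example is a placeholder. The sketch assumes a positive screen means a total at or above the cutoff.

```python
def subscale_summary(responses, cutoff, max_item_score=2):
    """Return (total, screen) for one PSC-17 subscale.

    `responses` holds item scores (0, 1 or 2), with None for unanswered items.
    total  -> None if any item is unanswered (subscale total invalid).
    screen -> 1 or 0 when the outcome is the same however the missing items
              were answered, otherwise None.
    """
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    lowest = sum(answered)                          # missing items all scored 0
    highest = lowest + max_item_score * n_missing   # missing items all scored 2
    total = lowest if n_missing == 0 else None
    if lowest >= cutoff:
        screen = 1       # positive whatever the missing answers were
    elif highest < cutoff:
        screen = 0       # negative whatever the missing answers were
    else:
        screen = None    # outcome depends on the missing answers
    return total, screen


def whole_scale_total(responses, max_missing=3):
    """Sum the answered items, or return None if more than three are missing."""
    answered = [r for r in responses if r is not None]
    if len(responses) - len(answered) > max_missing:
        return None
    return sum(answered)


# One missing item, but the answered items already reach the placeholder cutoff.
total, screen = subscale_summary([2, 2, 2, 1, None], cutoff=7)
```

Here `total` is `None` because an item is missing, while `screen` is `1` because the four answered items already sum to 7.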
**participant.information.csv** For each child, the file presents the following, in order: their study ID; gender; age at the time the parent completed the survey; the UK school year group they belong to based on their birthday; whether they completed most/all of the test battery (1=most/all tests completed; 0=one or fewer tests completed); whether they have an autism diagnosis; whether they have a diagnosis of a sex chromosome trisomy (SCT); whether an SCT diagnosis was prenatal (0=postnatal; 1=prenatal; NA=no trisomy diagnosis); the type of trisomy; whether they have been diagnosed with a language or literacy disability (e.g. SLI, DLD, dyslexia, etc.); and their level of SEN (special educational needs) support (0=none; 1=some low-intensity support in mainstream school; 2=high-intensity support in mainstream school; 3=attends special school; NA=home-schooled).
PARENT INTERVIEW
**3di.csv** The developmental, diagnostic and dimensional (3Di) assessment was administered over the phone (Santosh et al., 2009). The file presents scores on the Social Interaction, Communication, and Repetitive and Restrictive Behaviour and Interests (RRBIs) subscales (the Nonverbal Communication score is based on a subset of the Communication items). Autism caseness is shown in the final column; 3Di criteria for autism are a Social Interaction score of 10 or above, plus a Communication score of 8 or above and/or an RRBI score of 3 or above.
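As a rough illustration, the caseness rule can be written as a single boolean expression. The function name is hypothetical, and the sketch assumes the rule reads as: Social Interaction threshold met, together with at least one of the Communication and RRBI thresholds. The caseness column in 3di.csv remains the authoritative value.

```python
def meets_3di_autism_criteria(social_interaction, communication, rrbi):
    """Apply the thresholds described above to one child's 3Di subscale scores."""
    return social_interaction >= 10 and (communication >= 8 or rrbi >= 3)


# Social Interaction threshold met, plus the RRBI threshold only.
case_example = meets_3di_autism_criteria(12, 5, 4)
# High Communication and RRBI scores, but Social Interaction below 10.
non_case_example = meets_3di_autism_criteria(9, 10, 5)
```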
TEST BATTERY
(See "Wilson and Bishop (In publication). A novel online assessment of pragmatic and core language skills: An attempt to tease apart language domains in children. *Journal of Child Language*" for normative data and psychometric analysis of the test battery in almost 400 non-autistic children. Please note that a small number of items were dropped from the tests based on this psychometric analysis, and data for these items are not given in the files uploaded here. In addition, z-scores have been computed for some of the tests, and are included in files with "z-score conversion" in the title.)
**Grammar.csv** Receptive Grammar task, in which participants listen to sentences and decide if they are grammatical. They hear the following instructions: “Some of the sentences will sound good, but some of the sentences will sound bad. There might be a missing word. Or the wrong word might be used. Or the order of the words might be weird. If the sentence is good, click the green tick. If the sentence is bad, click the red cross”. There are 50 items: in 4 sentences the words are in a random order and should be easily rejected, 20 items are taken from McDonald (2006) and showed high accuracy in primary school children, and 26 items are a subset of our adult version of this test (Wilson and Bishop, 2019); these latter items were chosen on the basis of high accuracy and high item-total correlations. Excluding the 4 randomly ordered sentences, 23 items do not follow typical syntax or use incorrect word forms (e.g. incorrect tenses) and 23 follow typical English grammar. Examples of incorrect items include: “The teacher told the story the children” and “I went out after I have eaten dinner”. There was one measured variable: the sum of items correctly answered (out of 50). The file presents item-level accuracy and total scores.
**Vocab.csv** Receptive Vocabulary task, in which participants choose which of four pictures is related to a word. Participants hear a sequence of 39 words and for each word, they are presented with four pictures on the screen. They are asked to “choose which picture goes best with the word”. The words include nouns, verbs and adjectives, and vary in approximate age of acquisition from 5 to 12, with similar numbers of easy and harder words; two experienced teachers independently rated the ages at which they would expect 50% and 90% of children in a typical class to be familiar with the word. There was one measured variable: the sum of items correctly answered (out of 39). The file presents item-level accuracy and total scores.
**Implicature.csv** Implicature Comprehension Test, in which participants watch a series of cartoon videos; in each video, two characters produce a short utterance one after the other. Together the utterances form a conversational adjacency pair; in most cases, this is a question and answer. After this dialogue, participants hear a comprehension question, and they give a yes-no-don’t know response by clicking buttons on the screen. For 33 items, participants need to process implied meaning to answer the question, as the second character provides an indirect response to the first character. An example item is:
Character 1: “Could you hear what the police said?” Character 2: “There were lots of trains going past.” Comprehension Question: “Do you think she heard what the police said?” Correct Answer: “No.”
There are also 10 items where the answer is more explicit; these serve as positive control items. An example item is:
Character 1: “Did you see the policemen earlier on?” Character 2: “I saw them standing on the platform.” Comprehension Question: “Do you think he saw the policemen?” Correct Answer: “Yes.”
From these items, there were two measured variables: sum of implicature items correctly answered (out of 33) and sum of explicit-response control items correctly answered (out of 10). The file presents item-level accuracy and total scores.
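The two totals can be recomputed from the item-level accuracy columns, for example to check them against the totals in the file. The sketch below is illustrative only: the column names are hypothetical (check the header row of Implicature.csv for the real ones), and it assumes each item is coded 1 for correct and 0 for incorrect.

```python
def sum_correct(row, item_columns):
    """Sum item-level accuracy (assumed 1=correct, 0=incorrect) over the columns."""
    return sum(int(row[column]) for column in item_columns)


# Hypothetical column names, shortened to three implicature and two control items.
implicature_columns = ["imp_01", "imp_02", "imp_03"]
control_columns = ["ctrl_01", "ctrl_02"]

# One participant's row as csv.DictReader would return it (values are strings).
row = {"imp_01": "1", "imp_02": "0", "imp_03": "1", "ctrl_01": "1", "ctrl_02": "1"}

implicature_total = sum_correct(row, implicature_columns)
control_total = sum_correct(row, control_columns)
```

The same helper applies to the other test battery files, each of which has a single total summed over all of its items.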
**Inference.csv** Children's Test of Local Textual Inference, in which participants hear two brief sections of a short story (about 90 words per part). After each section, they hear ten questions and four possible answers for each one. “We don’t know” is an answer option for every question, and is the correct answer to four questions. Participants click the correct option on the screen. As well as auditory presentation of all materials, everything is shown in text-based form on the screen. Participants are informed at the start that the short story sections will remain on the screen while they are answering questions about that section. Participants need to make inferences based on the short story to answer the questions. The short story starts as follows: “Unfortunately, the family couldn’t go swimming. The sea was rougher and colder than expected. Instead, Billy spent the whole morning playing a ballgame with his sister, Susie.” An example question is: “What had Billy planned to do?” Participants choose their answer from the following options: “play a ballgame”, “go swimming”, “walk along the sea”, and “we don’t know”. There was one measured variable: the sum of items correctly answered (out of 20). The file presents item-level accuracy and total scores.
**Overtures.csv** Social Overtures task, in which participants hear 23 utterances spoken by a character to a conversational partner. Eleven are social overtures that attempt to engage the partner in a conversation (e.g. “I can’t believe what happened today.”) and twelve are not conversational bids (e.g. “I’m going to have a shower now.”). Participants listen to instructions explaining that “There are different reasons why we say things to other people. Sometimes, we want to start a conversation. We want the other person to ask us questions and say lots of things to us. Other times we just want to tell the other person something very quickly. We don’t always want to start a long conversation.” They are then asked for each sentence whether the speaker wants a conversation or not, and to indicate their answer by clicking yes-no buttons. There was one measured variable: the sum of items correctly identified as a social overture or not (out of 23). The file presents item-level accuracy and total scores.
**Matrices.csv** Animal Matrices nonverbal reasoning task, in which participants are presented with a sequence of 16 2x2 matrices on the computer screen. In three of the boxes of each matrix, there are cartoon pictures of animals, and the fourth box is empty. The animals in the three boxes vary along six dimensions: species, colour, size, number, direction faced, and position in the box. There are systematic relationships between the three animals, and participants need to deduce which of five options fits in the empty box. For example, the top two boxes may show red lions, one big and one small, and the bottom left box may show a big yellow horse; the correct option to fill the empty box would be a small yellow horse. There was one measured variable: the sum of items correctly answered (out of 16). The file includes item-level accuracy and total scores.