# **Welcome**

Welcome to the Library research session for Dr. Martina Wiltschko’s LING 452 (Acquisition of Syntax) class, Winter Term 1, 2018.

In the second half of this workshop, we’ll learn how to find and analyse first language acquisition data using the [CHILDES database of children's speech][1].

In this section on CHILDES, we'll:

- learn ***about*** the CHILDES database,
- look at ***how to access*** the data, and
- ***get practice*** diving into the datasets.

# **WHAT** is CHILDES?

CHILDES stands for **CHI**ld **L**anguage **D**ata **E**xchange **S**ystem.

The [CHILDES database][2] contains transcripts and media data (audio and video files) collected from conversations with children. It is part of a larger system called **[TalkBank][3]**. TalkBank is a system for sharing and studying conversational interactions; it is the largest open repository of data on spoken language.

![childes homepage][4]

# **WHO** is it?

It's important to look at the WHO of CHILDES so we can properly credit, and build on, the work of the people behind it.

CHILDES was established in 1984 in the Department of Psychology at Carnegie Mellon University by Dr. Brian MacWhinney and Dr. Catherine Snow. Its goal is to make transcripts and recordings of child language acquisition available to researchers as **free, public datasets**.

> [MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk. 3rd Edition. Mahwah, NJ: Lawrence Erlbaum Associates.][5]

"CHILDES now archives tens of thousands of transcripts and associated media across 20+ languages, making it a critical resource for characterizing both children’s early productive language use and their language environment. As the first major effort to consolidate and share transcripts of child language, CHILDES has been a pioneer in the move to curate and disseminate large-scale behavioral datasets publicly."

> [Sanchez, A., Meylan, S., Braginsky, M., MacDonald, K. E., Yurovsky, D., & Frank, M. C. (2018, April 23). childes-db: a flexible and reproducible interface to the Child Language Data Exchange System.][6]

# **WHY** use CHILDES?

A large-scale, multilingual corpus of acquisition data allows us to answer questions about language acquisition, such as:

- Do X-year-old children know Y?
- What is the mean length of utterance of 2-year-old speakers of Korean?
- Comparative studies: at what age do children acquire determiners across languages?

# **The Tools**

CHILDES consists of three separate, but integrated, tools:

1. **CHAT** (Codes for the Human Analysis of Transcripts)
    - transcription and coding format
    - the data is transcribed in CHAT format
    - [see the CHAT manual for more][7]!
2. **CLAN** (Computerized Language ANalysis)
    - data analysis program
    - CLAN is designed specifically to analyze data transcribed in the CHAT format
    - [see the CLAN manual for more][8]!
3. the database

# **HOW** to Access the Data

There are several ways to access the data, with varying levels of complexity and functionality.

1. **Using the Browsable Database**
    - play back transcripts with linked media directly from your browser
    - use the command line interface within the browser to query the data
2. **Downloading Transcripts and Media**
    - to study transcripts more closely, download them rather than playing them through the Browsable Database
3. [**LuCiD toolkit**][9] (Chang, 2017)
    - provides related functionality for a number of common analyses
    - fills gaps not covered by CLAN, e.g., the use of n-gram models, incremental sentence generation, and distributional word classification
4. [**childes-db web apps**][10] (Sanchez et al., 2018)
    - web apps focusing on the same common tasks as CLAN, but turning the outputs into browsable visualizations

For this workshop, we'll focus on [using the CHILDES browsable database][11].

------

## **Digging into the Data**

Let's dive into the data.

[In the next section][12], we'll explore the available transcripts and related media files for a Catalan-speaking child using the CHILDES browsable database. Then we'll learn how to run queries using the command line and CLAN.

- [Finding, Understanding, and Querying the data in the CHILDES Browsable Database][13]

[1]: https://childes.talkbank.org/
[2]: https://childes.talkbank.org/
[3]: https://talkbank.org/
[4]: https://files.osf.io/v1/resources/zqb6c/providers/osfstorage/5b9389822c77f700157e863f?mode=render
[5]: https://talkbank.org/manuals/CHAT.pdf
[6]: http://psyarxiv.com/93mwx
[7]: https://talkbank.org/manuals/CHAT.pdf
[8]: https://talkbank.org/manuals/CLAN.pdf
[9]: http://childes-db.stanford.edu/index.html
[10]: https://childes.talkbank.org/browser/
[11]: https://osf.io/zqb6c/wiki/Hands-On:%20Using%20the%20CHILDES%20Browsable%20database/
[12]: https://osf.io/zqb6c/wiki/Using%20the%20CHILDES%20Database/
[13]: https://osf.io/zqb6c/wiki/Using%20the%20CHILDES%20Database/
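
To make the mean-length-of-utterance question from the WHY section a little more concrete, here is a minimal Python sketch of the idea. It is an illustration only, not how CLAN computes MLU: it counts words rather than morphemes, and it ignores CHAT details such as continuation lines and special annotation codes. The transcript excerpt is invented for this example (it is not CHILDES data), and the function names `utterances` and `mlu_in_words` are hypothetical. The only CHAT conventions it relies on are that header lines start with `@`, main speaker lines start with `*` plus a speaker code, and dependent tiers start with `%`.

```python
import re

# Invented CHAT-style excerpt for illustration only (not taken from CHILDES).
SAMPLE = """\
@Begin
@Participants:\tCHI Target_Child, MOT Mother
*MOT:\twhat is that ?
*CHI:\tdoggie .
*CHI:\tbig doggie run .
%com:\tchild points at the window
*MOT:\tyes , the dog is running .
*CHI:\tmore juice please !
@End
"""


def utterances(chat_text, speaker):
    """Return a list of word lists, one per main-tier utterance by `speaker`."""
    prefix = f"*{speaker}:"
    result = []
    for line in chat_text.splitlines():
        if not line.startswith(prefix):
            continue  # skip @ headers, % dependent tiers, and other speakers
        content = line[len(prefix):]
        # Keep tokens that contain at least one letter; drop bare punctuation.
        words = [tok for tok in content.split() if re.search(r"[^\W\d_]", tok)]
        if words:
            result.append(words)
    return result


def mlu_in_words(chat_text, speaker):
    """Mean number of words per utterance for `speaker` (0.0 if none found)."""
    utts = utterances(chat_text, speaker)
    if not utts:
        return 0.0
    return sum(len(u) for u in utts) / len(utts)


if __name__ == "__main__":
    print(f"{mlu_in_words(SAMPLE, 'CHI'):.2f}")
```

In the invented excerpt, the three child utterances contain 1, 3, and 3 words, so the script prints 2.33. For real analyses, use CLAN on actual CHAT transcripts, as covered in the next section.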