<p><strong>What is BabbleCor?</strong></p>
<p>BabbleCor is a crosslinguistic corpus of infant and child vocalizations from 52 children exposed to five different languages: English, Spanish, Tsimane', Yélî Dnye, Tseltal Mayan, and bilingual Quechua-Spanish.</p>
<p><strong>How was BabbleCor created?</strong></p>
<p>BabbleCor consists of very short audio clips (approximately 400 ms) of child vocalizations. To generate these clips, each child first completed a daylong audio recording, between 6 and 16 hours in length, during which the child wore a small, lightweight recorder inside a clothing pocket designed for the device.</p>
<p>From these daylong recordings, child vocalizations were identified either by the proprietary Language ENvironment Analysis (LENA) algorithm, which assigns utterances to speakers in naturalistic audio recordings (e.g. Female Adult, Child), or by hand. One hundred of the utterances identified as child vocalizations were then randomly selected and chopped into the smaller clips in BabbleCor.</p>
<p><strong>Where do the BabbleCor clip annotations come from?</strong></p>
<p>Each short clip (~400 ms) was categorized according to a 5-way scheme by citizen science annotators on the iHEARu PLAY platform (<a href="https://www.ihearu-play.eu/" rel="nofollow">https://www.ihearu-play.eu/</a>). Annotators classified clips as 1) canonical (containing a consonant-to-vowel transition), 2) non-canonical (not containing a consonant-to-vowel transition), 3) crying, 4) laughing, or 5) junk.</p>
<p>For further details on corpus creation, please see the Methods described in <a href="https://psyarxiv.com/9vzs5/" rel="nofollow">Cychosz et al. (submitted)</a>, available on PsyArXiv.</p>
<p><strong>What are the metadata?</strong></p>
<p>There are two metadata components in BabbleCor: <em><a href="https://osf.io/2n456/" rel="nofollow">Annotation_Tags</a></em> and <em><a href="https://osf.io/rau7f/" rel="nofollow">Public_Metadata</a></em>. As the name suggests, <em>Public_Metadata</em> includes corpus metadata that is publicly available to all corpus users: child ID, child age, child's assigned gender, corpus of origin, and clip ID. <em>Annotation_Tags</em> contains the annotation tag for each clip ID, such as canonical babble or laughing. For access to the annotation tags, please sign, scan, and email the data sharing agreement to babblecorpus@gmail.com (see <em><a href="https://osf.io/64puz/" rel="nofollow">Data_Sharing_Agreement</a></em>).</p>
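<p>Because both metadata components are keyed by clip ID, they can be combined once access to the annotation tags has been granted. The following is a minimal sketch of such a join using only the Python standard library. The column names (<code>clip_id</code>, <code>child_id</code>, <code>age_months</code>, <code>gender</code>, <code>corpus</code>, <code>tag</code>) and the inline rows are hypothetical placeholders, not the corpus's actual headers or data, and the sketch assumes one tag per clip.</p>

```python
import csv
import io
from collections import Counter

# Toy stand-ins for the two metadata files. The column names and rows are
# placeholders for illustration only, not the corpus's real headers or data.
PUBLIC_METADATA_CSV = """clip_id,child_id,age_months,gender,corpus
clip_001,child_A,12,F,corpus_X
clip_002,child_A,12,F,corpus_X
clip_003,child_B,20,M,corpus_Y
"""

ANNOTATION_TAGS_CSV = """clip_id,tag
clip_001,canonical
clip_002,non-canonical
clip_003,laughing
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_clip_id(metadata_rows, tag_rows):
    """Attach each clip's tag to its metadata row (None if no tag is found)."""
    tags_by_clip = {row["clip_id"]: row["tag"] for row in tag_rows}
    joined = []
    for row in metadata_rows:
        merged = dict(row)
        merged["tag"] = tags_by_clip.get(row["clip_id"])
        joined.append(merged)
    return joined


def tag_counts_by_child(joined_rows):
    """Count how often each tag occurs for each child."""
    counts = {}
    for row in joined_rows:
        counts.setdefault(row["child_id"], Counter())[row["tag"]] += 1
    return counts


joined = join_on_clip_id(read_rows(PUBLIC_METADATA_CSV),
                         read_rows(ANNOTATION_TAGS_CSV))
counts = tag_counts_by_child(joined)
```

<p>With real files, the inline strings would be replaced by the contents of the downloaded metadata files, and the column names adjusted to match their actual headers.</p>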