## Santa Barbara Corpus ##

The [Santa Barbara Corpus][1] is a corpus of spoken American English from across the U.S., with recordings of mostly naturally occurring speech and conversations. It was aligned as part of the SPADE project using MFA.

**Number of Speakers:** about 200 \
**Hours of Speech:** about 27 \
**Year Recorded:** late 90s-early 2000s (Part 1 was completed in 2000, Part 4 in 2005) \
**Data Guardian:** available for download online and also licensed by the Linguistic Data Consortium. \
**Speaker Dimensions:** gender, age, home town, ethnicity, social class (information on education and occupation).

Metadata files for the corpus, which contain all metadata for each speaker, can be downloaded [here][2].

### Corpus Reference ###

Du Bois, John W., Wallace L. Chafe, Charles Meyer, Sandra A. Thompson, Robert Englebretson, and Nii Martey. 2000-2005. Santa Barbara corpus of spoken American English, Parts 1-4. Philadelphia: Linguistic Data Consortium.

[1]: http://www.linguistics.ucsb.edu/research/santa-barbara-corpus
[2]: http://www.linguistics.ucsb.edu/sites/secure.lsit.ucsb.edu.ling.d7/files/sitefiles/research/SBC/metadata.zip