## Santa Barbara Corpus ##
The [Santa Barbara Corpus][1] is a corpus of spoken American English from across the U.S., consisting mostly of recordings of naturally occurring speech and conversations. It was force-aligned with the Montreal Forced Aligner (MFA) as part of the SPADE project.

**Number of Speakers:** about 200 \
**Hours of Speech:** about 27 \
**Year Recorded:** late 1990s to early 2000s (Part 1 was completed in 2000, Part 4 in 2005) \
**Data Guardian:** available for download online and also licensed by the Linguistic Data Consortium. \
**Speaker Dimensions:** gender, age, home town, ethnicity, social class (information on education and occupation). The corpus metadata files, which contain all metadata for each speaker, can be downloaded [here][2].
### Corpus Reference ###
Du Bois, John W., Wallace L. Chafe, Charles Meyer, Sandra A.
Thompson, Robert Englebretson, and Nii Martey. 2000-2005. Santa
Barbara corpus of spoken American English, Parts 1-4. Philadelphia:
Linguistic Data Consortium.

[1]: http://www.linguistics.ucsb.edu/research/santa-barbara-corpus
[2]: http://www.linguistics.ucsb.edu/sites/secure.lsit.ucsb.edu.ling.d7/files/sitefiles/research/SBC/metadata.zip