## English Dialects App ##
This is a subset of recordings made by mobile phone users while they were using the English Dialects App. During their interaction with the App, users were asked to record themselves reading a version of 'The Boy Who Cried Wolf', which consists of 10 sentences. The full corpus consists of recordings of many thousands of speakers from all over the world.
For SPADE, we were given access to all of the recordings; however, aligning the full set would have been a large task, and not all speakers were needed. Separately, Ben Gittelson had MAUS-aligned 5 of the 10 sentences from a large proportion of the British English recordings and kindly shared these TextGrids with the SPADE project. The reading text can be found in the 'Files' section for this dataset.
We used the Unisyn accents as the basis for splitting up the English Dialects App recordings into broad dialect areas. Derived measures datasets are posted for each of these areas: dapp-Ireland, dapp-Scotland, dapp-Scotland-NE, dapp-Wales, dapp-England-Leeds, dapp-England-RP.
**Number of Speakers (in this subset):** Approximately 3100 (approx. 2100 female) \
**Hours of Speech:** about 26 \
**Year Recorded:** 2016-2017 \
**Data Guardian:** Adrian Leemann \
**Speaker Dimensions:** id, phrase, recording_id, town name, country, precise ethnicity, ethnicity, number of times moved in the past 10 years, distance travelling to work or school, highest qualification, completed/will complete full time education at (age), gender, age, whether participant has submitted this information before, whether parents attend university, county name 1, county name 2, UK country, dialect area, unisyn dialect
### Corpus Reference ###
Leemann, A., Kolly, M-J., & Britain, D. (2018). The English Dialects App: the creation of a crowdsourced dialect corpus. Ampersand 5, 1-17.