ManDi: Mandarin Chinese Dialect Corpus

The ManDi Corpus is a spoken corpus of regional Mandarin dialects and Standard Mandarin. It currently contains 357 recordings from 36 speakers of six regional Mandarin dialects: Beijing, Chengdu, Jinan, Taiyuan, Wuhan, and Xi’an Mandarin. The speakers recorded monosyllabic words, disyllabic words, short sentences, the short passage *North Wind and the Sun*, and the modern Chinese poem *Wo Chun*, in both Standard Mandarin and their own regional dialect. The recordings were collected remotely using participant-controlled smartphone recording apps. Word- and phone-level alignments were generated using Praat and the Montreal Forced Aligner. The paper "*The ManDi Corpus: A Spoken Corpus of Mandarin Regional Dialects*" has been submitted to the LREC 2022 conference and is under review; a preprint is attached to this project.
