**Introduction**
This repository contains a digitised dataset of Yue and Southern Pinghua dialects spoken in Southern China. It also includes three Python scripts that convert Chao's (1930) tone representation to other notations, namely *tone-to-string*, *Onset-Contour-Offset (OCO)* and *modified Onset-Contour-Offset (mOCO)*.
This repository serves as the supplementary material for the conference paper *Sung, Ho Wang Matthew, Prokić, Jelena & Chen, Yiya (2024). "A New Dataset for Tonal and Segmental Dialectometry from the Yue- and Pinghua-Speaking Area". In Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pages 25–36, St. Julian's, Malta. Association for Computational Linguistics.*
For more information on how the data were compiled, please refer to the paper.
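To give a rough idea of what working with Chao's tone numerals looks like, here is a minimal sketch that splits a numeral string such as `35` into a starting pitch, an overall direction and an ending pitch. It is only an illustration: the function name `split_chao_tone` and the contour labels are hypothetical, and this is not the code or the exact OCO/mOCO definition used by the scripts in this repository (see the paper for those).

```python
from typing import Tuple


def split_chao_tone(tone: str) -> Tuple[int, str, int]:
    """Split a Chao tone numeral string (digits 1-5, 1 = lowest pitch)
    into (onset, contour, offset).

    Hypothetical helper for illustration only; the contour labels are
    assumptions, not the notation defined in the paper.
    """
    if not tone or any(ch not in "12345" for ch in tone):
        raise ValueError(f"not a Chao tone numeral string: {tone!r}")

    levels = [int(ch) for ch in tone]
    onset, offset = levels[0], levels[-1]

    # Pitch movement between consecutive levels, ignoring flat stretches.
    steps = [b - a for a, b in zip(levels, levels[1:]) if b != a]

    if not steps:
        contour = "level"
    elif all(s > 0 for s in steps):
        contour = "rising"
    elif all(s < 0 for s in steps):
        contour = "falling"
    elif steps[0] < 0:
        contour = "dipping"   # falls first, then rises
    else:
        contour = "peaking"   # rises first, then falls

    return onset, contour, offset


if __name__ == "__main__":
    for t in ["55", "35", "21", "214", "5"]:
        print(t, split_chao_tone(t))
```

Single-digit tones (short tones written with one numeral) are treated as level tones whose onset and offset coincide.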