This project lists code and models for '*How do the kids speak? Improving educational use of text mining with child-directed language models*'. In the paper, we found that text mining models trained on child-directed text (from YouTube, television shows, Simple English Wikipedia, and children's books):

1. performed better on automated originality scoring of children's creativity, and
2. exhibited lower gender and racial biases.

The models, in addition to being presented here, are also hosted on a server at:

https://openscoring.du.edu/data/all_weighted_10-12_100k.kv

https://openscoring.du.edu/data/all_weighted_10-12_100k.kv.vectors.npy

They are in the Gensim KeyedVectors format and can be converted to other formats with that library. An example of online use of the models is in: https://github.com/massivetexts/motes-corpus/blob/master/analysis/MOTESCorpusBiasAnalysisAndComparison.ipynb.

> Organisciak, P., Newman, M., Eby, D., Acar, S. and Dumas, D. (2023), "How do the kids speak? Improving educational use of text mining with child-directed language models", Information and Learning Sciences, https://doi.org/10.1108/ILS-06-2022-0082

If you have questions, email me at peter.organisciak@du.edu, and I'll try to add documentation as questions come up.
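
As a starting point for working with the hosted files, here is a minimal sketch of downloading and loading them with Gensim. It is not taken from the paper's code: the helper names (`local_path`, `download_models`, `load_vectors`), the `models` directory, the query word, and the output file name are all illustrative assumptions. It also assumes an installed Gensim version that is compatible with the one used to save the files. Both files need to sit in the same directory under their original names, because `KeyedVectors.load` looks for the `.vectors.npy` file next to the `.kv` file.

```python
import os
import urllib.request

# URLs as listed on this page.
MODEL_URLS = [
    "https://openscoring.du.edu/data/all_weighted_10-12_100k.kv",
    "https://openscoring.du.edu/data/all_weighted_10-12_100k.kv.vectors.npy",
]


def local_path(url, dest_dir="models"):
    """Map a URL to a local path, keeping the original file name."""
    return os.path.join(dest_dir, url.rsplit("/", 1)[-1])


def download_models(urls=MODEL_URLS, dest_dir="models"):
    """Download each file unless it already exists locally."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for url in urls:
        path = local_path(url, dest_dir)
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths


def load_vectors(dest_dir="models"):
    """Load the KeyedVectors file (requires `pip install gensim`)."""
    # Imported lazily so the download helpers work without Gensim.
    from gensim.models import KeyedVectors

    return KeyedVectors.load(local_path(MODEL_URLS[0], dest_dir), mmap="r")


if __name__ == "__main__":
    download_models()
    kv = load_vectors()

    # "dog" is only an example query; check the vocabulary first.
    if "dog" in kv:
        print(kv.most_similar("dog", topn=5))

    # One possible conversion: plain-text word2vec format
    # (hypothetical output file name).
    kv.save_word2vec_format("all_weighted_10-12_100k.txt", binary=False)
```

The `mmap="r"` argument memory-maps the vector array read-only instead of copying it into memory, which can help with large models; dropping it loads everything into RAM.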