“Cleaning” descriptive metadata is a frequent task in digital library work, often enabled by scripting or OpenRefine. But what about when the issue at hand isn’t an odd schema, trailing whitespace, or inconsistent capitalization, but pervasive racial or gender bias in the descriptive language? Currently, the work of remediating the latter tends to be highly manual and reliant on individual judgment and prioritization, despite the systemic nature of the problem. This talk will explore what using programming to identify and address such biases might look like, and argue that seriously considering such an approach is essential to publishing digital collections equitably at scale. I’ll discuss precedents and challenges for such work, and share two small experiments to this end in Python: one aided by Wikidata to replace LCSH terms for Indigenous people in the U.S. with more currently preferred terminology, and another using natural language processing to identify records where women are named as Mrs. [Husband’s First Name] [Husband’s Last Name].
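
As a rough sketch of what the first experiment might look like (this is illustrative, not the talk’s actual code), the snippet below queries Wikidata’s public SPARQL endpoint for the item linked to a given Library of Congress authority ID (Wikidata property P244) and returns that item’s current English label, which a remediation script could surface alongside the existing LCSH heading for review. The function name and the example ID are hypothetical.

```python
"""Sketch: look up the current Wikidata label for an LCSH authority ID."""
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

def wikidata_label_for_lcsh(lcsh_id: str) -> str | None:
    """Return the English label of the Wikidata item whose
    Library of Congress authority ID (P244) matches lcsh_id."""
    query = f"""
    SELECT ?itemLabel WHERE {{
      ?item wdt:P244 "{lcsh_id}" .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }}
    """
    resp = requests.get(
        SPARQL_ENDPOINT,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "metadata-remediation-sketch/0.1"},
        timeout=30,
    )
    resp.raise_for_status()
    bindings = resp.json()["results"]["bindings"]
    return bindings[0]["itemLabel"]["value"] if bindings else None

# Illustrative call; real authority IDs can be found at id.loc.gov.
print(wikidata_label_for_lcsh("sh85065184"))
```

Because Wikidata labels can be edited by anyone, a cautious workflow would treat the returned label as a candidate for human review rather than an automatic replacement.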
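
The second experiment can be approximated, in simplified form, with a pattern match plus a stand-in name list; the talk’s actual approach uses natural language processing, and a fuller version would draw on named-entity recognition and a real name–gender resource rather than the tiny hard-coded set below. All names here are hypothetical.

```python
"""Sketch: flag metadata strings that name a woman as
'Mrs. [Husband's First Name] [Husband's Last Name]'."""
import re

# Deliberately tiny stand-in for a proper name-gender lookup.
MASCULINE_GIVEN_NAMES = {"Charles", "George", "James", "John", "William"}

PATTERN = re.compile(r"\bMrs\.?\s+([A-Z][a-z]+)\s+([A-Z][a-z]+)\b")

def flag_husband_name_forms(text: str) -> list[str]:
    """Return 'Mrs. <given> <surname>' spans where <given> looks
    like a masculine given name, i.e. likely the husband's name."""
    return [
        m.group(0)
        for m in PATTERN.finditer(text)
        if m.group(1) in MASCULINE_GIVEN_NAMES
    ]

records = [
    "Portrait of Mrs. John Smith, ca. 1910",   # flagged for review
    "Letter from Mrs. Mary Smith to her son",  # not flagged
]
for rec in records:
    print(rec, "->", flag_husband_name_forms(rec))
```

Matches are flags for review, not automatic rewrites: deciding how, and whether, to rename a person in a record is a judgment call the script only helps surface at scale.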