# Quantifying Bias in Hierarchical Category Systems
Code to analyze Western and gender bias at the category and item (book) levels in both the Library of Congress Classification (LCC) and the Dewey Decimal Classification (DDC). An analysis of bias towards domestic mammals versus wild mammals in WordNet is also included.
The analyses are presented in Warburton, K., Kemp, C., Xu, Y., & Frermann, L. (2024). Quantifying Bias in Hierarchical Category Systems. *Open Mind: Discoveries in Cognitive Science. Advance publication.* https://doi.org/10.1162/opmi_a_00121
## Required Python Libraries
### For main analyses:
- numpy
- scipy
- matplotlib
- tabulate
- statistics (Python standard library)
- pickle (Python standard library)
### Additional Libraries
#### For parsing MARC Records:
- pymarc
#### For parsing author gender data:
- regex
- pyarrow
- polars
- pandas
#### For scraping LibraryThing, Wikipedia, and BabelNet:
- requests
- BeautifulSoup (installed as `beautifulsoup4`, imported as `bs4`)
#### For WordNet analysis:
- nltk
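
To check that the libraries listed in this section are available before running the analyses, something like the following sketch may help. It is not part of the repository: the `THIRD_PARTY` list and the `missing_modules` helper are hypothetical names, and the import names are assumed from the lists above (BeautifulSoup is imported as `bs4`).

```python
from importlib.util import find_spec

# Import names assumed from the library lists above (hypothetical list,
# not taken from the repository). BeautifulSoup is imported as "bs4".
THIRD_PARTY = [
    "numpy", "scipy", "matplotlib", "tabulate",   # main analyses
    "pymarc",                                     # MARC records
    "regex", "pyarrow", "polars", "pandas",       # author gender data
    "requests", "bs4",                            # scraping
    "nltk",                                       # WordNet analysis
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(THIRD_PARTY)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are available.")
```

Only the libraries needed for the part of the analysis you intend to run have to be installed, so a non-empty result is not necessarily a problem.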
## Attributions
All copyright rights in the Dewey Decimal Classification system are owned by OCLC. Dewey, Dewey Decimal Classification, DDC and WebDewey are registered trademarks of OCLC. This project contains information from [OhioLINK Circulation Data](https://www.oclc.org/research/areas/systemwide-library/ohiolink/circulation.html) which is made available by OCLC Online Computer Library Center, Inc. and OhioLINK under the [ODC Attribution License](https://www.oclc.org/research/areas/systemwide-library/ohiolink/odcby.html). LCC outlines are extracted from PDFs created by the Library of Congress and stored at: https://www.loc.gov/catdir/cpso/lcco/.