# List of Files:

* **gather_features.py**: Derives a set of features for each name, based on the number of occurrences in the Leipzig Wortschatz and the base form of the name without diacritics (derived using the Python unidecode package; version listed below).
* **filter_diacritics.py**: Filters names based on their base form without diacritics.
* **filter_soundcodes.py**: Filters names based on sound and spelling. For spelling, the Jaro-Winkler similarity from the Python package jellyfish is used.
* **sort_names.py**: Sorts the names by number of occurrences.

# Package Versions Used During Filtering

All package versions were collected using the "pip show" command.

## Package: unidecode

* Metadata-Version: 2.0
* Name: Unidecode
* Version: 0.4.20
* Summary: ASCII transliterations of Unicode text
* Home-page: UNKNOWN
* Author: Tomaz Solc
* Author-email: tomaz.solc@tablix.org
* License: GPL
* Location: /usr/lib64/python3.4/site-packages
* Requires:

## Package: jellyfish

* Metadata-Version: 1.1
* Name: jellyfish
* Version: 0.5.6
* Summary: a library for doing approximate and phonetic matching of strings.
* Home-page: http://github.com/jamesturk/jellyfish
* Author: UNKNOWN
* Author-email: UNKNOWN
* License: UNKNOWN
* Location: /usr/lib64/python3.4/site-packages
* Requires:
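
# Illustrative Sketch of the Filtering Steps

The scripts themselves are not reproduced on this page, so the following is only a minimal sketch of how the base-form, spelling-similarity, and sorting steps described above could fit together. To stay self-contained it uses standard-library stand-ins: `unicodedata` instead of unidecode for removing diacritics, and `difflib.SequenceMatcher` instead of the Jaro-Winkler similarity from jellyfish. The function names, the "keep the most frequent spelling" rule, and the sample counts are all assumptions made for illustration, not the project's actual code or data.

```python
import unicodedata
from difflib import SequenceMatcher


def base_form(name):
    """Return the name without diacritics.

    Stand-in for unidecode: decompose characters (NFKD) and drop the
    combining marks. This only covers accented Latin letters.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def filter_by_base_form(counts):
    """Keep one spelling per diacritic-free base form.

    Assumed rule (hypothetical): the spelling with the most
    occurrences wins.
    """
    best = {}
    for name, n in counts.items():
        key = base_form(name).lower()
        if key not in best or n > counts[best[key]]:
            best[key] = name
    return {name: counts[name] for name in best.values()}


def spelling_similarity(a, b):
    """Similarity in [0, 1]; a stand-in for Jaro-Winkler, not the same metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sort_by_occurrences(counts):
    """Most frequent names first; ties broken alphabetically."""
    return sorted(counts, key=lambda name: (-counts[name], name))


# Invented sample counts, not taken from the Leipzig Wortschatz.
sample_counts = {"René": 120, "Rene": 45, "Zoë": 30, "Anna": 300}
filtered = filter_by_base_form(sample_counts)
ranked = sort_by_occurrences(filtered)
```

With the installed packages, `base_form` would presumably be replaced by a call to `unidecode.unidecode`, and `spelling_similarity` by the Jaro-Winkler function that jellyfish provides; the exact function name depends on the jellyfish version.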