![Image Provided Courtesy of Felipe H. Santiago-Tirado, colored by Kristina Davis, CC-BY 4.0][1]

**Background**

This OSF project contains data and information pertaining to **Automated Text Categorization and Metabarcoding Reinforces Cryptococcus Neoformans–Woody Decomposition Association**, an article describing how Natural Language Processing (NLP) can be used to elucidate the niche and function of semi-rare species.

This work was started in 2016 as a collaboration between Benjamin Roche, then a post-doc at CSHL, and David Molik, a bioinformatics developer at CSHL. At the time, the project was looking for a different species in a mountain of data, *Saitoella complicata*, which it never found. However, in tuning the pipeline it looked for *Saccharomyces cerevisiae*, which it found in abundance, and *Cryptococcus neoformans*, which it found in small numbers.

David took the work with him when he went to the University of Notre Dame for graduate school, and he experimented with some NLP tools in a class taught by his advisor Michael Pfrender and Jeff Feder. Applying those tools to the papers associated with *Cryptococcus neoformans* seemed like a logical next step, so he tried it; the results were promising. David brought on Shane Davit as an undergraduate, and over the course of the semester the Navari Family Center for Digital Scholarship was brought on board as well, through the work of Natalie Meyers and Eric Morgan. Finally, DeAndre Tomlinson was brought on board through the work of a Data Science course, which ultimately polished the analysis.

**How to use this OSF Project**

This OSF project contains the original data from the metabarcode searches, some figures, and the code. Have a look around!
- [A Generalized Pipeline Description][2]

[1]: https://osf.io/82ngf/?direct&mode=render&action=download&public_file=False&initialWidth=848&childId=mfrIframe&parentTitle=OSF%20%7C%20KN99_10xA_005_colored-p-g-1.png&parentUrl=https://osf.io/82ngf/&format=400x400.jpeg
[2]: https://osf.io/29v3f/wiki/Pipeline%20Description/