Complete documentation on how to run SuperCRUNCH is available at: https://github.com/dportik/SuperCRUNCH/wiki

The following generalized commands were used to identify problematic taxon labels and subsequently relabel them. To run them yourself, replace `/Path/To/` with the actual file locations. The file `Anura-GenBank-July_2_2022.fasta` contains all the anuran sequences I downloaded from GenBank.

Assess taxonomy:

```
python Taxa_Assessment.py -i /Path/To/Anura-GenBank-July_2_2022.fasta -t /Path/To/Taxon-List-Anura-July_2_2022-outgroups.txt -o /Path/To/1-Assess/output
```

This yielded the following information:

```
--------------------------------------------------------------------------------------
Parsing taxon information from: 'Taxon-List-Anura-July_2_2022-outgroups.txt'
Found 7,579 species names and 0 subspecies names.
--------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------
Gathering records with matched names.
Starting SQL queries.
Searching for species names...
Finished.
Found 653,712 sequences with matched taxon names.
Found 4,945 unique matched species (two-part) names.
Found 0 unique matched subspecies (three-part) names.
--------------------------------------------------------------------------------------
Gathering records with unmatched names.
Starting SQL queries.
Searching for unmatched species names...
Finished.
Found 77,161 sequences with unmatched taxon names.
Found 5,327 unique unmatched species (two-part) names.
Found 0 unique unmatched subspecies (three-part) names.
--------------------------------------------------------------------------------------
Writing output file: Matched_Taxa.fasta
Writing 653,712 sequence records.
Elapsed time: 0:02:07.140211 (H:M:S)
Writing output file: Unmatched_Taxa.fasta
Writing 77,161 sequence records.
Elapsed time: 0:00:03.337362 (H:M:S)
--------------------------------------------------------------------------------------
Finished. Total elapsed time: 0:04:09.090815 (H:M:S)
--------------------------------------------------------------------------------------
```

Relabel sequences:

```
python Rename_Merge.py -i /Path/To/1-Assess/output/Unmatched_Taxa.fasta -r /Path/To/Relabeling-Key-Anura.txt -o /Path/To/2-Relabel/output -m /Path/To/1-Assess/output/Matched_Taxa.fasta
```
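For readers who want a feel for what the two steps do conceptually, here is a minimal Python sketch of the idea: split records into matched and unmatched by their two-part species name, then relabel the unmatched ones with a key. This is not SuperCRUNCH's implementation. It assumes each description line looks like `>ACCESSION Genus species ...`, that names are compared case-insensitively, and that the relabeling key can be represented as a simple old-name to new-name mapping. The function names, accession numbers, and records are made up for illustration.

```python
def species_label(description):
    """Return the lowercased two-part name after the accession, or None."""
    parts = description.lstrip(">").split()
    if len(parts) < 3:
        return None
    return f"{parts[1]} {parts[2]}".lower()


def assess(descriptions, taxon_names):
    """Split description lines into (matched, unmatched) against a taxon list."""
    known = {name.lower() for name in taxon_names}
    matched, unmatched = [], []
    for desc in descriptions:
        target = matched if species_label(desc) in known else unmatched
        target.append(desc)
    return matched, unmatched


def relabel(description, key):
    """Swap the two-part name for its replacement if the key has one."""
    parts = description.lstrip(">").split()
    new_name = key.get(" ".join(parts[1:3]).lower())
    if new_name is None:
        return description  # no replacement known; leave the record untouched
    return ">" + " ".join([parts[0], new_name] + parts[3:])


# Hypothetical inputs: fake accessions, a two-name taxon list, a one-entry key.
taxa = ["Rana temporaria", "Rhinella marina"]
records = [
    ">FAKE0001.1 Rana temporaria 16S ribosomal RNA gene",
    ">FAKE0002.1 Bufo marinus cytochrome b gene",
]
key = {"bufo marinus": "Rhinella marina"}

matched, unmatched = assess(records, taxa)
relabeled = [relabel(desc, key) for desc in unmatched]
merged = matched + relabeled  # analogous in spirit to merging with Matched_Taxa.fasta

print(len(matched), len(unmatched))
print(relabeled[0])
```

In this toy run the `Bufo marinus` record starts out unmatched, is relabeled to `Rhinella marina`, and would count as matched if the merged set were assessed again against the same taxon list.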