## **Freshwater Virome** ##
This page is a repository of freshwater viral contigs created from more than one **terabyte** of freshwater virome data, after rigorous decontamination and selection steps.
## **README** ##
The files in this repository are:
1. all_viral_contigs.fna: FASTA file containing all viral contigs identified by VIBRANT (273,365 contigs).
2. all_viral_contigs_ORFs.faa: FASTA file of the open reading frames (ORFs) called on all viral contigs (2,119,105 ORFs).
3. all_viral_contigs_ORFs_annotaion_table.txt: tab-separated annotation table of the ORFs called on all viral contigs.
4. build_clusters.py: custom Python script that combines vConTACT2 clusters from different files into a [networkx][1] graph and builds new clusters accordingly.
5. viral_clusters.txt: tab-separated file listing all clusters generated by the custom script from all vConTACT2 results.
6. vOTUs_greter_than_equal_10kb.fna: representative sequences of viral operational taxonomic units (vOTUs), generated by clustering viral contigs of at least 10 kb at 95% average nucleotide identity and 85% coverage (referred to as alignment fraction, AF) of the shorter sequence.
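
The idea behind combining clusters from several vConTACT2 runs can be illustrated with a minimal sketch. It is not the code in build_clusters.py: it assumes that clusters sharing at least one contig are merged, which corresponds to finding connected components of a graph whose nodes are contigs. The function name `merge_clusters` and the contig IDs are hypothetical, and a plain union-find is used here in place of networkx to keep the example dependency-free.

```python
from collections import defaultdict


def merge_clusters(cluster_sets):
    """Merge clusters that share at least one contig.

    Assumption: two clusters belong together if they have a contig in
    common, i.e. the result is the set of connected components.
    """
    parent = {}

    def find(x):
        # Return the representative of x, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for members in cluster_sets:
        members = list(members)
        if not members:
            continue
        find(members[0])  # make sure singletons are registered too
        for other in members[1:]:
            union(members[0], other)

    groups = defaultdict(set)
    for contig in parent:
        groups[find(contig)].add(contig)

    # Sort members and clusters so the output is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical clusters from two separate runs.
run_a = [{"c1", "c2"}, {"c3"}]
run_b = [{"c2", "c4"}, {"c5", "c6"}]
print(merge_clusters(run_a + run_b))
# → [['c1', 'c2', 'c4'], ['c3'], ['c5', 'c6']]
```

With networkx, the same result could be obtained by adding an edge between the members of each cluster and calling `networkx.connected_components` on the resulting graph.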
## **Citation** ##
If you use this resource in your research, please cite:
- **Elbehery, A. H. A., and Deng, L. (2022). Insights into the global freshwater virome. *Frontiers in Microbiology* 13. [doi:10.3389/fmicb.2022.953500][2].**
[1]: https://networkx.org/documentation/networkx-2.4/
[2]: https://doi.org/10.3389/fmicb.2022.953500