# Description of the dataset

## Structure

| Folder | Content |
|---|---|
| alignments | Tir receptor family MSA |
| anchor | Anchor motif predictions |
| disopred | DISOPRED disorder predictions |
| disopred_agg-clas | DISOPRED aggregation and classes |
| fasta | Sequence collections from UniProt |
| figures | Plots derived from analysis |
| iupred | IUPred 1.0 disorder predictions |
| iupred_agg-clas | IUPred 1.0 aggregation and classes |
| maps | Species and taxa in FASTA collection |
| motif_vs_disorder | Merged data from anchor and aggregated DISOPRED |

## Logic

**The code for data processing can be found in [this repository](https://osf.io/cxkjf/).**

Sequences were fetched from UniProt and sorted into collections under `fasta`. Three effector collections were assembled: *E. coli* EHEC, *E. coli* EPEC, and *C. rodentium*. For each of them, the corresponding taxon and species names were extracted, and the resulting dictionaries were saved under `maps`. The taxon lists were used to fetch the available UniProt reference proteomes for each collection. As a reference, the human proteome was also collected. All of these sequence collections are also found under `fasta`. Each collection was then processed using IUPred 1.0 in *short* and *long* modes and DISOPRED 3.1.
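
The taxon and species maps described above could be built along the following lines. This is only a minimal sketch, not the code from the linked repository: it assumes the collections are UniProt-formatted FASTA files, whose headers carry the organism name in the `OS=` field and the NCBI taxon identifier in the `OX=` field. The function name `taxon_species_map` and the two example entries (placeholder accessions and sequences) are hypothetical and used only for illustration.

```python
import re

# UniProt FASTA headers look like:
#   >db|ACCESSION|ENTRY_NAME Protein name OS=Organism name OX=TaxonID [GN=...] PE=.. SV=..
# The organism name may contain spaces, so match lazily up to the OX= field.
HEADER_RE = re.compile(r"OS=(?P<species>.+?)\s+OX=(?P<taxon>\d+)")


def taxon_species_map(fasta_text):
    """Return {taxon_id: species_name} collected from UniProt FASTA headers."""
    mapping = {}
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence lines
        match = HEADER_RE.search(line)
        if match:
            mapping[int(match.group("taxon"))] = match.group("species")
    return mapping


# Hypothetical entries: accessions, entry names and sequences are placeholders.
EXAMPLE = """\
>sp|X00001|EXMPL1_ECO57 Example effector OS=Escherichia coli O157:H7 OX=83334 GN=tir PE=1 SV=1
MKTAYIAKQR
>sp|X00002|EXMPL2_HUMAN Example protein OS=Homo sapiens OX=9606 GN=EX2 PE=1 SV=1
MSTNPKPQRK
"""

if __name__ == "__main__":
    for taxon, species in taxon_species_map(EXAMPLE).items():
        print(taxon, species)
```

A dictionary keyed by taxon identifier makes it straightforward to reuse the keys as the taxon list when querying UniProt for reference proteomes.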