SLiMSuite example data, May 2019

doi:10.17605/OSF.IO/8DTQ5

## Example data for SLiM Discovery

The attached data can be used for testing the SLiM discovery tools of [SLiMSuite](https://github.com/slimsuite/SLiMSuite/).

```
315B POL30.Scer.apid-hq.acc
697K POL30.Scer.apid-hq.dat
 35K POL30.Scer.apid-hq.fas
 74M uniprot.yeast.147537.2019-05-14.fas.gz
 50K elm2019.motifs
 69K elm2019.split.motifs
 39K elm2019.reduced.motifs
210B LIG_PCNA_PIPBox_1.motif
```

Example usage can be found in the [SLiMSuite CookBook](https://github.com/slimsuite/SLiMSuite/wiki/docs_md/CookBook.md).

### POL30 interactors

The `POL30.Scer.apid-hq.*` files contain the "high quality" interactors of the _Saccharomyces cerevisiae_ protein POL30 (the orthologue of human PCNA) from the [APID](http://cicblade.dep.usal.es:8080/APID/init.action) interaction database, in three different formats:

* `*.acc` = UniProt accession numbers.
* `*.dat` = Full UniProt flat file.
* `*.fas` = Protein fasta file in [SLiMSuite format](http://slimsuite.blogspot.com/2015/10/file-format-fasta-seqfile-fasfile.html).

These represent alternative input formats for the same data.

### Yeast proteomes

The larger `uniprot.yeast.147537.2019-05-14.fas.gz` file is a gzipped fasta file of yeast proteomes downloaded from UniProt on `2019-05-14`. These are all the UniProt proteomes for the [Saccharomycotina (true yeasts)](https://www.uniprot.org/taxonomy/147537) subphylum (TaxID:147537). Two proteomes with non-specific (`9XXXX`) species codes have been filtered out. These data are provided so that [GOPHER](http://rest.slimsuite.unsw.edu.au/gopher) can be used to generate predicted orthologue alignments for conservation masking.

### ELM Data

[ELM](http://elm.eu.org) motif classes (downloaded `2019-05-02`) are provided in the `elm2019.motifs` file. The `elm2019.split.motifs` file contains the same data split into different motif variants. SLiMSuite will generate this file when required if it does not already exist.

`elm2019.reduced.motifs` contains a "reduced" set of ELM class definitions. Each reduced definition was generated using [SLiMMaker](http://rest.slimsuite.unsw.edu.au/slimmaker), as described in the [QSLiMFinder paper](https://www.ncbi.nlm.nih.gov/pubmed/25792551?dopt=Abstract), by aligning the ELM instances for that motif and then extracting a regular expression motif pattern from the alignment. Reduced ELM definitions lose much of the complexity and curator-derived knowledge of the original classes. They are primarily useful as a test dataset of "True Positive" motifs that should be recoverable from sets of ELM instance proteins, but their reduced complexity may also make them useful for cleaner [CompariMotif](http://rest.slimsuite.unsw.edu.au/comparimotif) screens for known motifs.

`LIG_PCNA_PIPBox_1.motif` contains a single motif for simplified examples.

### References

**Gouw M et al.** The eukaryotic linear motif resource - 2018 update. [_Nucleic Acids Res._ 46:D428-D434 (2018)](https://www.ncbi.nlm.nih.gov/pubmed/29136216)

**The UniProt Consortium.** UniProt: a worldwide hub of protein knowledge. [_Nucleic Acids Res._ 47:D506-D515 (2019)](https://doi.org/10.1093/nar/gky1049)
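
### Illustration: reducing aligned instances to a pattern

The reduction step described under ELM Data (align the instances of a motif, then extract a regular expression from the alignment) can be pictured with a small sketch. This is not SLiMMaker's algorithm and does not reproduce its output. It is a deliberately simplified, column-by-column illustration, assuming the instances are already aligned to equal length. The function name, the `max_class_size` parameter and the toy peptides are all hypothetical and are not taken from the ELM data.

```python
import re


def reduce_alignment_to_regex(instances, max_class_size=3):
    """Build a simple regex from pre-aligned, equal-length motif instances.

    Illustrative only: a naive column-wise reduction, not SLiMMaker itself.
    """
    if not instances:
        raise ValueError("need at least one instance")
    width = len(instances[0])
    if any(len(seq) != width for seq in instances):
        raise ValueError("instances must be aligned to the same length")

    parts = []
    for column in zip(*instances):
        residues = sorted(set(column))
        if "-" in residues or len(residues) > max_class_size:
            parts.append(".")            # gap or too variable: wildcard
        elif len(residues) == 1:
            parts.append(residues[0])    # fully conserved position
        else:
            parts.append("[" + "".join(residues) + "]")  # ambiguous position

    # Leading and trailing wildcards carry no information, so trim them.
    return "".join(parts).strip(".")


# Made-up peptides for demonstration; NOT real ELM instances.
toy_instances = ["QAALDFF", "QGSIDFY", "QAKMDFF"]
pattern = reduce_alignment_to_regex(toy_instances)
print(pattern)
print(all(re.fullmatch(pattern, seq) for seq in toy_instances))
```

Collapsing each column to a literal, a character class or a wildcard is what makes a reduced pattern simpler than a curated definition: any positional dependency or curator knowledge not visible in the aligned instances is lost.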
