<p>This is a data repository for a large study that involved the analysis and prediction of homotypic (self-self) transmembrane domain interactions.</p>
<p>Publication:</p>
<ul>
<li>Yao Xiao, Bo Zeng, Nicola Berner, Dmitrij Frishman, Dieter Langosch, Mark George Teese</li>
<li>Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces</li>
<li>Computational and Structural Biotechnology Journal</li>
<li>Volume 18, 2020, Pages 3230-3242</li>
<li>ISSN 2001-0370</li>
<li><a href="https://doi.org/10.1016/j.csbj.2020.09.035" rel="nofollow">https://doi.org/10.1016/j.csbj.2020.09.035</a></li>
</ul>
<p>Contributors:</p>
<ul>
<li>Yao Xiao</li>
<li>Bo Zeng</li>
<li>Mark Teese</li>
<li>Dieter Langosch</li>
<li>Dmitrij Frishman</li>
</ul>
<p>Contact:</p>
<ul>
<li>Mark Teese</li>
</ul>
<p>Affiliation:</p>
<ul>
<li><a href="https://www.tum.de/en/" rel="nofollow">Technical University of Munich</a></li>
<li><a href="https://www.tngtech.com/en/index.html" rel="nofollow">TNG Technology Consulting GmbH</a></li>
</ul>
<p>Related website with machine-learning tool:</p>
<ul>
<li><a href="http://www.thoipa.org" rel="nofollow">www.thoipa.org</a></li>
</ul>
<p>Related open-source software repositories:</p>
<ul>
<li><a href="https://github.com/bojigu/thoipapy" rel="nofollow">THOIPApy code repository</a></li>
<li><a href="https://bitbucket.org/yaoxiaorepos/datoxr" rel="nofollow">datoxr code repository</a></li>
<li><a href="https://github.com/teese/pytoxr" rel="nofollow">pytoxr code repository</a></li>
</ul>
<p>Open Science Framework Repository Contents:</p>
<ul>
<li>data<ul>
<li>THOIPA_data.7zip<ul>
<li>homologues (BLAST data files and alignments)</li>
<li>interface_predictions (predictions from THOIPA, PREDDIMER, and TMDOCK used for validation)</li>
<li>interface_residues (experimental data on TM homodimer interfaces from NMR, ETRA, and crystal structure experiments)</li>
<li>residue_properties (data on conservation, polarity, coevolution, etc. calculated for each residue of each TMD in each dataset)</li>
<li>THOIPA_validation (raw validation data, such as ROC AUC, for the THOIPA machine-learning predictor; also contains the machine-learning model, training_data, and feature importances)</li>
</ul>
</li>
<li>protein_lists<ul>
<li>lists of proteins in the combined homotypic TM dataset and in the individual datasets, including sequences in FASTA format</li>
</ul>
</li>
</ul>
</li>
<li>figures<ul>
<li>DDR2 results and other scanning mutagenesis data</li>
</ul>
</li>
<li>methods<ul>
<li>hydrophobicity scales and other data related to methods</li>
</ul>
</li>
</ul>
<p>Data notes:</p>
<ul>
<li>The following sets of proteins are included in the protein_lists folder:<ul>
<li>set05: homotypic TMD dataset (combined ETRA, NMR, X-ray)</li>
<li>set07: test data for machine learning</li>
<li>set08: training data for machine learning</li>
</ul>
</li>
<li>Folders labelled "old, deprecated data" relate to an older machine-learning model, trained on a slightly modified set05.</li>
<li>The hierarchical data structure in THOIPA_data.7zip should in most cases be self-explanatory. References and code for the processing of each file can be found in the <a href="https://github.com/bojigu/thoipapy" rel="nofollow">thoipapy software</a>, version 1.1.3. Most data can be recreated using the open-source thoipapy software.</li>
</ul>