This is a data repository for a large study on the analysis and prediction of homotypic (self-self) transmembrane domain interactions.

Publication:

- Yao Xiao, Bo Zeng, Nicola Berner, Dmitrij Frishman, Dieter Langosch, Mark George Teese
- Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces
- Computational and Structural Biotechnology Journal
- Volume 18, 2020, Pages 3230-3242
- ISSN 2001-0370

Contributors:

- Yao Xiao
- Bo Zeng
- Mark Teese
- Dieter Langosch
- Dmitrij Frishman

Contact:

- Mark Teese

Affiliation:

- [Technical University of Munich][2]
- [TNG Technology Consulting GmbH][6]

Related website with machine-learning tool:

-

Related open-source software repositories:

- [THOIPApy code repository][3]
- [datoxr code repository][4]
- [pytoxr code repository][5]

Open Science Framework repository contents:

- data
  - THOIPA_data.7zip
    - homologues (BLAST data files and alignments)
    - interface_predictions (predictions from THOIPA, PREDDIMER, and TMDOCK used for validation)
    - interface_residues (experimental data on TM homodimer interfaces from NMR, ETRA, and crystal structure experiments)
    - residue_properties (data on conservation, polarity, coevolution, etc. calculated for each residue of each TMD in each dataset)
    - THOIPA_validation (raw validation data, such as ROC AUC, for the THOIPA machine-learning predictor; also contains the machine-learning model, training data, and feature importances)
    - protein_lists (list of proteins in the homotypic TM dataset and in the individual datasets; includes sequences in FASTA format)
- figures
  - DDR2 results and other scanning mutagenesis data
- methods
  - hydrophobicity scales and other data related to methods

Data notes:

- The following sets of proteins are included in the protein_lists folder:
  - set05: homotypic TMD dataset (combined ETRA, NMR, X-ray)
  - set07: test data for machine learning
  - set08: training data for machine learning
- Folders labelled "old, deprecated data" refer to an older machine-learning model, trained on a slightly modified set05.
- The hierarchical data structure in THOIPA_data.7zip should in most cases be self-explanatory. References and code for the processing of each file can be found in the [thoipapy software][3], version 1.1.3. Most data can be recreated using the open-source thoipapy software.

[1]:
[2]:
[3]:
[4]:
[5]:
[6]:
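Since the protein lists include sequences in FASTA format, a short reader can be useful for a first look at them. The following is a minimal sketch, not part of thoipapy: it assumes plain FASTA text with `>` header lines, and the function name `parse_fasta` and the example records are made up for illustration.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; keep the header text without the ">".
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so join them onto the record.
            records[header] += line
    return records


# Hypothetical records for illustration only (not taken from the dataset).
example = """>example_protein_1
LLLAVVGAIL
IGLLAA
>example_protein_2
GGVLAAIAGV
"""

sequences = parse_fasta(example.splitlines())
for name, seq in sequences.items():
    print(name, len(seq), seq)
```

To read a real file, pass an open file handle instead of the example lines, for instance `with open(path) as fh: sequences = parse_fasta(fh)`, where `path` points to a FASTA file extracted from the archive. Note that this sketch keeps only the last record if two headers are identical.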