Dataset
-------
The *e*Repo-ORP repository contains data generated for the repositioning of [DrugBank][1] drugs to [Orphanet][2] proteins. The following files are included:
### Lists
**`drugbank-orphanet.list`**
A list of pairs of DrugBank and Orphanet proteins sorted by the [*e*MatchSite][3] score.
* 1st column is the rank
* 2nd column is the UniProt-ID of the DrugBank protein
* 3rd column is the UniProt-ID of the Orphanet protein
* 4th column is the *e*MatchSite score for DrugBank->Orphanet
* 5th column is Spearman’s ρ for structure-based virtual screening with Vina
* 6th column is the TM-score for DrugBank->Orphanet
* 7th column is the GDT-score estimated for the *e*Thread model of the DrugBank protein
* 8th column is the PDB-ID of the template used to build the *e*Thread model of the DrugBank protein
* 9th column is the sequence identity between the template and the DrugBank protein
* 10th column is the GDT-score estimated for the *e*Thread model of the Orphanet protein
* 11th column is the PDB-ID of the template used to build the *e*Thread model of the Orphanet protein
* 12th column is the sequence identity between the template and the Orphanet protein
* 13th column is the confidence for the top-ranked binding site predicted by *e*FindSite in the DrugBank protein model
* 14th column is the confidence for the top-ranked binding site predicted by *e*FindSite in the Orphanet protein model
* 15th column is the list of DrugBank drugs separated by colons
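
The column layout above can be read with a few lines of Python. This is a minimal sketch, not part of the repository: it assumes the columns are whitespace-delimited, that the file has no header line, and that the score, identity, and confidence columns are plain decimal numbers. The `PairRecord` type, its field names, and `parse_pair_line` are hypothetical names introduced here for illustration.

```python
from typing import List, NamedTuple


class PairRecord(NamedTuple):
    """One row of drugbank-orphanet.list (field names are illustrative)."""
    rank: int
    drugbank_uniprot: str
    orphanet_uniprot: str
    ematchsite_score: float
    spearman_rho: float
    tm_score: float
    drugbank_gdt: float
    drugbank_template: str
    drugbank_seq_identity: float
    orphanet_gdt: float
    orphanet_template: str
    orphanet_seq_identity: float
    drugbank_site_confidence: float
    orphanet_site_confidence: float
    drugs: List[str]


def parse_pair_line(line: str) -> PairRecord:
    """Parse one row, assuming 15 whitespace-delimited columns."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 columns, found {len(fields)}")
    return PairRecord(
        int(fields[0]), fields[1], fields[2],
        float(fields[3]), float(fields[4]), float(fields[5]),
        float(fields[6]), fields[7], float(fields[8]),
        float(fields[9]), fields[10], float(fields[11]),
        float(fields[12]), float(fields[13]),
        fields[14].split(":"),  # 15th column: colon-separated DrugBank drugs
    )


def read_pairs(path: str) -> List[PairRecord]:
    """Read every non-empty row of the list file."""
    with open(path) as handle:
        return [parse_pair_line(line) for line in handle if line.strip()]
```

If the file turns out to use a different delimiter or to carry a header, `parse_pair_line` raises a `ValueError` on the first row that does not split into 15 columns, which makes the mismatch easy to spot.
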
### Data
**`data-drugbank.tar.gz`**
A tarball containing the following files for [DrugBank][4] targets ([Q9RVD6][5] as an example):
* structure model in PDB format (`Q9RVD6.pdb`)
* sequence profile by [PROFILpro 1.1][6] (`Q9RVD6.profile`)
* secondary structure by [PSIPRED 4.0][7] (`Q9RVD6.ss2`)
* pockets by [*e*FindSite 1.3][8] (`Q9RVD6-efindsite.pockets.dat`)
* pockets by [*e*FindSite 1.3][9] in PDB format (`Q9RVD6-efindsite.pockets.pdb`)
* template-target structure alignments by [*e*FindSite 1.3][10] (`Q9RVD6-efindsite.alignments.dat`)
* ligands bound to templates extracted by [*e*FindSite 1.3][11] (`Q9RVD6-efindsite.ligands.sdf`)
**`data-orphanet.tar.gz`**
A tarball containing the following files for [Orphanet][12] targets ([P10644][13] as an example):
* structure model in PDB format (`P10644.pdb`)
* sequence profile by [PROFILpro 1.1][14] (`P10644.profile`)
* secondary structure by [PSIPRED 4.0][15] (`P10644.ss2`)
* pockets by [*e*FindSite 1.3][16] (`P10644-efindsite.pockets.dat`)
* pockets by [*e*FindSite 1.3][17] in PDB format (`P10644-efindsite.pockets.pdb`)
* template-target structure alignments by [*e*FindSite 1.3][18] (`P10644-efindsite.alignments.dat`)
* ligands bound to templates extracted by [*e*FindSite 1.3][19] (`P10644-efindsite.ligands.sdf`)
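
Both tarballs follow the same per-target naming scheme, so the expected file names can be derived from a UniProt ID. The sketch below is illustrative only: `expected_files` and `missing_files` are hypothetical helpers, and because the directory layout inside the tarballs is not described here, the check compares base names and ignores any leading directories.

```python
import posixpath
import tarfile
from typing import List

# File-name suffixes listed above for each target.
SUFFIXES = [
    ".pdb",
    ".profile",
    ".ss2",
    "-efindsite.pockets.dat",
    "-efindsite.pockets.pdb",
    "-efindsite.alignments.dat",
    "-efindsite.ligands.sdf",
]


def expected_files(uniprot_id: str) -> List[str]:
    """Return the seven file names expected for one target."""
    return [uniprot_id + suffix for suffix in SUFFIXES]


def missing_files(tar_path: str, uniprot_id: str) -> List[str]:
    """List expected files for a target that are absent from the tarball."""
    with tarfile.open(tar_path, "r:gz") as tar:
        # Compare base names only; the internal layout is an unknown here.
        present = {posixpath.basename(name) for name in tar.getnames()}
    return [name for name in expected_files(uniprot_id) if name not in present]
```

For example, `missing_files("data-orphanet.tar.gz", "P10644")` should return an empty list if all seven files for that target are present.
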
### Complexes
**`data-drugbank-complexes.tar.gz`**
A tarball containing structure models of [DrugBank][20] complexes. For example, the PDB file **`P06239-DB08901.pdb`** contains the model of ponatinib ([DB08901][21]) bound to tyrosine-protein kinase Lck ([P06239][22]). The REMARK section of each file provides energy scores calculated for the complex with [DFIRE][23], [DSX][24], and [LPC][25].
**`data-orphanet-complexes.tar.gz`**
A tarball containing complex models of [DrugBank][26] drugs repositioned to [Orphanet][27] proteins. For example, the PDB file **`Q9ULC3-DB08901-P06239.pdb`** contains the model of ponatinib ([DB08901][28]) repositioned to Ras-related protein Rab-23 ([Q9ULC3][29]) through its local alignment to tyrosine-protein kinase Lck ([P06239][30]). The REMARK section of each file provides energy scores calculated for the complex with [DFIRE][23], [DSX][24], and [LPC][25].
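
The file names of the complex models encode which protein and drug each model contains, and the energy scores sit in the REMARK records. The following sketch shows one way to pull both out. `parse_complex_name` and `read_remarks` are hypothetical helpers; the code assumes the identifiers themselves contain no hyphens, and it returns the REMARK lines verbatim because the exact layout of the score records is not specified here.

```python
from typing import Dict, List, Optional


def parse_complex_name(filename: str) -> Dict[str, Optional[str]]:
    """Split a complex file name into its identifiers.

    Two-part names (protein-drug) come from the DrugBank complexes;
    three-part names (Orphanet protein-drug-DrugBank protein) come from
    the repositioned Orphanet complexes.
    """
    stem = filename[:-4] if filename.endswith(".pdb") else filename
    parts = stem.split("-")
    if len(parts) == 2:
        return {"protein": parts[0], "drug": parts[1], "aligned_to": None}
    if len(parts) == 3:
        return {"protein": parts[0], "drug": parts[1], "aligned_to": parts[2]}
    raise ValueError(f"unexpected complex file name: {filename}")


def read_remarks(path: str) -> List[str]:
    """Return the REMARK lines of a PDB file, unparsed."""
    with open(path) as handle:
        return [line.rstrip("\n") for line in handle
                if line.startswith("REMARK")]
```

For `Q9ULC3-DB08901-P06239.pdb`, `parse_complex_name` yields `Q9ULC3` as the modeled protein, `DB08901` as the drug, and `P06239` as the DrugBank protein it was aligned to.
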
[1]: https://www.drugbank.ca/
[2]: http://www.orpha.net/
[3]: https://www.ncbi.nlm.nih.gov/pubmed/25232727
[4]: https://www.drugbank.ca/
[5]: http://www.uniprot.org/uniprot/Q9RVD6
[6]: https://www.ncbi.nlm.nih.gov/pubmed/15980571
[7]: https://www.ncbi.nlm.nih.gov/pubmed/10493868
[8]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[9]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[10]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[11]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[12]: http://www.orpha.net/
[13]: http://www.uniprot.org/uniprot/P10644
[14]: https://www.ncbi.nlm.nih.gov/pubmed/15980571
[15]: https://www.ncbi.nlm.nih.gov/pubmed/10493868
[16]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[17]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[18]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[19]: https://www.ncbi.nlm.nih.gov/pubmed/23838840
[20]: https://www.drugbank.ca/
[21]: https://www.drugbank.ca/drugs/DB08901
[22]: http://www.uniprot.org/uniprot/P06239
[23]: https://www.ncbi.nlm.nih.gov/pubmed/15801826
[24]: https://www.ncbi.nlm.nih.gov/pubmed/21863864
[25]: https://www.ncbi.nlm.nih.gov/pubmed/10320401
[26]: https://www.drugbank.ca/
[27]: http://www.orpha.net/
[28]: https://www.drugbank.ca/drugs/DB08901
[29]: http://www.uniprot.org/uniprot/Q9ULC3
[30]: http://www.uniprot.org/uniprot/P06239