Dataset
-------

**TOUGH-M1** comprises experimental target structures with binding pockets identified by [Fpocket 2.0][1]. The following files are included:

### Lists

**`TOUGH-M1_target.list`**
A list of 7524 targets.

**`TOUGH-M1_pocket.list`**
A list of pockets identified by [Fpocket 2.0][1].

* 1st column is the PDB-ID of target protein
* 2nd column is the index of the best pocket
* 3rd column is the accuracy of the best pocket (Matthews correlation coefficient over binding residues)

**`TOUGH-M1_positive.list`**
A list of 505116 positive pairs. These proteins bind chemically similar ligands despite having dissimilar global sequences and structures.

* The first two columns are the PDB-IDs of target proteins
* 3rd column is the sequence identity between the targets
* 4th column is the structure similarity ([TM-score][2] by [Fr-TM-align][3])
* 5th column is the chemical similarity of bound ligands (Tanimoto coefficient by [kcombu][4])

**`TOUGH-M1_negative.list`**
A list of 556810 negative pairs. These proteins have dissimilar global sequences and structures and bind chemically dissimilar ligands.

* The first two columns are the PDB-IDs of target proteins
* 3rd column is the sequence identity between the targets
* 4th column is the structure similarity ([TM-score][2] by [Fr-TM-align][3])
* 5th column is the chemical similarity of bound ligands (Tanimoto coefficient by [kcombu][4])

### Data

**`TOUGH-M1_dataset.tar.gz`**
A tarball containing the following files ([11asA][5] as an example):

* target structure in PDB format (`11asA.pdb`)
* target structure in PDBQT format (`11asA.pdbqt`)
* bound ligand in PDB format (`11asA00.pdb`)
* bound ligand in PDBQT format (`11asA00.pdbqt`)
* bound ligand in SDF format (`11asA00.sdf`)
* protein-ligand contacts by [LPC][6] (`11asA00.lpc`)
* pockets by [Fpocket][1] (`11asA.fpocket`)

Results
-------

Output files generated for the TOUGH-M1 dataset by several binding site alignment algorithms are available to facilitate comparative benchmarks. In addition, the conformations of 1515 drugs from the DrugBank database docked to TOUGH-M1 targets can be downloaded as well. Note that because of the large size of virtual screening data, docking conformations are split into 3 tarballs. Pocket matching and virtual screening calculations were conducted against pockets predicted by [Fpocket 2.0][1]. The following programs are included:

### Pocket alignment tools

[APoc][7]

**`APoc-TOUGH-M1_positive.score`**
**`APoc-TOUGH-M1_negative.score`**
PS-score and *p*-values for positive and negative pairs.

**`APoc-TOUGH-M1_positive.tar.gz`**
**`APoc-TOUGH-M1_negative.tar.gz`**
Output files from [APoc v1.0b15][7] for positive and negative pairs.

[G-LoSA][8]

**`G-LoSA-TOUGH-M1_positive.score`**
**`G-LoSA-TOUGH-M1_negative.score`**
GA-score values for positive and negative pairs.

**`G-LoSA-TOUGH-M1_positive.tar.gz`**
**`G-LoSA-TOUGH-M1_negative.tar.gz`**
Output files from [G-LoSA v2.1][8] for positive and negative pairs.

[SiteEngine][9]

**`SiteEngine-TOUGH-M1_positive.score`**
**`SiteEngine-TOUGH-M1_negative.score`**
Match score, Total score and T-score values for positive and negative pairs.

**`SiteEngine-TOUGH-M1_positive.tar.gz`**
**`SiteEngine-TOUGH-M1_negative.tar.gz`**
Output files from [SiteEngine 1.0][9] for positive and negative pairs.

### Virtual screening

[AutoDock Vina][10]

**`Vina-TOUGH-M1_part1.tar.gz`**
**`Vina-TOUGH-M1_part2.tar.gz`**
**`Vina-TOUGH-M1_part3.tar.gz`**
Docked conformations in PDBQT format.

[rDock][11]

**`rDock-TOUGH-M1_part1.tar.gz`**
**`rDock-TOUGH-M1_part2.tar.gz`**
**`rDock-TOUGH-M1_part3.tar.gz`**
Docked conformations in SDF format.

[1]:
[2]:
[3]:
[4]:
[5]:
[6]:
[7]:
[8]:
[9]:
[10]:
[11]:
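
### Example: reading the list files

The column layouts described under Lists can be read with a few lines of Python. The sketch below is an illustration, not part of the dataset distribution. It assumes the columns are separated by whitespace and that the pocket index is an integer; neither is stated above, so check a few lines of the actual files first. The names `Pair`, `Pocket`, `parse_pairs` and `parse_pockets` are hypothetical, and the sample rows are made up for demonstration.

```python
from typing import Iterable, Iterator, NamedTuple


class Pair(NamedTuple):
    """One row of TOUGH-M1_positive.list or TOUGH-M1_negative.list."""
    target_a: str        # PDB-ID of the first target
    target_b: str        # PDB-ID of the second target
    seq_identity: float  # sequence identity between the targets
    tm_score: float      # structure similarity (TM-score)
    tanimoto: float      # chemical similarity of the bound ligands


class Pocket(NamedTuple):
    """One row of TOUGH-M1_pocket.list."""
    target: str      # PDB-ID of the target protein
    best_index: int  # index of the best pocket (assumed to be an integer)
    mcc: float       # Matthews correlation coefficient over binding residues


def parse_pairs(lines: Iterable[str]) -> Iterator[Pair]:
    """Yield a Pair per line; assumes whitespace-separated columns."""
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        a, b, seq_id, tm, tc = fields[:5]
        yield Pair(a, b, float(seq_id), float(tm), float(tc))


def parse_pockets(lines: Iterable[str]) -> Iterator[Pocket]:
    """Yield a Pocket per line; assumes whitespace-separated columns."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        target, index, mcc = fields[:3]
        yield Pocket(target, int(index), float(mcc))


# Made-up rows for illustration only; these are not values from the dataset.
sample_pairs = ["1abcA 2xyzB 0.12 0.31 0.85", ""]
sample_pockets = ["1abcA 2 0.64"]

pairs = list(parse_pairs(sample_pairs))
pockets = list(parse_pockets(sample_pockets))
print(pairs[0].tanimoto, pockets[0].best_index)
```

To read a real file, pass an open file handle instead of the sample list, for example `with open("TOUGH-M1_positive.list") as fh: positives = list(parse_pairs(fh))`. Because the parsers are generators, the pair lists can also be streamed row by row rather than loaded into memory at once.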