Benchmarking a Wide Range of Chemical Descriptors for Drug-Target Interaction Prediction Using a Chemogenomic Approach

Mol Inform. 2014 Dec;33(11-12):719-31. doi: 10.1002/minf.201400066. Epub 2014 Nov 24.

Abstract

The identification of drug-target interactions, that is, interactions between drug candidate compounds and target candidate proteins, is a crucial step in genomic drug discovery. In silico chemogenomic methods have recently been recognized as a promising approach for genome-wide prediction of drug-target interactions, but their prediction performance depends heavily on the descriptors and similarity measures chosen for drugs and proteins. In this paper, we investigated the performance of various descriptors and similarity measures of drugs and proteins for drug-target interaction prediction using a chemogenomic approach. We compared the prediction accuracy of 18 chemical descriptors of drugs (e.g., ECFP, FCFP, E-state, CDK, KlekotaRoth, MACCS, PubChem, Dragon, KCF-S, and graph kernels) and 4 descriptors of proteins (e.g., amino acid composition, domain profile, local sequence similarity, and string kernel) on about one hundred thousand drug-target interactions. We examined the combinatorial effects of drug descriptors and protein descriptors using the same benchmark data under several experimental conditions. Large-scale experiments showed that our proposed KCF-S descriptor performed best in terms of prediction accuracy. The comparative results are expected to be useful for selecting chemical descriptors in various pharmaceutical applications.

Keywords: Chemogenomics; Descriptors; Drug-target interactions; Fingerprint; Machine learning.
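
The abstract refers to descriptors and similarity measures for drugs and proteins without showing how they combine. The following is a minimal Python sketch, not the authors' implementation. It assumes a drug fingerprint is given as a set of on-bit indices compared by Tanimoto similarity, that a protein is described by its amino acid composition compared by cosine similarity, and that a drug-target pair is scored by the product of the two similarities. The product form is a common construction in chemogenomic methods and is an assumption here; the function names and toy inputs are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the composition vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue as a 20-element list."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sum(x * x for x in u) ** 0.5
    norm_v = sum(y * y for y in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_similarity(drug_a, protein_a, drug_b, protein_b):
    """Similarity of two drug-target pairs (assumed product form)."""
    drug_sim = tanimoto(drug_a, drug_b)
    protein_sim = cosine(
        amino_acid_composition(protein_a),
        amino_acid_composition(protein_b),
    )
    return drug_sim * protein_sim
```

Swapping the drug descriptor (for example, a different fingerprint) changes only the input to `tanimoto`, which is the kind of substitution a descriptor benchmark varies while holding the rest of the pipeline fixed.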

Publication types

  • Review