Predicting protein-ligand binding site using support vector machine with protein properties

IEEE/ACM Trans Comput Biol Bioinform. 2013 Nov-Dec;10(6):1517-29. doi: 10.1109/TCBB.2013.126.

Abstract

Identification of protein-ligand binding sites is an important task in structure-based drug design and docking algorithms. In the past two decades, different approaches have been developed to predict binding sites, such as geometric, energetic, and sequence-based methods. Once scores are calculated by these methods, the choice of classification algorithm becomes very important and can greatly affect the prediction results. In this paper, a support vector machine (SVM) is used to cluster the pockets that are most likely to bind ligands, using as attributes geometric characteristics, interaction potential, offset from protein, conservation score, and properties surrounding the pockets. Our approach is compared to LIGSITE, LIGSITE(CSC), SURFNET, Fpocket, PocketFinder, Q-SiteFinder, ConCavity, and MetaPocket on the LigASite data set and on 198 drug-target protein complexes. The results show that our approach improves the success rate from 60 to 80 percent by the AUC measure and from 61 to 66 percent for top 1 prediction. Our method also provides more comprehensive results than the others.
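
As a rough illustration of how per-pocket attributes could feed an SVM and yield a ranked list of candidate sites, the sketch below trains a small linear soft-margin SVM by subgradient descent on the hinge loss and then sorts pockets by decision value. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the kernel, the training procedure, or the feature encoding, and the three features (buriedness, interaction potential, conservation, each scaled to [0, 1]), the toy values, and the function names are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def decision_value(w: Vector, b: float, x: Vector) -> float:
    """Signed score w.x + b; larger means 'more binding-site-like'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X: List[Vector], y: List[int], lam: float = 0.01,
                     lr: float = 0.1, epochs: int = 300) -> Tuple[List[float], float]:
    """Batch subgradient descent on the L2-regularised hinge loss.

    Labels must be +1 (ligand-binding pocket) or -1 (non-binding pocket).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only points inside the margin contribute to the hinge subgradient.
            if yi * decision_value(w, b, xi) < 1.0:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_pockets(pockets: Dict[str, Vector], w: Vector, b: float) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least likely binding site."""
    scored = [(name, decision_value(w, b, x)) for name, x in pockets.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical training pockets: (buriedness, interaction potential, conservation).
X_train = [(0.9, 0.8, 0.9), (0.8, 0.9, 0.7), (0.7, 0.7, 0.8),   # known binding pockets
           (0.2, 0.1, 0.3), (0.1, 0.3, 0.2), (0.3, 0.2, 0.1)]   # non-binding pockets
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train)

# Hypothetical candidate pockets detected on a new protein.
candidates = {
    "pocket_A": (0.85, 0.9, 0.8),
    "pocket_B": (0.5, 0.4, 0.5),
    "pocket_C": (0.15, 0.2, 0.1),
}
ranked = rank_pockets(candidates, w, b)
for name, score in ranked:
    print(f"{name}: {score:+.3f}")
```

Under a ranking like this, a "top 1" evaluation would check whether the first entry of the ranked list corresponds to the true ligand site; the exact matching criterion used in the paper is not stated in the abstract.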

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Area Under Curve
  • Binding Sites
  • Cluster Analysis
  • Computational Biology / methods*
  • Databases, Protein
  • Drug Design
  • Ligands
  • Models, Molecular
  • Probability
  • Protein Binding
  • Protein Conformation
  • Protein Interaction Mapping / methods
  • Proteins / chemistry*
  • Software
  • Support Vector Machine*

Substances

  • Ligands
  • Proteins