Prediction of interactiveness between small molecules and enzymes by combining gene ontology and compound similarity

J Comput Chem. 2010 Jun;31(8):1766-76. doi: 10.1002/jcc.21467.

Abstract

Determining whether a small organic molecule interacts with an enzyme can help in understanding the molecular and cellular functions of organisms and their metabolic pathways. In this research, we present a model that combines compound similarity and enzyme similarity to predict the interactiveness between small molecules and enzymes. A dataset consisting of 2859 positive couples of small molecule and enzyme and 286,056 negative couples was employed. Compound similarity, a measure of how similar two small molecules are, was proposed by Hattori et al. (J Am Chem Soc 2003, 125, 11853) and is available at http://www.genome.jp/ligand-bin/search_compound, while enzyme similarity was obtained in three ways: the BLAST method, gene ontology items, and functional domain composition. A new distance between a pair of couples was then established, and the nearest neighbor algorithm (NNA) was employed to predict the interactiveness of enzymes and small molecules. To achieve a better balance between positive and negative samples when training the prediction model, a data distribution strategy was adopted: one-fourth of the couples were singled out as testing samples, and the remaining data were divided into seven training datasets, with all remaining positive samples added to each training dataset and only the negative samples divided among them. In this way, seven NNAs were built. Finally, a simple majority voting system was applied to integrate these seven models to predict the testing dataset, which was demonstrated to give better prediction results than any single prediction model. The highest overall prediction accuracy achieved was 97.30%.
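The data distribution and voting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the formula for the distance between two couples, so `couple_distance` below uses an assumed form (one minus the product of the compound and enzyme similarities), and all function names, the `compound_sim` and `enzyme_sim` callables, and the toy data are hypothetical.

```python
import random
from collections import Counter


def build_training_sets(positives, negatives, n_parts=7, seed=0):
    """Partition the negatives into n_parts; every part gets all positives."""
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    return [
        [(p, 1) for p in positives] + [(n, 0) for n in part]
        for part in parts
    ]


def couple_distance(a, b, compound_sim, enzyme_sim):
    """Distance between two (compound, enzyme) couples.

    ASSUMPTION: this formula is illustrative only; the abstract does not
    state how the paper defines the distance.
    """
    (c1, e1), (c2, e2) = a, b
    return 1.0 - compound_sim(c1, c2) * enzyme_sim(e1, e2)


def nn_predict(query, training_set, compound_sim, enzyme_sim):
    """Return the label of the training couple nearest to the query."""
    nearest = min(
        training_set,
        key=lambda item: couple_distance(query, item[0], compound_sim, enzyme_sim),
    )
    return nearest[1]


def vote_predict(query, training_sets, compound_sim, enzyme_sim):
    """Simple majority vote over one nearest-neighbor model per training set."""
    votes = [nn_predict(query, ts, compound_sim, enzyme_sim) for ts in training_sets]
    return Counter(votes).most_common(1)[0][0]


# Toy data (hypothetical): compounds and enzymes are plain numbers, and
# similarity decays with their absolute difference.
def toy_sim(x, y):
    return 1.0 / (1.0 + abs(x - y))


toy_positives = [(0.0, 0.0), (0.1, 0.1)]
toy_negatives = [(5.0 + i, 5.0 + i) for i in range(14)]
toy_sets = build_training_sets(toy_positives, toy_negatives)

label_near_positives = vote_predict((0.05, 0.05), toy_sets, toy_sim, toy_sim)
label_near_negatives = vote_predict((10.0, 10.0), toy_sets, toy_sim, toy_sim)
```

Using an odd number of models with two classes means the vote cannot tie. Repeating the positives in every training set while partitioning the negatives reduces the class imbalance each model sees, at the cost of the seven models sharing the same positive examples.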

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Databases, Genetic
  • Enzymes / chemistry
  • Enzymes / genetics*
  • Enzymes / metabolism*
  • Molecular Conformation
  • Molecular Weight
  • Organic Chemicals / chemistry*
  • Organic Chemicals / metabolism*

Substances

  • Enzymes
  • Organic Chemicals