Transfer learning for cytochrome P450 isozyme selectivity prediction

J Bioinform Comput Biol. 2011 Aug;9(4):521-40. doi: 10.1142/s0219720011005434.

Abstract

In drug discovery, understanding the metabolic fate of drugs is crucial for preventing drug-drug interactions. P450 isozyme selectivity prediction is therefore an important task when screening for drugs with appropriate metabolic profiles. Recently, large-scale activity data for five P450 isozymes (CYP1A2, CYP2C9, CYP3A4, CYP2D6, and CYP2C19) have been obtained using quantitative high-throughput screening with a bioluminescence assay. Although some isozymes share similar selectivities, conventional supervised learning algorithms learn a separate prediction model for each P450 isozyme independently. They are therefore unable to exploit activity data from the other isozymes to improve the prediction of each isozyme's selectivity. To address this issue, we apply transfer learning, which uses the activity data of the other isozymes to learn a prediction model from multiple P450 isozymes jointly. We evaluate the model's predictive performance on the large-scale P450 isozyme selectivity dataset for the five isozymes. Experimental results show that, overall, our algorithm outperforms conventional supervised learning algorithms such as support vector machine (SVM), weighted k-nearest neighbor classifier, Bagging, AdaBoost, and latent semantic indexing (LSI). Moreover, our results show that the predictive performance of our algorithm improves when the activity data of multiple P450 isozymes are exploited in the learning process. Our algorithm can be an effective tool for predicting the P450 selectivity of new chemical entities from multiple P450 isozyme activity data.
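
The contrast between independent per-isozyme models and a jointly learned model can be illustrated with a minimal sketch. The abstract does not specify the authors' algorithm, so the code below swaps in a generic multi-task technique (feature augmentation with a shared block plus one private block per task, fitted by ridge regression). The function names, the random descriptors, and the synthetic labels are all hypothetical and stand in for real compound fingerprints and assay outcomes.

```python
import numpy as np

ISOZYMES = ["CYP1A2", "CYP2C9", "CYP2C19", "CYP2D6", "CYP3A4"]

def augment(X, task, n_tasks):
    """Map features to [shared | task 0 | ... | task T-1] blocks (hypothetical helper)."""
    n, d = X.shape
    out = np.zeros((n, d * (n_tasks + 1)))
    out[:, :d] = X                    # shared block, seen by every isozyme
    start = d * (task + 1)
    out[:, start:start + d] = X       # block private to this isozyme
    return out

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression on +1/-1 labels."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def fit_joint(datasets, lam=1.0):
    """Fit one weight vector on the pooled, augmented data of all isozymes."""
    n_tasks = len(datasets)
    X_aug = np.vstack([augment(X, t, n_tasks) for t, (X, _) in enumerate(datasets)])
    y_all = np.concatenate([y for _, y in datasets])
    return fit_ridge(X_aug, y_all, lam)

def predict(w, X, task, n_tasks):
    """Predict +1 (active) or -1 (inactive) for compounds against one isozyme."""
    return np.sign(augment(X, task, n_tasks) @ w)

# Synthetic stand-in data (assumption): related tasks share a common weight vector.
rng = np.random.default_rng(0)
d = 8
w_shared = rng.normal(size=d)
datasets = []
for _ in ISOZYMES:
    X = rng.normal(size=(40, d))                    # placeholder descriptors
    w_task = w_shared + 0.3 * rng.normal(size=d)    # task-specific deviation
    datasets.append((X, np.sign(X @ w_task)))       # placeholder activity labels

w = fit_joint(datasets)
X0, y0 = datasets[0]
pred0 = predict(w, X0, task=0, n_tasks=len(ISOZYMES))
train_acc0 = float(np.mean(pred0 == y0))
```

In this sketch the shared block lets data from every isozyme shape a common component, while each private block captures isozyme-specific deviations. An independent baseline would instead call `fit_ridge` on each isozyme's data alone.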

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Computational Biology
  • Cytochrome P-450 Enzyme System / metabolism*
  • Databases, Factual
  • Drug Discovery / statistics & numerical data
  • Drug Evaluation, Preclinical / statistics & numerical data*
  • Drug Interactions
  • Isoenzymes / metabolism
  • Substrate Specificity
  • Support Vector Machine

Substances

  • Isoenzymes
  • Cytochrome P-450 Enzyme System