Predicting target proteins for drug candidate compounds based on drug-induced gene expression data in a chemical structure-independent manner

BMC Med Genomics. 2015 Dec 18;8:82. doi: 10.1186/s12920-015-0158-1.

Abstract

Background: Phenotype-based high-throughput screening is a useful technique for identifying drug candidate compounds that produce a desired phenotype. However, the molecular mechanisms of the hit compounds remain unknown, and substantial effort is required to identify the target proteins associated with the phenotype.

Methods: In this study, we propose a new method, which we call the "transcriptomic approach," to predict target proteins of drug candidate compounds by applying a machine learning classification technique to drug-induced gene expression data in the Connectivity Map.
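The general idea of classifying compounds by their expression signatures can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the classifier, so a plain L2-regularized logistic regression stands in for it, the six-gene signatures are synthetic rather than Connectivity Map profiles, and the function names (train_logistic, predict_proba) are hypothetical. One such binary classifier would be trained per target protein.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.1, epochs=500, l2=0.01):
    """Fit a logistic regression by batch gradient descent.

    X: list of expression signatures (one list of floats per compound)
    y: 1 if the compound is known to act on the target, else 0
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    # Estimated probability that a compound with signature x hits the target.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Synthetic training data (assumption): compounds acting on the target
# up-regulate gene 0 and down-regulate gene 1; the others show only noise.
random.seed(0)
N_GENES = 6
positives = [[random.gauss(m, 0.5) for m in (1.5, -1.5, 0, 0, 0, 0)] for _ in range(20)]
negatives = [[random.gauss(0, 0.5) for _ in range(N_GENES)] for _ in range(20)]
X = positives + negatives
y = [1] * 20 + [0] * 20

w, b = train_logistic(X, y)

# Score two query signatures; no chemical structure information is used.
query_hit = [1.5, -1.5, 0.0, 0.0, 0.0, 0.0]
query_miss = [0.0] * N_GENES
print(predict_proba(w, b, query_hit), predict_proba(w, b, query_miss))
```

The query compounds are scored from their expression signatures alone, which is what makes this kind of prediction independent of chemical structure.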

Results: Unlike existing methods such as the chemogenomic approach, the transcriptomic approach enabled the prediction of target proteins without dependence on prior knowledge of compound chemical structures. The prediction accuracy of the chemogenomic approach depended heavily on the structural similarity of the compounds in the data sets. In contrast, the prediction accuracy of the transcriptomic approach was maintained at a sufficient level, even for benchmark data consisting of structurally diverse compounds.
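The contrast between the two kinds of similarity can be made concrete with a small sketch. It is illustrative only: the abstract does not say which similarity measures were used, so the Tanimoto coefficient on fingerprint bit sets and the Pearson correlation on expression signatures are assumed stand-ins, and the fingerprints and signatures below are invented toy values.

```python
import math

def tanimoto(fp_a, fp_b):
    # Tanimoto (Jaccard) coefficient between two sets of "on" fingerprint bits.
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 0.0

def pearson(u, v):
    # Pearson correlation between two equal-length expression signatures.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

# Toy example (assumption): two compounds with no shared substructure bits
# that nevertheless induce nearly the same expression changes.
fp_1, fp_2 = {1, 2, 3}, {7, 8, 9}
sig_1 = [1.2, -0.8, 0.1, 2.0]
sig_2 = [1.0, -1.0, 0.0, 1.8]

structure_sim = tanimoto(fp_1, fp_2)
expression_sim = pearson(sig_1, sig_2)
print(structure_sim, expression_sim)
```

In this toy case a structure-based method would see the two compounds as unrelated, while an expression-based method would see them as close neighbors.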

Conclusions: The transcriptomic approach reported here is expected to be a useful tool for structure-independent prediction of target proteins for drug candidate compounds.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antineoplastic Agents* / chemistry
  • Antineoplastic Agents* / pharmacokinetics
  • Antineoplastic Agents* / pharmacology
  • Drug Screening Assays, Antitumor / methods
  • Gene Expression Regulation, Neoplastic / drug effects*
  • HL-60 Cells
  • Humans
  • MCF-7 Cells
  • Machine Learning*
  • Neoplasms* / drug therapy
  • Neoplasms* / genetics
  • Neoplasms* / metabolism
  • Structure-Activity Relationship
  • Transcriptome*

Substances

  • Antineoplastic Agents