Identifying "Many-to-Many" Relationships between Gene-Expression Data and Drug-Response Data via Sparse Binary Matching

IEEE/ACM Trans Comput Biol Bioinform. 2020 Jan-Feb;17(1):165-176. doi: 10.1109/TCBB.2018.2849708. Epub 2018 Jun 22.

Abstract

Identifying gene-drug patterns is a critical step in pharmacology, both for unveiling disease mechanisms and for drug discovery. High-throughput technologies have produced massive pharmacological and genomic datasets, which provide a substantial new opportunity to understand how oncogenic genes and therapeutic drugs relate to each other. However, most previous studies inferred gene-drug patterns from the pharmacological and genomic datasets alone, without any prior knowledge. Here, we propose a novel network-guided sparse binary matching model (NSBM) to decode the relationships hidden in these datasets. Our method jointly analyzes large-scale gene-expression and drug-response data, and integrates additional prior information on genes and drugs in the form of network-based regularization. At its core, the NSBM model is a convex quadratic minimization problem with network-based penalties. In extensive experiments on both synthetic and empirical data, NSBM outperformed two benchmark methods. Posterior validation, including gene-ontology and enrichment analysis, confirmed the effectiveness of NSBM in revealing gene-drug patterns in a large-scale heterogeneous data source.
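
The abstract does not state the NSBM objective, so the following is only a minimal sketch of the general idea behind a "convex quadratic minimization problem with network-based penalties", not the authors' model. It assumes a generic graph-Laplacian penalty, the sum over network edges of (w_i - w_j)^2, added to a squared-error fit term, and minimizes the result by plain gradient descent. All names (`laplacian_penalty`, `objective`, `minimize`), the toy network, and the parameter values are hypothetical.

```python
def laplacian_penalty(weights, edges):
    """Sum of (w_i - w_j)^2 over edges; equals w^T L w for an unweighted graph."""
    return sum((weights[i] - weights[j]) ** 2 for i, j in edges)


def objective(weights, targets, edges, lam):
    """Squared-error fit to targets plus lam times the network penalty (convex quadratic)."""
    fit = sum((w - t) ** 2 for w, t in zip(weights, targets))
    return fit + lam * laplacian_penalty(weights, edges)


def minimize(targets, edges, lam, step=0.05, iters=2000):
    """Gradient descent on the objective above, starting from the targets."""
    w = list(targets)
    for _ in range(iters):
        # Gradient of the fit term.
        grad = [2.0 * (wi - ti) for wi, ti in zip(w, targets)]
        # Gradient of the network penalty: each edge pulls its endpoints together.
        for i, j in edges:
            d = 2.0 * lam * (w[i] - w[j])
            grad[i] += d
            grad[j] -= d
        w = [wi - step * g for wi, g in zip(w, grad)]
    return w


# Toy prior network (assumed): nodes 0-1-2 form a path, node 3 is isolated.
edges = [(0, 1), (1, 2)]
targets = [1.0, 0.0, 0.0, 5.0]
smoothed = minimize(targets, edges, lam=1.0)
print([round(v, 3) for v in smoothed])
```

In this toy case the penalty spreads the signal at node 0 across its network neighbours, while the isolated node 3 is left unchanged. This is the qualitative effect a network-based regularizer is meant to have: scores for entities that are connected in the prior network are encouraged to be similar.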

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Antineoplastic Agents / pharmacology
  • Computational Biology
  • Databases, Genetic
  • Gene Expression Profiling
  • Genes, Neoplasm / drug effects
  • Genes, Neoplasm / genetics
  • Humans
  • Pharmacogenetics / methods*
  • Transcriptome* / drug effects
  • Transcriptome* / genetics

Substances

  • Antineoplastic Agents