Identifying Gene Signatures for Cancer Drug Repositioning Based on Sample Clustering

IEEE/ACM Trans Comput Biol Bioinform. 2022 Mar-Apr;19(2):953-965. doi: 10.1109/TCBB.2020.3019781. Epub 2022 Apr 1.

Abstract

Drug repositioning is an important approach for drug discovery. Computational drug repositioning approaches typically use a gene signature to represent a particular disease and connect the gene signature with drug perturbation profiles. Although disease samples, especially from cancer, may be heterogeneous, most existing methods consider them as a homogeneous set to identify differentially expressed genes (DEGs)for further determining a gene signature. As a result, some genes that should be in a gene signature may be averaged off. In this study, we propose a new framework to identify gene signatures for cancer drug repositioning based on sample clustering (GS4CDRSC). GS4CDRSC first groups samples into several clusters based on their gene expression profiles. Second, an existing method is applied to the samples in each cluster for generating a list of DEGs. Then a weighting approach is used to identify an intergrated gene signature from all the lists of DEGs. The integrated gene signature is used to connect with drug perturbation profiles in the Connectivity Map (CMap)database to generate a list of drug candidates. GS4CDRSC has been tested with several cancer datasets and existing methods. The computational results show that GS4CDRSC outperforms those methods without the sample clustering and weighting approaches in terms of both number and rate of predicted known drugs for specific cancers.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Databases, Factual
  • Drug Repositioning* / methods
  • Humans
  • Neoplasms* / drug therapy
  • Neoplasms* / genetics
  • Transcriptome