Machine Learning Helps Identify New Drug Mechanisms in Triple-Negative Breast Cancer

IEEE Trans Nanobioscience. 2018 Jul;17(3):251-259. doi: 10.1109/TNB.2018.2851997. Epub 2018 Jul 2.

Abstract

This paper demonstrates the ability of machine learning approaches to identify a few genes among the 23,398 genes of the human genome to experiment on in the laboratory to establish new drug mechanisms. As a case study, this paper uses MDA-MB-231 breast cancer single-cells treated with the antidiabetic drug metformin. We show that mixture-model-based unsupervised methods with validation from hierarchical clustering can identify single-cell subpopulations (clusters). These clusters are characterized by a small set of genes (1% of the genome) that have significant differential expression across the clusters and are also highly correlated with pathways with anticancer effects driven by metformin. Among the identified small set of genes associated with reduced breast cancer incidence, laboratory experiments on one of the genes, CDC42, showed that its downregulation by metformin inhibited cancer cell migration and proliferation, thus validating the ability of machine learning approaches to identify biologically relevant candidates for laboratory experiments. Given the large size of the human genome and limitations in cost and skilled resources, the broader impact of this work in identifying a small set of differentially expressed genes after drug treatment lies in augmenting the drug-disease knowledge of pharmacogenomics experts in laboratory investigations, which could help establish novel biological mechanisms associated with drug response in diseases beyond breast cancer.
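
The workflow summarized in the abstract (mixture-model clustering of single cells, a cross-check with hierarchical clustering, then selection of a small fraction of differentially expressed genes) can be illustrated with a minimal Python sketch. The abstract does not state which mixture model, feature reduction, or statistical test the authors used, so everything below is an assumption made for illustration: a Gaussian mixture from scikit-learn fitted on PCA-reduced data, Ward-linkage hierarchical clustering from SciPy as the cross-check, agreement measured by the adjusted Rand index, and a per-gene one-way ANOVA for ranking. The data are synthetic, and the function names and parameters are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import f_oneway
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture


def make_synthetic_expression(n_cells=200, n_genes=300, n_marker=10, seed=0):
    """Build a synthetic cells x genes matrix with two subpopulations.

    Illustrative only: the first `n_marker` genes are shifted upward
    in the second half of the cells; all other genes are pure noise.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 1.0, size=(n_cells, n_genes))
    X[n_cells // 2:, :n_marker] += 4.0
    return X


def cluster_and_rank(X, n_clusters=2, n_pcs=3, top_fraction=0.01, seed=0):
    """Cluster cells, cross-check the clustering, and rank genes.

    Returns the mixture-model labels, the adjusted Rand index between the
    mixture-model and hierarchical labels, and the indices of the top genes.
    """
    # Reduce dimensionality before clustering (an assumed preprocessing step).
    Z = PCA(n_components=n_pcs, random_state=seed).fit_transform(X)

    # Mixture-model clustering of cells.
    gmm = GaussianMixture(n_components=n_clusters, covariance_type="diag",
                          n_init=5, random_state=seed)
    gmm_labels = gmm.fit_predict(Z)

    # Hierarchical clustering (Ward linkage) cut into the same number of clusters.
    hc_labels = fcluster(linkage(Z, method="ward"),
                         t=n_clusters, criterion="maxclust")

    # Agreement between the two partitions; 1.0 means identical up to relabeling.
    ari = adjusted_rand_score(gmm_labels, hc_labels)

    # Per-gene one-way ANOVA across the mixture-model clusters.
    groups = [X[gmm_labels == k] for k in range(n_clusters)]
    _, pvals = f_oneway(*groups)

    # Keep the smallest p-values, here 1% of the genes by default.
    n_top = max(1, int(round(top_fraction * X.shape[1])))
    top_genes = np.argsort(pvals)[:n_top]
    return gmm_labels, ari, top_genes


if __name__ == "__main__":
    X = make_synthetic_expression()
    labels, ari, top_genes = cluster_and_rank(X)
    print("cluster sizes:", np.bincount(labels))
    print("agreement (ARI):", round(float(ari), 3))
    print("top genes:", sorted(top_genes.tolist()))
```

On real single-cell data, the number of clusters would need to be chosen (for example by an information criterion), and the p-values would need correction for multiple testing across genes; neither step is shown here.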

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Antineoplastic Agents / pharmacology*
  • Cell Line, Tumor
  • Cluster Analysis
  • Female
  • Gene Expression Profiling / methods
  • Gene Expression Regulation, Neoplastic / drug effects*
  • Genomics / methods
  • Humans
  • Metformin / pharmacology
  • Single-Cell Analysis / methods*
  • Triple Negative Breast Neoplasms* / genetics
  • Triple Negative Breast Neoplasms* / metabolism
  • Unsupervised Machine Learning*

Substances

  • Antineoplastic Agents
  • Metformin