CoINcIDE: A framework for discovery of patient subtypes across multiple datasets

Genome Med. 2016 Mar 9;8(1):27. doi: 10.1186/s13073-016-0281-4.

Abstract

Patient disease subtypes have the potential to transform personalized medicine. However, many patient subtypes derived from unsupervised clustering analyses on high-dimensional datasets are not replicable across multiple datasets, limiting their clinical utility. We present CoINcIDE, a novel methodological framework for the discovery of patient subtypes across multiple datasets that requires no between-dataset transformations. We also present a high-quality database collection, curatedBreastData, with over 2,500 breast cancer gene expression samples. We use CoINcIDE to discover novel breast and ovarian cancer subtypes with prognostic significance and novel hypothesized ovarian therapeutic targets across multiple datasets. CoINcIDE and curatedBreastData are available as R packages.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Breast Neoplasms / diagnosis
  • Breast Neoplasms / genetics
  • Cluster Analysis
  • Computational Biology / methods*
  • Computer Simulation
  • Datasets as Topic*
  • Female
  • Gene Expression Profiling
  • Humans
  • Ovarian Neoplasms / diagnosis
  • Ovarian Neoplasms / genetics
  • Prognosis
  • ROC Curve
  • Software*