Integrative clustering by nonnegative matrix factorization can reveal coherent functional groups from gene profile data

IEEE J Biomed Health Inform. 2015 Mar;19(2):698-708. doi: 10.1109/JBHI.2014.2316508. Epub 2014 Apr 10.

Abstract

Recent developments in molecular biology and in techniques for genome-wide data acquisition have produced an abundance of data for profiling genes and predicting their function. These datasets may come from diverse sources, and it remains an open question how to treat them jointly and fuse them into a single prediction model. A prevailing technique for identifying groups of related genes that exhibit similar profiles is profile-based clustering. Cluster inference may benefit from consensus across different clustering models. In this paper, we propose a technique that develops separate gene clusters from each of the available data sources and then fuses them by means of nonnegative matrix factorization. We use gene profile data on the budding yeast S. cerevisiae to demonstrate that this approach can successfully integrate heterogeneous datasets and yield high-quality clusters that could not otherwise be inferred by simply merging the gene profiles prior to clustering.
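
The two-stage idea described in the abstract (cluster each data source separately, then fuse the clusterings by nonnegative matrix factorization) can be illustrated with a minimal sketch. The abstract does not give the formulation, so everything below is an assumption made for illustration: k-means as the per-source clusterer, an averaged gene-by-gene co-clustering matrix as the object being factorized, Lee-Seung multiplicative updates for the factorization, and the hypothetical function names `kmeans_labels`, `connectivity`, `nmf`, and `fuse_clusterings`. The toy data is synthetic, not yeast data.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means (assumed base clusterer); one label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every gene profile to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def connectivity(labels):
    """Binary gene-by-gene matrix: 1 where two genes share a cluster."""
    return (labels[:, None] == labels[None, :]).astype(float)


def nmf(A, k, rng, n_iter=500, eps=1e-9):
    """Factorize A ~ W @ H with multiplicative updates (Frobenius loss)."""
    n, m = A.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def fuse_clusterings(sources, k, seed=0):
    """Cluster each source separately, then fuse the results through NMF.

    sources: list of (n_genes x n_features) arrays, rows aligned by gene.
    """
    rng = np.random.default_rng(seed)
    # Average the per-source co-clustering matrices into one consensus matrix.
    consensus = np.mean(
        [connectivity(kmeans_labels(X, k, rng)) for X in sources], axis=0
    )
    W, _ = nmf(consensus, k, rng)
    # Assign each gene to the factor with the largest loading.
    return W.argmax(axis=1)


if __name__ == "__main__":
    # Synthetic toy data: 30 "genes", two sources with different feature counts.
    rng = np.random.default_rng(42)
    groups = np.repeat([0, 1, 2], 10)
    source_a = rng.normal(size=(30, 5)) + 4.0 * groups[:, None]
    source_b = rng.normal(size=(30, 8)) - 4.0 * groups[:, None]
    print(fuse_clusterings([source_a, source_b], k=3))
```

Note that the sources are never concatenated: each contributes only its own clustering, so datasets with different feature counts or scales can be combined without rescaling them against each other. This is a sketch of the general consensus approach, not a reproduction of the method evaluated in the paper.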

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computational Biology / methods*
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Models, Statistical*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae / metabolism