Multi-way clustering of microarray data using probabilistic sparse matrix factorization

Bioinformatics. 2005 Jun;21 Suppl 1:i144-51. doi: 10.1093/bioinformatics/bti1041.

Abstract

Motivation: We address the problem of multi-way clustering of microarray data using a generative model. Our algorithm, probabilistic sparse matrix factorization (PSMF), is a probabilistic extension of a previous hard-decision algorithm for this problem. PSMF accounts for three sources of uncertainty: varying levels of sensor noise in the data, uncertainty in the hidden prototypes used to explain the data, and uncertainty as to which prototypes are selected to explain each data vector.
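
The abstract does not give the model's equations, so the following is only a minimal sketch of what a sparse matrix factorization generative model might look like. It assumes the data matrix is a sparse loading matrix times a small set of prototype profiles, plus Gaussian noise whose level varies per row. The function name, the parameters, and every distribution below are illustrative assumptions, not the authors' specification, and the code only samples data. It does not perform PSMF inference.

```python
import numpy as np


def sample_sparse_factorization(n_genes=50, n_conditions=12, n_prototypes=4,
                                max_active=2, seed=0):
    """Sample X = Y @ Z + noise with at most `max_active` prototypes per row.

    All distributional choices here are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)

    # Hidden prototypes: one expression profile per row of Z.
    Z = rng.normal(size=(n_prototypes, n_conditions))

    # Sparse loadings: each gene selects a few prototypes and weights them.
    Y = np.zeros((n_genes, n_prototypes))
    for g in range(n_genes):
        k = rng.integers(1, max_active + 1)  # number of active prototypes
        chosen = rng.choice(n_prototypes, size=k, replace=False)
        Y[g, chosen] = rng.normal(size=k)

    # Varying noise levels: one standard deviation per gene (assumed range).
    noise_sd = rng.uniform(0.05, 0.5, size=n_genes)
    noise = rng.normal(size=(n_genes, n_conditions)) * noise_sd[:, None]

    X = Y @ Z + noise
    return X, Y, Z, noise_sd


if __name__ == "__main__":
    X, Y, Z, noise_sd = sample_sparse_factorization()
    print("data matrix shape:", X.shape)
    print("active prototypes per gene (first 5):",
          np.count_nonzero(Y[:5], axis=1))
```

In this sketch, a probabilistic treatment would place distributions over `Y`, `Z`, and the per-gene prototype choices rather than committing to single values, which is the contrast with a hard-decision algorithm that the abstract draws.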

Results: We present experimental results demonstrating that our method recovers functionally relevant clusterings in mRNA expression data better than standard clustering techniques, including hierarchical agglomerative clustering. We also show that, by computing probabilities instead of point estimates, our method avoids converging to poor solutions.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Cluster Analysis*
  • Computational Biology / methods*
  • Genome
  • Humans
  • Likelihood Functions
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Probability
  • RNA, Messenger / metabolism
  • Software

Substances

  • RNA, Messenger