Model-based clustering on the unit sphere with an illustration using gene expression profiles

Biostatistics. 2008 Jan;9(1):66-80. doi: 10.1093/biostatistics/kxm012. Epub 2007 Apr 27.

Abstract

We consider model-based clustering of data that lie on a unit sphere. Such data arise in the analysis of microarray experiments when the expression profile of each gene is standardized to have mean 0 and variance 1 across the arrays. We propose to model the clusters on the sphere with inverse stereographic projections of multivariate normal distributions. The corresponding model-based clustering algorithm is described. The algorithm is applied first to simulated data sets, both to assess the performance of several criteria for determining the number of clusters and to compare it with existing methods, and second to a real reference data set of standardized gene expression profiles.
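
The two geometric facts the abstract relies on can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the use of the population variance (ddof=0), the choice of the north pole as projection point, and the example mean and covariance are all assumptions made for illustration, and the paper's exact parameterization may differ. The sketch shows that standardized profiles, once divided by the square root of the number of arrays, have unit length, and that an inverse stereographic projection carries multivariate normal draws onto the unit sphere.

```python
import numpy as np

def standardize_rows(expr):
    """Standardize each gene (row) to mean 0 and variance 1 across arrays,
    then rescale so that every profile has unit Euclidean length."""
    centered = expr - expr.mean(axis=1, keepdims=True)
    # Assumption: population variance (ddof=0), so each row's squared norm equals p.
    scaled = centered / centered.std(axis=1, keepdims=True)
    return scaled / np.sqrt(expr.shape[1])

def inverse_stereographic(y):
    """Map points y in R^d onto the unit sphere in R^(d+1).
    Assumption: projection from the north pole onto the equatorial plane."""
    sq = np.sum(y ** 2, axis=1, keepdims=True)
    return np.hstack([2.0 * y, sq - 1.0]) / (sq + 1.0)

rng = np.random.default_rng(0)

# Hypothetical expression matrix: 5 genes measured on 8 arrays.
expr = rng.normal(size=(5, 8))
profiles = standardize_rows(expr)

# Hypothetical cluster: bivariate normal draws mapped onto the sphere in R^3.
draws = rng.multivariate_normal([0.5, -0.2], 0.05 * np.eye(2), size=100)
cluster = inverse_stereographic(draws)
```

Because each standardized profile also sums to zero, the profiles lie on a lower-dimensional sphere inside the hyperplane orthogonal to the vector of ones, which is why a distribution defined directly on the sphere is a natural model for them.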

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Cell Cycle / genetics
  • Cluster Analysis*
  • Computer Simulation
  • Gene Expression Profiling / methods*
  • Models, Genetic*
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis / methods
  • Yeasts / genetics