Multiple gene expression profile alignment for microarray time-series data clustering

Bioinformatics. 2010 Sep 15;26(18):2281-8. doi: 10.1093/bioinformatics/btq422. Epub 2010 Jul 16.

Abstract

Motivation: Clustering gene expression data given as time series is a challenging problem that imposes its own particular constraints. Traditional clustering methods based on conventional similarity measures are not always suitable for time-series data. A few methods that take the temporal dimension of the data into account have recently been proposed for clustering microarray time series. The underlying principle of these methods is either to define a similarity measure appropriate for temporal expression data or to pre-process the data so that the temporal relationships between and within the time series are considered during the subsequent clustering phase.

Results: We introduce pairwise gene expression profile alignment, which vertically shifts two profiles in such a way that the area between their corresponding curves is minimal. Based on the pairwise alignment operation, we define a new distance function that is appropriate for time-series profiles. We also introduce a new clustering method that involves multiple expression profile alignment, which generalizes pairwise alignment to a set of profiles. Extensive experiments on well-known datasets yield encouraging results of at least 80% classification accuracy.
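
The alignment idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: profiles are treated as piecewise-linear curves through the measured time points, the mismatch between two curves is measured as the integral of their squared difference (which gives a closed-form optimal shift, whereas minimizing the absolute area would need a different solution), and multiple alignment is approximated by shifting every profile so that its time-averaged value is zero. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def integrate_linear(values: Sequence[float], times: Sequence[float]) -> float:
    """Trapezoidal rule, which is exact for a piecewise-linear curve."""
    return sum(
        (values[i] + values[i + 1]) * (times[i + 1] - times[i]) / 2.0
        for i in range(len(times) - 1)
    )


def pairwise_align(
    x: Sequence[float], y: Sequence[float], times: Sequence[float]
) -> Tuple[float, List[float]]:
    """Shift y vertically so the integrated squared difference to x is minimal.

    Assumption: under a squared-difference criterion the best shift is the
    time-averaged difference between the two curves.
    """
    diff = [a - b for a, b in zip(x, y)]
    shift = integrate_linear(diff, times) / (times[-1] - times[0])
    return shift, [b + shift for b in y]


def alignment_distance(
    x: Sequence[float], y: Sequence[float], times: Sequence[float]
) -> float:
    """Integrated squared difference that remains after pairwise alignment."""
    _, y_aligned = pairwise_align(x, y, times)
    r = [a - b for a, b in zip(x, y_aligned)]
    total = 0.0
    for i in range(len(times) - 1):
        h = times[i + 1] - times[i]
        # Exact integral of the square of a linear segment from r[i] to r[i+1].
        total += h * (r[i] ** 2 + r[i] * r[i + 1] + r[i + 1] ** 2) / 3.0
    return total


def multiple_align(
    profiles: Sequence[Sequence[float]], times: Sequence[float]
) -> List[List[float]]:
    """Assumed generalization: centre each profile on its time-averaged value."""
    span = times[-1] - times[0]
    aligned = []
    for p in profiles:
        mean = integrate_linear(p, times) / span
        aligned.append([v - mean for v in p])
    return aligned
```

Under these assumptions, two profiles that differ only by a constant offset have distance zero after alignment, so a clustering step built on `alignment_distance` would group genes by the shape of their expression curves rather than by their absolute expression level.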

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Gene Expression
  • Gene Expression Profiling / methods*
  • Oligonucleotide Array Sequence Analysis / methods*