Agglomerative joint clustering of metabolic data with spike at zero: A Bayesian perspective

Biom J. 2016 Mar;58(2):387-96. doi: 10.1002/bimj.201400110. Epub 2015 Jun 22.

Abstract

In many biological applications, for example high-dimensional metabolic data, the data consist of continuous measurements on subjects or tissues across multiple attributes or metabolites. The measured values are arranged in a matrix with subjects in rows and attributes in columns. Analyzing such data requires grouping subjects and attributes to provide a preliminary guide toward data modeling. A common approach is to group subjects and attributes separately and to construct a two-dimensional dendrogram tree, once on rows and then on columns. This simple approach visualizes the grouping through two separate trees, which are difficult to interpret jointly. When a joint grouping of rows and columns is of interest, it is more natural to partition the data matrix directly. Our suggestion is to build a dendrogram on the matrix directly, thus generalizing the two-dimensional dendrogram tree to a three-dimensional forest. The contribution of this research to the statistical analysis of metabolic data is threefold. First, a novel spike-and-slab model in various hierarchies is proposed to identify discriminant rows and columns. Second, an agglomerative approach is suggested to organize joint clusters. Third, a new visualization tool is introduced to display the collection of joint clusters. The new method is motivated by gas chromatography mass spectrometry (GCMS) metabolic data, but can be applied to other continuous measurements with a spike-at-zero property.
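
For orientation, the common baseline described in the abstract (clustering rows and columns separately) can be sketched as follows. This is a minimal illustration only, not the joint spike-and-slab method proposed in the paper: the function name `agglomerate`, the choice of average linkage with Euclidean distance, and the toy matrix values are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def agglomerate(points):
    """Naive average-linkage agglomerative clustering (illustrative only).

    Returns the merge history as a list of (cluster_a, cluster_b, distance),
    where each cluster is a tuple of original indices.
    """
    def avg_link(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_link(*pair))
        merges.append((a, b, avg_link(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Hypothetical toy matrix: rows = subjects, columns = metabolites.
# Exact zeros mimic the spike-at-zero property of the measurements.
data = [
    [0.0, 0.0, 5.1, 4.9],
    [0.0, 0.1, 5.0, 5.2],
    [3.0, 3.2, 0.0, 0.0],
    [2.9, 3.1, 0.0, 0.0],
]

row_merges = agglomerate(data)                               # tree on subjects
col_merges = agglomerate([list(c) for c in zip(*data)])      # tree on metabolites

for a, b, d in row_merges:
    print("rows", a, "+", b, "at distance", round(d, 3))
for a, b, d in col_merges:
    print("cols", a, "+", b, "at distance", round(d, 3))
```

The two merge histories are computed independently of each other, which is why the resulting pair of trees is hard to read as a single joint partition of the matrix.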

Keywords: Agglomerative clustering; Bayesian clustering; Dendrogram; Metabolic data; Spike-and-slab model.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics
  • Arabidopsis / metabolism
  • Bayes Theorem
  • Cluster Analysis
  • Gas Chromatography-Mass Spectrometry
  • Metabolomics*
  • Mutation
  • Statistics as Topic / methods*