Compositional data in neuroscience: If you've got it, log it!

J Neurosci Methods. 2016 Sep 15:271:154-9. doi: 10.1016/j.jneumeth.2016.07.008. Epub 2016 Jul 20.

Abstract

Background: Compositional data sum to a constant value, for example, 100%. Such data are common in neuroscience, for example, when estimating the percentage of time spent on a behavioural response in a limited-choice situation or the percentage of a neurochemical within brain tissue. Compositional data have a distinct structure that complicates analysis and makes standard statistical analyses, such as general linear model analyses and principal components or factor analysis (whether Q-mode or R-mode), inappropriate. This is a result of the correlation of the components, the dependence of the pairwise covariance on which other components are included in the analysis, and the bounded nature of the data.
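
The correlation and sub-composition problems described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the raw amounts are simulated as independent gamma variables, and the helper name `closure`, the seed, and the sample size are hypothetical choices for illustration, not anything taken from the paper.

```python
import numpy as np


def closure(x):
    """Rescale each row so that its parts sum to 1 (a composition)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


# Assumption: three independent, strictly positive raw amounts per subject.
rng = np.random.default_rng(0)
raw = rng.gamma(shape=2.0, scale=1.0, size=(10_000, 3))

# Closing the data forces the parts to compete for a fixed total.
comp = closure(raw)

# Correlation between parts 1 and 2 before and after closure.
raw_corr = np.corrcoef(raw[:, 0], raw[:, 1])[0, 1]
closed_corr = np.corrcoef(comp[:, 0], comp[:, 1])[0, 1]

# Drop part 3 and re-close: the same pair now has a different correlation,
# so the result depends on which other components are included.
sub = closure(raw[:, :2])
sub_corr = np.corrcoef(sub[:, 0], sub[:, 1])[0, 1]

print(f"raw: {raw_corr:.3f}  closed (3 parts): {closed_corr:.3f}  "
      f"closed (2 parts): {sub_corr:.3f}")
```

In this sketch the raw amounts are essentially uncorrelated, the closed three-part composition shows a clearly negative correlation, and the two-part sub-composition is perfectly negatively correlated because one part is simply one minus the other.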

New method: This problem has been recognised for decades in disciplines such as geology and zoology, where log ratio methods have been successfully applied. The isometric log ratio (ilr) method has some particular advantages.
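
As a rough illustration of the log ratio transformations named in the keywords, the sketch below implements the additive (alr), centred (clr), and isometric (ilr) log ratios with NumPy. The abstract does not specify an ilr basis, so the Helmert-type orthonormal basis used here is an assumption (one common choice among many), and the function names and the example composition are hypothetical. All parts must be strictly positive, since the transformations take logarithms.

```python
import numpy as np


def alr(x):
    """Additive log ratio: log of each part relative to the last part."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx[..., :-1] - logx[..., -1:]


def clr(x):
    """Centred log ratio: log of each part relative to the geometric mean."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)


def ilr_basis(n_parts):
    """Helmert-type orthonormal basis (assumed choice), shape (D, D - 1)."""
    basis = np.zeros((n_parts, n_parts - 1))
    for i in range(1, n_parts):
        scale = np.sqrt(i / (i + 1))
        basis[:i, i - 1] = scale / i   # first i parts share equal weight
        basis[i, i - 1] = -scale       # contrasted against part i + 1
    return basis


def ilr(x):
    """Isometric log ratio: clr coordinates projected onto the basis."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[-1])


# Hypothetical example: percentage of time spent on three behaviours.
time_pct = np.array([[50.0, 30.0, 20.0],
                     [10.0, 60.0, 30.0]])

print("alr:", alr(time_pct))
print("clr:", clr(time_pct))
print("ilr:", ilr(time_pct))
```

The ilr coordinates have one fewer column than the composition has parts, do not change if the data are rescaled (for example, from percentages to proportions), and preserve the distances between compositions that the clr defines. For a two-part composition, the alr reduces to the logit of the first proportion.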

Comparison with existing method: Classical statistical methods such as t-tests, ANOVAs, and multivariate analyses are invalid when applied to compositional data.

Conclusions: The compositional data analysis methods developed by statisticians and used by geologists and zoologists should be considered for analysing compositional data in neuroscience.

Keywords: Additive log ratio; Centred log ratio; Compositional data; Isometric log ratio; Logit transformation; Statistical dependence.

Publication types

  • Review

MeSH terms

  • Animals
  • Data Interpretation, Statistical*
  • Humans
  • Neurosciences / methods*