Compositional data analysis in epidemiology

Stat Methods Med Res. 2018 Jun;27(6):1878-1891. doi: 10.1177/0962280216671536. Epub 2016 Oct 6.

Abstract

Compositional data analysis refers to the analysis of relative information, based on ratios between the variables in a data set. Data from epidemiology are usually treated as absolute information in an analysis. We outline the differences between the two approaches for univariate and multivariate statistical analyses, using illustrative data sets from Austrian districts. Not only can the results of the analyses differ; above all, their interpretation differs. We demonstrate that the compositional data analysis approach leads to new and interesting insights.

Keywords: Euclidean geometry; Log-ratio approach; compositional data; isometric log-ratio coordinates; multivariate statistics.
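
The log-ratio idea named in the abstract and keywords can be illustrated with a minimal sketch. It assumes NumPy and strictly positive parts; the function names (`closure`, `clr`, `pivot_coordinates`) and the three-part data vector are hypothetical and made up for illustration, not taken from the article. Pivot coordinates are used here as one common choice of isometric log-ratio coordinates; whether the article uses exactly this basis is not confirmed by the abstract.

```python
import numpy as np

def closure(x):
    """Rescale a positive vector so that its parts sum to 1 (relative shares)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def clr(x):
    """Centered log-ratio: log of each part minus the mean of all logs."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

def pivot_coordinates(x):
    """One choice of isometric log-ratio (ilr) coordinates.

    Coordinate j contrasts part j with the geometric mean of the parts
    that follow it, scaled so that the resulting basis is orthonormal.
    """
    logx = np.log(np.asarray(x, dtype=float))
    D = logx.size
    z = np.empty(D - 1)
    for j in range(D - 1):
        remaining = D - j - 1  # number of parts after part j (0-based index)
        scale = np.sqrt(remaining / (remaining + 1))
        z[j] = scale * (logx[j] - logx[j + 1:].mean())
    return z

# Hypothetical counts for three categories in one district (made-up numbers).
counts = np.array([120.0, 45.0, 35.0])

z_counts = pivot_coordinates(counts)           # from absolute counts
z_shares = pivot_coordinates(closure(counts))  # from relative shares

# Only ratios matter: rescaling the data leaves the coordinates unchanged.
print(np.allclose(z_counts, z_shares))

# Isometry: the Euclidean norm of the ilr coordinates equals that of the clr.
print(np.isclose(np.linalg.norm(z_counts), np.linalg.norm(clr(counts))))
```

Because the coordinates depend only on ratios between parts, standard Euclidean methods applied to them describe relative rather than absolute information, which is the change in interpretation the abstract refers to.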

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Austria
  • Confounding Factors, Epidemiologic
  • Data Analysis*
  • Data Interpretation, Statistical
  • Epidemiologic Studies*
  • Multivariate Analysis