Substituting missing data in compositional analysis

Environ Pollut. 2011 Oct;159(10):2797-800. doi: 10.1016/j.envpol.2011.05.006. Epub 2011 Jun 8.

Abstract

Multivariate analysis of environmental data sets requires the absence of missing values or their substitution by small values. However, if the data are log-transformed before the analysis, this solution cannot be applied, because the logarithm of a small value may become an outlier. Several methods for substituting missing values can be found in the literature, although none of them guarantees that the structure of the data set is left undistorted. We propose a method for assessing these distortions, which can be used to decide whether to retain the samples or variables containing missing values and to investigate the performance of different substitution techniques. The method analyzes the structure of the distances among samples using Mantel tests. We present an application of the method to PCDD/F data measured in samples of terrestrial moss as part of a biomonitoring study.
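As a rough illustration of the idea of comparing distance structures with a Mantel test, the following Python sketch may help. It is not the authors' implementation: the abstract does not specify the distance measure, the substitution rule, or the test settings. The sketch assumes Aitchison distances (Euclidean distances between centred log-ratio transformed samples), a Pearson-based Mantel statistic with a one-sided permutation p-value, synthetic data, and a hypothetical detection limit with an arbitrary substitution fraction of 0.5. The function names clr, distance_matrix and mantel are illustrative only.

```python
import numpy as np


def clr(x):
    """Centred log-ratio transform of strictly positive rows (assumed transform)."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)


def distance_matrix(x):
    """Euclidean distances between clr-transformed samples (Aitchison distance)."""
    z = clr(x)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def mantel(d1, d2, n_perm=999, seed=0):
    """Pearson Mantel statistic with a one-sided permutation p-value.

    Rows and columns of d2 are permuted together, so each permutation
    corresponds to relabelling the samples.
    """
    rng = np.random.default_rng(seed)
    n = d1.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of samples counted once
    v1 = d1[upper]
    r_obs = np.corrcoef(v1, d2[upper])[0, 1]
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        r_perm = np.corrcoef(v1, d2[perm][:, perm][upper])[0, 1]
        if r_perm >= r_obs:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return r_obs, p_value


# Synthetic example: 20 samples, 6 variables (not real PCDD/F data).
rng = np.random.default_rng(42)
full = rng.lognormal(mean=0.0, sigma=1.0, size=(20, 6))

# Hypothetical detection limit; values below it are treated as missing
# and replaced by an arbitrary fraction (0.5) of the limit.
limit = np.quantile(full, 0.1)
substituted = np.where(full < limit, 0.5 * limit, full)

r, p = mantel(distance_matrix(full), distance_matrix(substituted))
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

In this reading, a Mantel correlation close to 1 suggests that the substitution has left the distances among samples largely intact, while a lower value points to distortion. With real data the complete values are unknown, so the two matrices being compared would have to come from elsewhere, for example from two substitution techniques, or from the data set with and without the affected variables.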

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benzofurans / analysis
  • Bryopsida / chemistry
  • Dibenzofurans, Polychlorinated
  • Environmental Monitoring / methods*
  • Environmental Pollution / statistics & numerical data*
  • Multivariate Analysis
  • Polychlorinated Dibenzodioxins / analogs & derivatives
  • Polychlorinated Dibenzodioxins / analysis
  • Statistics as Topic*

Substances

  • Benzofurans
  • Dibenzofurans, Polychlorinated
  • Polychlorinated Dibenzodioxins