Analysis of T-RFLP data using analysis of variance and ordination methods: a comparative study

J Microbiol Methods. 2008 Sep;75(1):55-63. doi: 10.1016/j.mimet.2008.04.011. Epub 2008 May 16.

Abstract

The analysis of T-RFLP data has developed considerably over the last decade, but there remains a lack of consensus about which statistical analyses offer the best means for finding trends in these data. In this study, we used ten diverse T-RFLP datasets derived from soil microbial communities to compare, empirically and theoretically, the more common ordination methods in the literature: principal component analysis (PCA), nonmetric multidimensional scaling (NMS) with Sørensen, Jaccard and Euclidean distance measures, correspondence analysis (CA), detrended correspondence analysis (DCA) and a technique new to T-RFLP data analysis, the Additive Main Effects and Multiplicative Interaction (AMMI) model. Our objectives were i) to determine the distribution of variation in T-RFLP datasets using analysis of variance (ANOVA), ii) to determine the more robust and informative multivariate ordination methods for analyzing T-RFLP data, and iii) to compare the methods based on theoretical considerations. For the ten datasets examined in this study, ANOVA revealed that the variation from Environment main effects was always small, variation from T-RF main effects was large, and variation from T-RF × Environment (TxE) interactions was intermediate. Larger variation due to TxE indicated larger differences in microbial communities between environments/treatments and thus demonstrated the utility of ANOVA in providing an objective assessment of community dissimilarity. The statistical methods typically yielded similar empirical results. AMMI, T-RF-centered PCA, and DCA were the most robust methods in terms of producing ordinations that consistently reached a consensus with other methods. In datasets with high sample heterogeneity, NMS analyses with Sørensen and Jaccard distance were the most sensitive for recovery of complex gradients. The theoretical comparison showed that some methods hold distinct advantages for T-RFLP analysis, such as estimations of variation captured, realistic or minimal assumptions about the data, reduced weight placed on rare T-RFs, and uniqueness of solutions. Our results lead us to recommend that method selection be guided by T-RFLP dataset complexity and the outlined theoretical criteria. Finally, for soil-based T-RFLP data, we recommend using binary or relativized peak heights in ordination-based exploratory microbial analyses.
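
As an illustration of the ANOVA step described in the abstract, the following is a minimal sketch of how variation in a T-RF by environment table might be split into T-RF main effects, Environment main effects, and the TxE interaction, after the binary or relativized transformations the abstract recommends. It is not the authors' code. The function names (`relativize`, `to_binary`, `partition_variation`) and the example peak heights are hypothetical, and the sketch assumes one value per T-RF and environment (replicates already averaged), so the interaction term also absorbs any residual error.

```python
def relativize(peaks):
    """Scale each sample (column) so that its peak heights sum to 1.

    Assumes every sample has at least one non-zero peak.
    """
    totals = [sum(col) for col in zip(*peaks)]
    return [[h / t for h, t in zip(row, totals)] for row in peaks]


def to_binary(peaks):
    """Convert peak heights to presence (1.0) / absence (0.0)."""
    return [[1.0 if h > 0 else 0.0 for h in row] for row in peaks]


def partition_variation(y):
    """Split the total sum of squares of a T-RF x environment table.

    Rows are T-RFs, columns are environments, one value per cell.
    Returns sums of squares for T-RF main effects, Environment main
    effects, and the TxE interaction (total minus both main effects).
    """
    n_trf, n_env = len(y), len(y[0])
    grand = sum(map(sum, y)) / (n_trf * n_env)
    trf_means = [sum(row) / n_env for row in y]
    env_means = [sum(col) / n_trf for col in zip(*y)]
    ss_trf = n_env * sum((m - grand) ** 2 for m in trf_means)
    ss_env = n_trf * sum((m - grand) ** 2 for m in env_means)
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    return {
        "T-RF": ss_trf,
        "Environment": ss_env,
        "TxE": ss_total - ss_trf - ss_env,
    }


# Hypothetical peak heights: 4 T-RFs (rows) x 3 environments (columns).
peaks = [
    [120, 80, 0],
    [60, 0, 200],
    [20, 120, 100],
    [0, 0, 100],
]

ss_relative = partition_variation(relativize(peaks))
ss_binary = partition_variation(to_binary(peaks))

for label, ss in (("relativized", ss_relative), ("binary", ss_binary)):
    total = sum(ss.values())
    shares = {k: round(100 * v / total, 1) for k, v in ss.items()}
    print(label, shares)
```

In this sketch, relativizing gives every sample the same mean (one divided by the number of T-RFs), so the Environment sum of squares is zero up to rounding. That is one way to see why Environment main effects carry little information under this transformation, while the share attributed to TxE reflects how much community composition differs between environments.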

Publication types

  • Comparative Study

MeSH terms

  • DNA Fingerprinting / methods*
  • DNA Fingerprinting / statistics & numerical data
  • Multivariate Analysis
  • Polymorphism, Restriction Fragment Length*
  • Research Design
  • Soil Microbiology*
  • Statistics as Topic / standards*