A GLM-based latent variable ordination method for microbiome samples

Biometrics. 2018 Jun;74(2):448-457. doi: 10.1111/biom.12775. Epub 2017 Oct 9.

Abstract

Distance-based ordination methods, such as principal coordinates analysis (PCoA), are widely used in the analysis of microbiome data. However, these methods are prone to pose a potential risk of misinterpretation about the compositional difference in samples across different populations if there is a difference in dispersion effects. Accounting for high sparsity and overdispersion of microbiome data, we propose a GLM-based Ordination Method for Microbiome Samples (GOMMS) in this article. This method uses a zero-inflated quasi-Poisson (ZIQP) latent factor model. An EM algorithm based on the quasi-likelihood is developed to estimate parameters. It performs comparatively to the distance-based approach when dispersion effects are negligible and consistently better when dispersion effects are strong, where the distance-based approach sometimes yields undesirable results. The estimated latent factors from GOMMS can be used to associate the microbiome community with covariates or outcomes using the standard multivariate tests, which can be investigated in future confirmatory experiments. We illustrate the method in simulations and an analysis of microbiome samples from nasopharynx and oropharynx.

Keywords: 16S sequencing; Factor models; Microbiome; Zero-inflated models.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Humans
  • Microbiota*
  • Nasopharynx / microbiology
  • Oropharynx / microbiology
  • Poisson Distribution*