metamicrobiomeR: an R package for analysis of microbiome relative abundance data using zero-inflated beta GAMLSS and meta-analysis across studies using random effects models

BMC Bioinformatics. 2019 Apr 16;20(1):188. doi: 10.1186/s12859-019-2744-2.

Abstract

Background: The rapid growth of high-throughput sequencing-based microbiome profiling has yielded tremendous insights into human health and physiology. Data generated from high-throughput sequencing of 16S rRNA gene amplicons are often preprocessed into composition or relative abundance. However, reproducibility has been lacking due to the myriad of different experimental and computational approaches taken in these studies. Microbiome studies may report varying results on the same topic; therefore, meta-analyses that examine different microbiome studies to provide consistent and robust results are important. So far, there is still a lack of implemented methods to properly examine differential relative abundances of microbial taxonomies and to perform meta-analysis examining the heterogeneity and overall effects across microbiome studies.

Results: We developed an R package 'metamicrobiomeR' that applies Generalized Additive Models for Location, Scale and Shape (GAMLSS) with a zero-inflated beta (BEZI) family (GAMLSS-BEZI) for analysis of microbiome relative abundance datasets. Both simulation studies and application to real microbiome data demonstrate that GAMLSS-BEZI performs well in testing differential relative abundances of microbial taxonomies. Importantly, the estimates from GAMLSS-BEZI are log odds ratios of relative abundances between comparison groups and are therefore comparable across microbiome studies. As such, we also apply random effects meta-analysis models to pool estimates and their standard errors across microbiome studies. We demonstrate the meta-analysis examples and highlight the utility of our package on four studies comparing gut microbiomes between male and female infants in the first six months of life.
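
To make the log odds ratio interpretation concrete, the following Python sketch writes out a zero-inflated beta density and the log odds ratio implied by a logit link on the mean of the beta component. The mean/precision parameterization (`mu`, `sigma`, `nu`), the logit link, and the function names `bezi_density` and `log_odds_ratio` are assumptions made for illustration; they are not taken from the package's source code.

```python
import math


def bezi_density(y, mu, sigma, nu):
    """Zero-inflated beta density (assumed mean/precision parameterization).

    nu    : probability of an exact zero (point mass at y == 0)
    mu    : mean of the beta component, 0 < mu < 1
    sigma : precision of the beta component, sigma > 0
    """
    if y == 0:
        return nu
    if not 0 < y < 1:
        return 0.0
    # Shape parameters of the beta component.
    a, b = mu * sigma, (1 - mu) * sigma
    # log B(a, b) computed via log-gamma for numerical stability.
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = (a - 1) * math.log(y) + (b - 1) * math.log(1 - y) - log_beta_fn
    return (1 - nu) * math.exp(log_pdf)


def logit(p):
    """Log odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def log_odds_ratio(mu_group, mu_reference):
    """Group coefficient under a logit link: difference of the two log odds."""
    return logit(mu_group) - logit(mu_reference)
```

Because the coefficient is a difference of log odds, it is unitless and does not depend on a study's sequencing depth in the way raw counts do, which is what would allow estimates from separate studies to be pooled.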

Conclusions: GAMLSS-BEZI allows proper examination of microbiome relative abundance data. Random effects meta-analysis models can be directly applied to pool comparable estimates and their standard errors to evaluate the overall effects and heterogeneity across microbiome studies. The examples and workflow using our 'metamicrobiomeR' package are reproducible and applicable for the analyses and meta-analyses of other microbiome studies.
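
The pooling step can be illustrated with a short Python sketch of a random effects meta-analysis. It takes per-study log odds ratios and their standard errors and returns the pooled estimate, its standard error, the between-study variance, and the I² heterogeneity statistic. The DerSimonian-Laird estimator used here is an assumption chosen because it is a common default; the abstract does not state which estimator the package uses, and the function name `random_effects_pool` is hypothetical.

```python
import math


def random_effects_pool(estimates, std_errors):
    """Pool study-level estimates with a DerSimonian-Laird random effects model.

    estimates  : per-study effect estimates (e.g. log odds ratios)
    std_errors : matching per-study standard errors
    Requires at least two studies.
    """
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need two or more studies with matching standard errors")

    # Inverse-variance (fixed effect) weights and pooled mean.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures dispersion of the studies around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1

    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau2 to each study's sampling variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    pooled_se = math.sqrt(1.0 / sum_w_star)

    # I^2: percentage of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {"pooled": pooled, "se": pooled_se, "tau2": tau2, "q": q, "i2": i2}
```

For example, `random_effects_pool([0.0, 1.0], [0.1, 0.1])` describes two precise but conflicting studies: the pooled estimate sits midway between them, while the large between-study variance widens the pooled standard error well beyond what a fixed effect model would report.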

Keywords: GAMLSS; Gender; Infant; Meta-analysis; Microbiome; Pooling estimates; Random effect; Relative abundance; Zero-inflated beta.

MeSH terms

  • Computational Biology / methods*
  • DNA, Bacterial / analysis
  • DNA, Bacterial / genetics
  • Female
  • Gastrointestinal Microbiome*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Infant
  • Male
  • Models, Statistical*
  • RNA, Ribosomal, 16S / analysis
  • RNA, Ribosomal, 16S / genetics
  • Software*

Substances

  • DNA, Bacterial
  • RNA, Ribosomal, 16S