Batch effects correction for microbiome data with Dirichlet-multinomial regression

Bioinformatics. 2019 Mar 1;35(5):807-814. doi: 10.1093/bioinformatics/bty729.

Abstract

Motivation: Metagenomic sequencing techniques enable quantitative analyses of the microbiome. However, combining the microbial data from these experiments is challenging due to the variations between experiments. The existing methods for correcting batch effects do not consider the interactions between variables (microbial taxa in microbial studies) and the overdispersion of the microbiome data. Therefore, they are not applicable to microbiome data.

Results: We develop a new method, Bayesian Dirichlet-multinomial regression meta-analysis (BDMMA), to simultaneously model the batch effects and detect the microbial taxa associated with phenotypes. BDMMA automatically models the dependence among microbial taxa and is robust to the high dimensionality of microbiome data and to the sparsity of taxon-phenotype associations. Simulation studies and real data analysis show that BDMMA can successfully adjust batch effects and substantially reduce false discoveries in microbial meta-analyses.
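
For readers unfamiliar with the Dirichlet-multinomial distribution, the following is a minimal Python sketch of why it suits overdispersed count data. It is not the BDMMA implementation: the abstract does not state the link function, priors, or parameterization, so the log-linear link in `concentrations` and all function and parameter names (`intercepts`, `slopes`, `batch_shifts`, `phenotype`) are hypothetical and used only for illustration.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: log(n!) - sum_j log(x_j!)
    out = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizing constants after integrating out proportions
    out += math.lgamma(a0) - math.lgamma(n + a0)
    out += sum(math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha))
    return out


def concentrations(intercepts, slopes, batch_shifts, phenotype):
    """Hypothetical log-linear link (an assumption, not taken from the paper):
    alpha_j = exp(intercept_j + slope_j * phenotype + batch_shift_j)."""
    return [
        math.exp(a + b * phenotype + d)
        for a, b, d in zip(intercepts, slopes, batch_shifts)
    ]


def dm_variance(n, alpha, j):
    """Variance of the j-th count under a Dirichlet-multinomial with total n."""
    a0 = sum(alpha)
    p = alpha[j] / a0
    # Multinomial variance n*p*(1-p) inflated by the factor (n + a0) / (1 + a0)
    return n * p * (1 - p) * (n + a0) / (1 + a0)


def multinomial_variance(n, p):
    """Variance of one count under a plain multinomial, for comparison."""
    return n * p * (1 - p)


if __name__ == "__main__":
    # Two taxa, no phenotype effect, no batch shift: alpha = [1.0, 1.0]
    alpha = concentrations([0.0, 0.0], [0.0, 0.0], [0.0, 0.0], phenotype=1.0)
    print(math.exp(dm_log_pmf([1, 1], alpha)))
    print(dm_variance(2, alpha, 0), multinomial_variance(2, 0.5))
```

In this sketch a nonzero `batch_shifts` entry changes the expected composition of every sample in that batch, while the inflation factor `(n + a0) / (1 + a0)` in `dm_variance` is what lets the distribution absorb variance beyond the multinomial level.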

Availability and implementation: An R package "BDMMA" for Windows and Linux is available at https://github.com/DAIZHENWEI/BDMMA/BDMMA, and a version for MacOS is provided at https://github.com/DAIZHENWEI/BDMMA/BDMMA_MacOS.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Microbiota*
  • Regression Analysis
  • Research Design