Managing batch effects in microbiome data

Brief Bioinform. 2020 Dec 1;21(6):1954-1970. doi: 10.1093/bib/bbz105.

Abstract

Microbial communities have been increasingly studied in recent years to investigate their role in ecological habitats. However, microbiome studies are difficult to reproduce or replicate as they may suffer from confounding factors that are unavoidable in practice and originate from biological, technical or computational sources. In this review, we define batch effects as unwanted variation introduced by confounding factors that are not related to any factors of interest. Computational and analytical methods are required to remove or account for batch effects. However, inherent characteristics of microbiome data (e.g. sparse, compositional and multivariate) challenge the development and application of batch effect adjustment methods, whether these account for batch effects or correct for them. We present commonly encountered sources of batch effects, which we illustrate in several case studies. We discuss the limitations of current methods, which often rely on assumptions that are not met due to the peculiarities of microbiome data. We provide practical guidelines for assessing the efficiency of the methods based on visual and numerical outputs, and a thorough tutorial to reproduce the analyses conducted in this review.
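As a rough illustration of what a numerical assessment of batch effects might look like, the sketch below applies a centered log-ratio (CLR) transform to a samples-by-taxa count table, estimates the per-taxon fraction of variance explained by batch, and then centers each batch on its own mean. This is not the tutorial accompanying the review: the function names, the pseudocount of 1 used to handle zeros, and the choice of per-batch mean centering as the correction are all assumptions made for illustration.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform, one sample per row.

    The pseudocount (an assumed value) avoids log(0) in sparse count data.
    """
    logged = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)

def batch_r2(values, batch):
    """Per-feature fraction of variance explained by batch (one-way ANOVA R^2)."""
    values = np.asarray(values, dtype=float)
    batch = np.asarray(batch)
    grand_mean = values.mean(axis=0)
    ss_total = ((values - grand_mean) ** 2).sum(axis=0)
    ss_between = np.zeros(values.shape[1])
    for b in np.unique(batch):
        rows = values[batch == b]
        ss_between += rows.shape[0] * (rows.mean(axis=0) - grand_mean) ** 2
    # Constant features have no variance to explain; report 0 for them.
    return np.divide(ss_between, ss_total,
                     out=np.zeros_like(ss_between), where=ss_total > 0)

def center_by_batch(values, batch):
    """Subtract each batch's feature means (a simple, assumed correction)."""
    centered = np.asarray(values, dtype=float).copy()
    batch = np.asarray(batch)
    for b in np.unique(batch):
        mask = batch == b
        centered[mask] -= centered[mask].mean(axis=0)
    return centered

# Hypothetical toy data: 4 samples x 3 taxa, processed in two batches.
counts = np.array([[10, 0, 5], [12, 1, 4], [30, 2, 20], [28, 0, 25]])
batch = np.array(["a", "a", "b", "b"])

transformed = clr(counts)
print("R^2 before centering:", batch_r2(transformed, batch))
print("R^2 after centering: ", batch_r2(center_by_batch(transformed, batch), batch))
```

Per-batch centering removes any mean difference between batches, so it would also remove the effect of interest if that effect were confounded with batch. A low variance fraction after correction is therefore not evidence, on its own, that the biological signal has been preserved.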

Keywords: batch sources; methods assessment; methods selection; systematic batch effects; unwanted variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology*
  • Data Analysis
  • Microbiota*