Statistical challenges in longitudinal microbiome data analysis

Brief Bioinform. 2022 Jul 18;23(4):bbac273. doi: 10.1093/bib/bbac273.

Abstract

The microbiome is a complex and dynamic community of microorganisms that co-exist interdependently within an ecosystem and interact with their host or environment. Longitudinal studies can capture temporal variation within the microbiome to gain mechanistic insights into microbial systems; however, current statistical methods are limited by the complex, inherent features of the data. We have identified three analytical objectives in longitudinal microbial studies: (1) differential abundance over time and between sample groups, demographic factors or clinical variables of interest; (2) clustering of microorganisms evolving concomitantly across time; and (3) network modelling to identify temporal relationships between microorganisms. This review explores the strengths and limitations of current methods for meeting these objectives, compares different methods in simulation and case studies for objectives (1) and (2), and highlights opportunities for further methodological development. R tutorials are provided to reproduce the analyses conducted in this review.

Keywords: 16S; clustering; compositionality; differential abundance; networks; relative abundance; shotgun sequencing.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Data Analysis*
  • Longitudinal Studies
  • Microbiota*
  • RNA, Ribosomal, 16S

Substances

  • RNA, Ribosomal, 16S