Directional Gaussian Mixture Models of the Gut Microbiome Elucidate Microbial Spatial Structure

mSystems. 2021 Dec 21;6(6):e0081721. doi: 10.1128/mSystems.00817-21. Epub 2021 Nov 9.

Abstract

The gut microbiome is spatially heterogeneous, with environmental niches contributing to the distribution and composition of microbial populations. A recently developed mapping technology, MaPS-seq, aims to characterize the spatial organization of the gut microbiome by providing data about local microbial populations. However, MaPS-seq loses information about the global arrangement of these populations. To address this, we propose a class of Gaussian mixture models (GMMs) with spatial dependencies between mixture components in order to computationally recover the relative spatial arrangement of microbial communities. We demonstrate on synthetic data that our spatial models can identify global spatial dynamics, accurately cluster data, and improve parameter inference over a naive GMM. We applied our model to three MaPS-seq data sets taken from various regions of the mouse intestine. On cecal and distal colon data sets, we find that our model accurately recapitulates known spatial behaviors of the gut microbiome, including compositional differences between mucus- and lumen-associated populations. Our model also seems to capture the effect of a pH gradient on microbial populations in the mouse ileum and proposes new behaviors as well.

IMPORTANCE The spatial arrangement of the microbes in the gut microbiome is a defining characteristic of its behavior. Various experimental studies have attempted to provide glimpses into the mechanisms that contribute to microbial arrangements. However, many of these descriptions are qualitative. We developed a computational method that takes microbial spatial data and learns many of the experimentally validated spatial factors. We can then use our model to propose previously unknown spatial behaviors. Our results demonstrate that the gut microbiome, while exceptionally large, has predictable spatial patterns that can be used to help us understand its role in health and disease.

Keywords: Gaussian process; MaPS-seq; computational biology; machine learning; mathematical modeling; microbiome; probabilistic models; spatial structure.
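
For orientation, the "naive GMM" that the abstract uses as a baseline can be sketched as a standard Gaussian mixture fitted by expectation-maximization, with no spatial coupling between components. The sketch below is illustrative only and is not the authors' implementation or their spatial model: the function name `fit_naive_gmm`, the diagonal covariance structure, the farthest-point initialization, and the fixed iteration count are all assumptions made for this example.

```python
import numpy as np


def fit_naive_gmm(X, k, n_iter=50):
    """Fit a diagonal-covariance GMM by plain EM (hypothetical baseline sketch).

    Components are treated as independent; nothing here encodes spatial
    dependencies between them.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Assumed initialization: start at X[0], then repeatedly add the point
    # farthest from the centers chosen so far (deterministic, no RNG).
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    means = np.array(centers)
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))  # shape (k, d)
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        diff = X[:, None, :] - means[None, :, :]  # shape (n, k, d)
        log_pdf = -0.5 * np.sum(
            diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
        )
        log_r = np.log(weights) + log_pdf
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means, and per-dimension variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6

    labels = resp.argmax(axis=1)
    return weights, means, variances, labels
```

Calling `fit_naive_gmm(X, k=2)` on an `(n_samples, n_features)` array returns mixture weights, component means, per-dimension variances, and hard cluster labels. A spatial variant of the kind the abstract describes would add dependencies between the components, which this baseline deliberately omits.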