Model-free prediction of microbiome compositions

Microbiome. 2024 Feb 1;12(1):17. doi: 10.1186/s40168-023-01721-9.

Abstract

Background: Growing recognition of the microbiome's importance to host health and well-being has prompted efforts to develop therapies that shift the microbiome from a disease-associated state to a healthier one. Techniques for directly manipulating the species assemblage are already available, e.g., probiotics or narrow-spectrum antibiotics that introduce or eliminate specific taxa. However, predicting the species abundances in the new state remains a challenge, mainly because it is difficult to decipher the delicate underlying network of ecological interactions or to construct a predictive model for such complex ecosystems.

Results: Here, we propose a model-free method that predicts the species abundances at the new steady state from their presence/absence configuration, using a multi-dimensional k-nearest-neighbors (kNN) regression algorithm. By analyzing data from numerical simulations of ecological dynamics, we show that our predictions, which consider the presence/absence of all species holistically, outperform both a null model that uses the statistics of each species independently and a predictive neural network model. We also analyze real metagenomic data of human-associated microbial communities and find that, by relying on a small number of "neighboring" samples, i.e., samples with a similar species assemblage, the kNN predicts species abundances better than the whole-cohort average. By studying both real metagenomic and simulated data, we show that the predictive performance of our method is tightly related to the dissimilarity-overlap relationship of the training data.
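
The kNN idea described in the Results can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the neighbor weighting, or the exact form of the null model, so everything below is an assumption made for illustration: neighbors are ranked by Jaccard dissimilarity between presence/absence vectors, the prediction is the unweighted mean of the k nearest abundance profiles, and the null model averages each species independently over the samples in which it is present. The function names (`knn_predict`, `null_predict`, `mask_and_normalize`) and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def jaccard_distance(query, train):
    """Jaccard dissimilarity between one boolean vector and each row of a boolean matrix."""
    inter = np.logical_and(train, query).sum(axis=1)
    union = np.logical_or(train, query).sum(axis=1)
    # Two empty assemblages have no union; treat them as identical (distance 0).
    overlap = np.divide(inter, union, out=np.ones(len(train)), where=union > 0)
    return 1.0 - overlap

def mask_and_normalize(abundance, presence):
    """Zero out absent species and rescale the rest to relative abundances."""
    masked = np.where(presence, abundance, 0.0)
    total = masked.sum()
    return masked / total if total > 0 else masked

def knn_predict(train_abund, query_presence, k=3):
    """Assumed kNN regression: average the k training samples whose assemblage is most similar."""
    train_abund = np.asarray(train_abund, dtype=float)
    query_presence = np.asarray(query_presence, dtype=bool)
    dist = jaccard_distance(query_presence, train_abund > 0)
    nearest = np.argsort(dist, kind="stable")[:k]
    return mask_and_normalize(train_abund[nearest].mean(axis=0), query_presence)

def null_predict(train_abund, query_presence):
    """Assumed null model: each species gets its own mean abundance over samples where it occurs."""
    train_abund = np.asarray(train_abund, dtype=float)
    query_presence = np.asarray(query_presence, dtype=bool)
    counts = (train_abund > 0).sum(axis=0)
    mean_when_present = np.divide(
        train_abund.sum(axis=0), counts,
        out=np.zeros(train_abund.shape[1]), where=counts > 0,
    )
    return mask_and_normalize(mean_when_present, query_presence)

# Toy cohort: 4 samples (rows) x 4 species (columns), relative abundances.
train = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.7],
    [0.2, 0.0, 0.8, 0.0],
])
query = np.array([1, 1, 0, 0], dtype=bool)  # species 0 and 1 present

print("kNN :", knn_predict(train, query, k=2))
print("null:", null_predict(train, query))
```

In this toy cohort the kNN prediction uses only the two samples that share the query's assemblage, whereas the null prediction for species 0 is also pulled toward the fourth sample, where that species coexists with a different partner. This mirrors, in miniature, the contrast the abstract draws between holistic and per-species predictions.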

Conclusions: Our results demonstrate how model-free methods can prove useful in predicting microbial communities and may facilitate the development of microbial-based therapies.

Publication types

  • Video-Audio Media

MeSH terms

  • Humans
  • Metagenome
  • Microbiota* / genetics