Identifying and Predicting Novelty in Microbiome Studies

mBio. 2018 Nov 13;9(6):e02099-18. doi: 10.1128/mBio.02099-18.

Abstract

With the expansion of microbiome sequencing globally, a key challenge is to relate new microbiome samples to the existing space of microbiome samples. Here, we present Microbiome Search Engine (MSE), which enables the rapid search of query microbiome samples against a large, well-curated reference microbiome database organized by taxonomic similarity at the whole-microbiome level. Tracking the microbiome novelty score (MNS) over 8 years of microbiome depositions based on searching in more than 100,000 global 16S rRNA gene amplicon samples, we detected that the structural novelty of human microbiomes is approaching saturation and likely bounded, whereas that in environmental habitats remains 5 times higher. Via the microbiome focus index (MFI), which is derived from the MNS and microbiome attention score (MAS), we objectively track and compare the structural-novelty and attracted-attention scores of individual microbiome samples and projects, and we predict future trends in the field. For example, marine and indoor environments and mother-baby interactions are likely to receive disproportionate additional attention based on recent trends. Therefore, MNS, MAS, and MFI are proposed "alt-metrics" for evaluating a microbiome project or prospective developments in the microbiome field, both of which are done in the context of existing microbiome big data.

IMPORTANCE We introduce two concepts to quantify the novelty of a microbiome. The first, the microbiome novelty score (MNS), allows identification of microbiomes that are especially different from what is already sequenced. The second, the microbiome attention score (MAS), allows identification of microbiomes that have many close neighbors, implying that considerable scientific attention is devoted to their study. By computing a microbiome focus index based on the MNS and MAS, we objectively track and compare the novelty and attention scores of individual microbiome samples and projects over time and predict future trends in the field; i.e., we work toward yielding fundamentally new microbiomes rather than filling in the details. Therefore, MNS, MAS, and MFI can serve as "alt-metrics" for evaluating a microbiome project or prospective developments in the microbiome field, both of which are done in the context of existing microbiome big data.
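
The intuition behind the two scores can be illustrated with a toy sketch. It is not the paper's method: the abstract does not give the formulas for MNS, MAS, or MFI, so everything below is an assumption made for illustration. The similarity measure (one minus Bray-Curtis dissimilarity on taxon abundance profiles), the function names `novelty_score` and `attention_score`, and the `threshold` value are all hypothetical. Novelty is modeled as distance to the closest reference sample, and attention as the number of reference samples that lie close to the query.

```python
from typing import Dict, List

# A sample is modeled as taxon -> relative abundance (illustrative assumption).
Profile = Dict[str, float]


def similarity(a: Profile, b: Profile) -> float:
    """Return 1 - Bray-Curtis dissimilarity between two abundance profiles.

    This measure is a stand-in chosen for the sketch; the abstract only says
    that samples are compared by whole-microbiome taxonomic similarity.
    """
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0


def novelty_score(query: Profile, reference: List[Profile]) -> float:
    """Hypothetical novelty: 1 minus similarity to the closest reference sample."""
    if not reference:
        return 1.0  # nothing sequenced yet, so the query is maximally novel
    return 1.0 - max(similarity(query, r) for r in reference)


def attention_score(query: Profile, reference: List[Profile],
                    threshold: float = 0.8) -> int:
    """Hypothetical attention: number of reference samples at or above threshold."""
    return sum(1 for r in reference if similarity(query, r) >= threshold)


# Toy reference database: two similar samples and one unrelated sample.
reference_db = [
    {"A": 0.5, "B": 0.5},
    {"A": 0.6, "B": 0.4},
    {"C": 1.0},
]

well_studied = {"A": 0.5, "B": 0.5}   # has close neighbors in the database
unexplored = {"D": 1.0}               # shares no taxa with any reference sample

print(novelty_score(well_studied, reference_db),
      attention_score(well_studied, reference_db))
print(novelty_score(unexplored, reference_db),
      attention_score(unexplored, reference_db))
```

In this toy setting the well-studied sample has low novelty and several close neighbors, while the unexplored sample has maximal novelty and none. How the paper combines the two scores into the MFI is not stated in the abstract, so the sketch does not attempt it.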

Keywords: bioinformatics; community similarity; data mining; database search; microbial ecology; microbiome; microbiome novelty; novelty; search.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology*
  • Databases, Factual*
  • Humans
  • Microbiota / genetics*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics
  • Sequence Analysis, DNA

Substances

  • RNA, Ribosomal, 16S