Benefits of Iterative Searches of Large Databases to Interpret Large Human Gut Metaproteomic Data Sets

J Proteome Res. 2021 Mar 5;20(3):1522-1534. doi: 10.1021/acs.jproteome.0c00669. Epub 2021 Feb 2.

Abstract

The gut microbiota are increasingly regarded as a major partner in human health. Metaproteomics enables us to move from the functional potential revealed by metagenomics to the functions actually operating in the microbiome. However, deciphering the metaproteome remains challenging. In particular, confident interpretation of the large number of MS/MS spectra depends on well-designed database search strategies. Here, we compare the interpretation of MS/MS data sets from 48 individual human gut microbiomes using three strategies for interrogating the dedicated Integrated nonredundant Gene Catalog (IGC; 9.9 million genes from 1267 individual fecal samples) together with the Homo sapiens database: the classical single-step strategy and two iterative strategies (in either two or three steps) aimed at preselecting a smaller, more targeted search space for the final peptide-spectrum matching. Both iterative searches outperformed the classical single-step search in the number of peptides and protein clusters identified and in the depth of taxonomic and functional knowledge; the gain was most pronounced with the three-step approach. However, iterative searches did not reduce the variability between repeated analyses, which is inherent to the traditional data-dependent acquisition mode. This variability did not affect the hierarchical relationship between replicates and all other samples.
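The control flow of a two-step iterative search can be sketched as follows. This is a toy illustration only, not the authors' pipeline and not the X!Tandem API: the data model (each protein as a list of peptide masses, each spectrum reduced to one precursor mass), the function names `search` and `two_step_search`, and the tolerance-based matching are all assumptions made for the example. It shows how a permissive first pass preselects a reduced database for a stricter final pass; it does not model scoring or false discovery rate control, which is where a smaller search space would be expected to help in practice.

```python
from typing import Dict, List, Set, Tuple

# Toy data model (assumption): protein id -> list of peptide masses.
Database = Dict[str, List[float]]


def search(spectra: List[float], database: Database, tol: float) -> Dict[int, Set[str]]:
    """Map each spectrum index to the proteins having a peptide within `tol`."""
    hits: Dict[int, Set[str]] = {}
    for i, mass in enumerate(spectra):
        matched = {
            pid
            for pid, peptides in database.items()
            if any(abs(mass - p) <= tol for p in peptides)
        }
        if matched:
            hits[i] = matched
    return hits


def two_step_search(
    spectra: List[float], database: Database, loose_tol: float, strict_tol: float
) -> Tuple[Database, Dict[int, Set[str]]]:
    """Hypothetical two-step strategy: preselect proteins, then search again."""
    # Step 1: permissive search of the full database to preselect candidates.
    first_pass = search(spectra, database, loose_tol)
    selected = set().union(*first_pass.values())
    reduced = {pid: database[pid] for pid in selected}
    # Step 2: stricter search restricted to the reduced database.
    return reduced, search(spectra, reduced, strict_tol)


if __name__ == "__main__":
    db = {"A": [500.0, 800.0], "B": [650.0], "C": [1200.0]}
    reduced_db, final_hits = two_step_search([500.2, 650.01, 799.99], db, 0.5, 0.05)
    print(sorted(reduced_db), final_hits)
```

A three-step variant would, under the same assumptions, insert one more intermediate pass between the preselection and the final search.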

Keywords: X!Tandem; database search; gut microbiome; metaproteomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gastrointestinal Microbiome* / genetics
  • Humans
  • Metagenomics
  • Microbiota*
  • Proteomics
  • Tandem Mass Spectrometry