Challenges and promise at the interface of metaproteomics and genomics: an overview of recent progress in metaproteogenomic data analysis

Expert Rev Proteomics. 2019 May;16(5):375-390. doi: 10.1080/14789450.2019.1609944. Epub 2019 Apr 30.

Abstract

The study of microbial communities based on the combined analysis of genomic and proteomic data, called metaproteogenomics, has attracted increasing research attention in recent years. This relatively young field aims to elucidate the functional and taxonomic interplay of proteins in microbiomes and its implications for human health and the environment.

Areas covered: This article reviews bioinformatics methods and software tools dedicated to the analysis of data from metaproteomics and metaproteogenomics experiments. In particular, it focuses on the creation of tailored protein sequence databases, on the optimal use of database search algorithms, including methods of error rate estimation, and on the taxonomic and functional annotation of peptide and protein identifications.

Expert opinion: Various promising strategies and software tools have recently been proposed for handling typical data analysis issues in metaproteomics. However, severe challenges remain, which this article highlights and discusses. These include (i) robust false-positive assessment of peptide and protein identifications, (ii) complex protein inference against a background of highly redundant data, (iii) taxonomic and functional post-processing of identification data, and (iv) the assessment and provision of metrics and tools for quantitative analysis.

Keywords: Metaproteomics; metaproteogenomics; proteogenomics.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Data Analysis*
  • Databases, Protein
  • Humans
  • Metagenomics*
  • Proteome / metabolism
  • Proteomics*
  • Search Engine

Substances

  • Proteome