Bayesian nonparametric model for the validation of peptide identification in shotgun proteomics

Mol Cell Proteomics. 2009 Mar;8(3):547-57. doi: 10.1074/mcp.M700558-MCP200. Epub 2008 Nov 12.

Abstract

Tandem mass spectrometry combined with database searching allows high throughput identification of peptides in shotgun proteomics. However, validating database search results, a problem for which many solutions have been proposed, still leaves room for improvement in the sensitivity, specificity, and generalizability of the validation algorithms. Here a Bayesian nonparametric (BNP) model for the validation of database search results was developed that incorporates several popular techniques in statistical learning: compression of the feature space with a linear discriminant function, flexible nonparametric probability density function estimation to handle the variable probability structure of complex problems, and the Bayesian method to calculate the posterior probability. Importantly, the BNP model is naturally compatible with the popular target-decoy database search strategy. We tested the BNP model on standard protein and real, complex sample data sets from multiple MS platforms and compared it with PeptideProphet, the cutoff-based method, and a simple nonparametric method that we proposed previously. The BNP model was superior in sensitivity and generalizability for all data sets searched. Some high quality matches that had been filtered out by other methods were detected and assigned high probabilities by the BNP model. Thus, the BNP model could validate database search results effectively and extract more information from MS/MS data.
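The three ingredients named in the abstract (a linear discriminant score, nonparametric density estimation, and a Bayesian posterior) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature weights, the Gaussian kernel, the fixed bandwidth, the synthetic data, and the assumption that decoy matches model incorrect target matches with prior pi0 = (number of decoys) / (number of targets) are all illustrative choices made here, and every function name is hypothetical.

```python
import math
import random


def discriminant_score(features, weights, bias=0.0):
    """Compress a feature vector into a single score with a linear function."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def gaussian_kde(samples, bandwidth):
    """Return a density estimate built from `samples` with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(s):
        return norm * sum(
            math.exp(-0.5 * ((s - x) / bandwidth) ** 2) for x in samples
        )

    return density


def posterior_correct(score, target_density, decoy_density, pi0):
    """Bayes rule: P(correct | score) = 1 - pi0 * f_incorrect / f_all, clipped to [0, 1]."""
    f_all = target_density(score)
    if f_all <= 0.0:
        return 0.0
    p = 1.0 - pi0 * decoy_density(score) / f_all
    return min(1.0, max(0.0, p))


# Synthetic two-feature matches (assumption: made-up data, not from the paper).
rng = random.Random(0)
weights = [1.0, 0.5]  # hypothetical discriminant weights, not fitted


def draw(mean, n):
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


decoy_features = draw(0.0, 500)                     # incorrect by construction
target_features = draw(0.0, 500) + draw(4.0, 500)   # incorrect + correct mix

decoy_scores = [discriminant_score(f, weights) for f in decoy_features]
target_scores = [discriminant_score(f, weights) for f in target_features]

# Assumption: equal-size target and decoy databases, so the decoy count
# estimates the number of incorrect target matches.
pi0 = min(1.0, len(decoy_scores) / len(target_scores))

f_target = gaussian_kde(target_scores, bandwidth=0.5)
f_decoy = gaussian_kde(decoy_scores, bandwidth=0.5)

low = posterior_correct(0.0, f_target, f_decoy, pi0)
high = posterior_correct(6.0, f_target, f_decoy, pi0)
print(f"P(correct | score=0.0) = {low:.3f}")
print(f"P(correct | score=6.0) = {high:.3f}")
```

In this sketch a match scoring near the decoy distribution receives a low posterior probability, while a match scoring well above it receives a high one. A real analysis would fit the discriminant weights and choose the bandwidth from the data rather than fixing them by hand.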

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Databases, Protein
  • Humans
  • Mass Spectrometry
  • Models, Statistical*
  • Peptides / analysis*
  • Proteomics / methods*
  • Reproducibility of Results
  • Saccharomyces cerevisiae / metabolism
  • Statistics, Nonparametric

Substances

  • Peptides