Filtering strategies for improving protein identification in high-throughput MS/MS studies

Proteomics. 2009 Feb;9(4):848-60. doi: 10.1002/pmic.200800517.

Abstract

Despite recent advances in streamlining high-throughput proteomic pipelines using tandem mass spectrometry (MS/MS), reliable identification of peptides and proteins on a large scale has remained a challenging task that still involves a considerable degree of user interaction. Recently, a number of papers have proposed computational strategies both for detecting poor-quality MS/MS spectra prior to database search (pre-filtering) and for verifying the peptide identifications made by the search programs (post-filtering). Both filtering approaches can greatly benefit the overall protein identification pipeline, since they can remove a substantial part of the time-consuming manual validation work and convert large sets of MS/MS spectra into more reliable and interpretable proteome information. The choice of filtering method depends both on the properties of the data and on the goals of the experiment. This review discusses the different pre- and post-filtering strategies available to researchers, together with their relative merits and potential pitfalls. We also highlight some additional research topics, such as spectral denoising and statistical assessment of the identification results, which aim at further improving the coverage and accuracy of high-throughput protein identification studies.
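
To make the distinction between the two filtering stages concrete, the following is a minimal sketch and not a method taken from the review. It assumes a very simple pre-filter (counting peaks above a fraction of the base peak) and a decoy-based post-filter that estimates the false discovery rate as the ratio of decoy to target matches. The abstract names neither technique; they are swapped in purely for illustration. All class names, function names, and default thresholds (`Spectrum`, `PSM`, `pre_filter`, `post_filter`, `min_peaks`, `rel_threshold`, `max_fdr`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    scan_id: str
    intensities: list[float]  # peak intensities of one MS/MS spectrum


@dataclass
class PSM:
    scan_id: str
    score: float    # search-engine score, higher means a better match
    is_decoy: bool  # True if the match is to a decoy sequence


def pre_filter(spectra, min_peaks=10, rel_threshold=0.05):
    """Keep spectra with at least `min_peaks` peaks whose intensity is
    at least `rel_threshold` times the base (most intense) peak.
    Illustrative quality heuristic only."""
    kept = []
    for spectrum in spectra:
        if not spectrum.intensities:
            continue  # an empty spectrum carries no information
        base = max(spectrum.intensities)
        informative = sum(
            1 for i in spectrum.intensities if i >= rel_threshold * base
        )
        if informative >= min_peaks:
            kept.append(spectrum)
    return kept


def post_filter(psms, max_fdr=0.01):
    """Return the target PSMs in the largest top-scoring set whose
    estimated FDR (decoys / targets) does not exceed `max_fdr`."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of ranked PSMs to accept
    for idx, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = idx
    return [p for p in ranked[:cutoff] if not p.is_decoy]
```

In such a sketch, `pre_filter` would run before the database search to discard spectra unlikely to yield an identification, and `post_filter` would run on the search results. Real pipelines typically use richer quality features and more careful statistical models than either function shown here.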

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Cluster Analysis
  • Computer Simulation
  • Data Interpretation, Statistical
  • Fourier Analysis
  • Models, Statistical
  • Peptides / chemistry
  • Proteins / chemistry*
  • Proteomics / methods*
  • Tandem Mass Spectrometry / methods*

Substances

  • Peptides
  • Proteins