An easy-to-use Decoy Database Builder software tool, implementing different decoy strategies for false discovery rate calculation in automated MS/MS protein identifications

Proteomics. 2008 Mar;8(6):1129-37. doi: 10.1002/pmic.200701073.

Abstract

One of the major challenges for large scale proteomics research is the quality evaluation of results. Protein identification from complex biological samples or experimental setups is often a manual and subjective task which lacks profound statistical evaluation. This is not feasible for high-throughput proteomic experiments which result in large datasets of thousands of peptides and proteins and their corresponding mass spectra. To improve the quality, reliability and comparability of scientific results, an estimation of the rate of erroneously identified proteins is advisable. Moreover, scientific journals increasingly stipulate that articles containing considerable MS data should be subject to stringent statistical evaluation. We present a newly developed easy-to-use software tool enabling quality evaluation by generating composite target-decoy databases usable with all relevant protein search engines. This tool, when used in conjunction with relevant statistical quality criteria, enables to reliably determine peptides and proteins of high quality, even for nonexperienced users (e.g. laboratory staff, researchers without programming knowledge). Different strategies for building decoy databases are implemented and the resulting databases are characterized and compared. The quality of protein identification in high-throughput proteomics is usually measured by the false positive rate (FPR), but it is shown that the false discovery rate (FDR) delivers a more meaningful, robust and comparable value.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Proteome / analysis*
  • Proteomics / methods*
  • Software*
  • Tandem Mass Spectrometry / methods*
  • User-Computer Interface

Substances

  • Proteome