Precursor deconvolution error estimation: The missing puzzle piece in false discovery rate in top-down proteomics

Proteomics. 2024 Feb;24(3-4):e2300068. doi: 10.1002/pmic.202300068. Epub 2023 Nov 23.

Abstract

Top-down proteomics (TDP) directly analyzes intact proteins and thus provides more comprehensive qualitative and quantitative proteoform-level information than conventional bottom-up proteomics (BUP), which relies on digested peptides and protein inference. While significant advances have been made in TDP sample preparation, separation, instrumentation, and data analysis, reliable and reproducible data analysis remains one of the major bottlenecks in TDP. A key step toward robust data analysis is an objective estimate of the proteoform-level false discovery rate (FDR) in proteoform identification. The most widely used FDR estimation scheme is based on the target-decoy approach (TDA), which was established primarily for BUP. We present evidence that TDA-based FDR estimation may not work at the proteoform level because of an overlooked factor, namely the erroneous deconvolution of precursor masses, which leads to incorrect FDR estimates. We argue that the conventional TDA-based FDR in proteoform identification is in fact a protein-level FDR rather than a proteoform-level FDR unless the precursor deconvolution error rate is taken into account. To address this issue, we propose a formula that corrects the proteoform-level FDR bias by combining the TDA-based FDR with the precursor deconvolution error rate.
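
The abstract does not state the correction formula, so the sketch below is only an illustration of how two error sources could be combined, not the authors' formula. It assumes (hypothetically) that a proteoform identification is correct only when both the protein match and the deconvolved precursor mass are correct, and that the two errors are independent. The function names `tda_fdr` and `combined_proteoform_fdr` are invented for this example.

```python
def tda_fdr(n_decoy: int, n_target: int) -> float:
    """Common target-decoy estimate: decoy hits divided by target hits
    above a score threshold, capped at 1.0."""
    if n_target <= 0:
        raise ValueError("n_target must be positive")
    if n_decoy < 0:
        raise ValueError("n_decoy must be non-negative")
    return min(1.0, n_decoy / n_target)


def combined_proteoform_fdr(fdr_tda: float, deconv_error_rate: float) -> float:
    """ASSUMPTION (not taken from the paper): the two error sources are
    independent, so an identification is correct with probability
    (1 - fdr_tda) * (1 - deconv_error_rate)."""
    for name, value in (("fdr_tda", fdr_tda),
                        ("deconv_error_rate", deconv_error_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return 1.0 - (1.0 - fdr_tda) * (1.0 - deconv_error_rate)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    protein_level = tda_fdr(n_decoy=10, n_target=1000)
    proteoform_level = combined_proteoform_fdr(protein_level, 0.05)
    print(f"TDA-based FDR:          {protein_level:.4f}")
    print(f"Combined (assumed) FDR: {proteoform_level:.4f}")
```

Under this assumed model the combined estimate is never lower than the TDA-based FDR alone, which matches the abstract's argument that ignoring precursor deconvolution errors understates the proteoform-level FDR.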

Keywords: FDR; deconvolution; false discovery rate; precursor; top-down proteomics.

MeSH terms

  • DNA-Binding Proteins
  • Peptides*
  • Proteomics*

Substances

  • Peptides
  • DNA-Binding Proteins