Accurate Proteoform Identification and Quantitation Using pTop 2.0

Methods Mol Biol. 2022;2500:105-129. doi: 10.1007/978-1-0716-2325-1_9.

Abstract

The remarkable advancement of top-down proteomics in the past decade has been driven by technological developments in separation, mass spectrometry (MS) instrumentation, novel fragmentation methods, and bioinformatics. However, the accurate identification and quantification of proteoforms (all clearly defined molecular forms of the protein products of a single gene) remain a challenging computational task. This is due in part to the fact that mass spectra of intact proteoforms are more complicated than those of digested peptides. Herein, pTop 2.0 is developed to fill the gap between large-scale, complex top-down MS data and the shortage of high-accuracy bioinformatic tools. Compared with pTop 1.0, the first version, pTop 2.0 concentrates mainly on the identification of proteoforms with unexpected modifications or a terminal truncation. Quantitation based on isotopic labeling is also a new function; it can be carried out through a convenient, user-friendly "one-key operation" that is integrated with the qualitative identification. The accuracy and running speed of pTop 2.0 are significantly improved on the test data sets. This chapter introduces the main features, step-by-step running operations, and algorithmic developments of pTop 2.0, with the aim of pushing the identification and quantitation of intact proteoforms in top-down proteomics to a higher level of accuracy.

Keywords: Proteoform identification and quantitation; Search engine; Semi-supervised learning; Tandem mass spectrometry; Top-down proteomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Mass Spectrometry
  • Proteome* / metabolism
  • Proteomics* / methods

Substances

  • Proteome