De Novo Peptide Sequencing: Deep Mining of High-Resolution Mass Spectrometry Data

Methods Mol Biol. 2017:1549:119-134. doi: 10.1007/978-1-4939-6740-7_10.

Abstract

High-resolution mass spectrometry has revolutionized proteomics over the past decade, generating tremendous amounts of data in the form of mass spectra in a relatively short span of time. The mining of this spectral data for analysis and interpretation, however, has lagged behind, such that potentially valuable data is being overlooked because it does not fit the mold of traditional database-searching methodologies. Although de novo sequencing of spectra removes such biases and has long been available, its uptake within the scientific community has been slow or almost nonexistent. In this chapter, we propose a methodology that integrates de novo peptide sequencing using three commonly available software solutions in tandem, complemented by homology searching and manual validation of spectra. This simplified method would allow greater use of de novo sequencing approaches and could greatly increase proteome coverage, leading to valuable insights into protein biology, especially for organisms whose genomes have been recently sequenced or are poorly annotated.

Keywords: De novo peptide sequencing; Functional annotation; Hybrid peptide sequencing; MS evidence; MS validation.

MeSH terms

  • Computational Biology / methods
  • Data Mining / methods
  • Databases, Protein
  • Mass Spectrometry / methods
  • Mass Spectrometry / standards
  • Peptides* / chemistry
  • Reproducibility of Results
  • Sequence Analysis, Protein / methods*
  • Software
  • Web Browser
  • Workflow

Substances

  • Peptides