The spectral networks paradigm in high throughput mass spectrometry

Mol Biosyst. 2012 Oct;8(10):2535-44. doi: 10.1039/c2mb25085c.

Abstract

High-throughput proteomics is made possible by a combination of modern mass spectrometry instruments capable of generating many millions of tandem mass (MS²) spectra on a daily basis and the increasingly sophisticated associated software for their automated identification. Despite the growing accumulation of collections of identified spectra and the regular generation of MS² data from related peptides, the mainstream approach for peptide identification is still the nearly two-decade-old practice of matching one MS² spectrum at a time against a database of protein sequences. Moreover, database search tools overwhelmingly continue to require that users guess in advance a small set of 4-6 post-translational modifications that may be present in their data in order to avoid incurring substantial false positive and false negative rates. The spectral networks paradigm for analysis of MS² spectra differs from the mainstream database search paradigm in three fundamental ways. First, spectral networks are based on matching spectra against other spectra instead of against protein sequences. Second, spectral networks find spectra from related peptides even before considering their possible identifications. Third, spectral networks determine consensus identifications from sets of spectra from related peptides instead of separately attempting to identify one spectrum at a time. Even though spectral networks algorithms are still in their infancy, they have already delivered the longest and most accurate de novo sequences to date, revealed a new route for the discovery of unexpected post-translational modifications and highly modified peptides, enabled automated sequencing of cyclic non-ribosomal peptides with unknown amino acids, and are now defining a novel approach for mapping the entire molecular output of biological systems that is suitable for analysis with tandem mass spectrometry. Here we review the current state of spectral networks algorithms and discuss possible future directions for automated interpretation of spectra from any class of molecules.
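
The first two ideas in the abstract, matching spectra against other spectra and linking spectra from related peptides before any identification, can be illustrated with a small sketch. This is not the algorithm from the reviewed work: it swaps in a simple greedy, shift-tolerant cosine score for proper spectral alignment, and the spectrum layout, function names, tolerance (0.02) and score threshold (0.7) are all assumptions chosen for illustration. The peak values are invented toy data, and the third idea (consensus identification) is not attempted here.

```python
from itertools import combinations
from math import sqrt

# Toy spectra (hypothetical values): precursor mass plus (m/z, intensity) peaks.
# "B" mimics "A" carrying a +16 modification that shifts its two heaviest peaks.
spectra = {
    "A": {"precursor": 500.0, "peaks": [(100.0, 1.0), (200.0, 1.0), (300.0, 1.0), (400.0, 1.0)]},
    "B": {"precursor": 516.0, "peaks": [(100.0, 1.0), (200.0, 1.0), (316.0, 1.0), (416.0, 1.0)]},
    "C": {"precursor": 600.0, "peaks": [(150.0, 1.0), (250.0, 1.0), (350.0, 1.0)]},
}


def shifted_match_score(spec_a, spec_b, tol=0.02):
    """Cosine-like score in which a peak of A may pair with a peak of B at the
    same m/z or at an m/z offset by the precursor mass difference."""
    delta = spec_b["precursor"] - spec_a["precursor"]
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a["peaks"]):
        for j, (mz_b, int_b) in enumerate(spec_b["peaks"]):
            diff = mz_b - mz_a
            if abs(diff) <= tol or abs(diff - delta) <= tol:
                candidates.append((int_a * int_b, i, j))
    # Greedy one-to-one pairing, largest intensity product first
    # (a simplification; an optimal assignment could score higher).
    used_a, used_b, dot = set(), set(), 0.0
    for prod, i, j in sorted(candidates, reverse=True):
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += prod
    norm_a = sqrt(sum(x * x for _, x in spec_a["peaks"]))
    norm_b = sqrt(sum(x * x for _, x in spec_b["peaks"]))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(spectra, min_score=0.7):
    """Return edges between spectra whose pairwise score reaches min_score."""
    return [
        (id_a, id_b)
        for (id_a, a), (id_b, b) in combinations(spectra.items(), 2)
        if shifted_match_score(a, b) >= min_score
    ]


def connected_components(nodes, edges):
    """Group spectra into candidate families of related peptides (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


edges = build_network(spectra)
families = connected_components(list(spectra), edges)
print(edges)
print(families)
```

In this toy data, spectra A and B end up linked because every peak pairs either directly or after the 16-unit precursor offset, while C shares no peaks with either and stays on its own. No protein sequence is consulted at any point, which is the distinction from database search that the abstract draws.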

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Automation, Laboratory
  • Bacillus subtilis / chemistry
  • Cataract / metabolism
  • Databases, Protein
  • Humans
  • Lens, Crystalline / chemistry
  • Molecular Sequence Data
  • Oligonucleotides / analysis
  • Oligonucleotides / chemistry*
  • Peptides, Cyclic / analysis
  • Peptides, Cyclic / chemistry*
  • Protein Processing, Post-Translational
  • Proteomics / methods*
  • Software*
  • Tandem Mass Spectrometry / methods*

Substances

  • Oligonucleotides
  • Peptides, Cyclic