Anatomy and evolution of database search engines: a central component of mass spectrometry based proteomic workflows

Mass Spectrom Rev. 2020 May;39(3):292-306. doi: 10.1002/mas.21543. Epub 2017 Sep 13.

Abstract

Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.

Keywords: bioinformatics; proteomics; search engine.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Humans
  • Mass Spectrometry / methods*
  • Proteins / chemistry*
  • Proteomics / methods*
  • Search Engine / methods*
  • Workflow

Substances

  • Proteins