Anatomy and evolution of database search engines - a central component of mass spectrometry based proteomic workflows

Kenneth Verheggen; Helge Raeder; Frode S Berven; Lennart Martens; Harald Barsnes; Marc Vaudel

doi:10.1002/mas.21543

Mass Spectrom Rev. 2020 May;39(3):292-306. doi: 10.1002/mas.21543. Epub 2017 Sep 13.

Authors

Kenneth Verheggen¹ ² ³, Helge Raeder⁴ ⁵, Frode S Berven⁶, Lennart Martens¹ ² ³, Harald Barsnes⁴ ⁶ ⁷, Marc Vaudel⁴ ⁶ ⁸

Affiliations

¹ VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.
² Department of Biochemistry, Ghent University, Ghent, Belgium.
³ Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium.
⁴ KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway.
⁵ Department of Pediatrics, Haukeland University Hospital, Bergen, Norway.
⁶ Proteomics Unit, Department of Biomedicine, University of Bergen, Norway.
⁷ Computational Biology Unit, Department of Informatics, University of Bergen, Norway.
⁸ Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway.

PMID: 28902424
DOI: 10.1002/mas.21543

Abstract

Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.

Keywords: bioinformatics; proteomics; search engine.

Publication types

Research Support, Non-U.S. Gov't
Review

MeSH terms

Animals
Humans
Mass Spectrometry / methods*
Proteins / chemistry*
Proteomics / methods*
Search Engine / methods*
Workflow

Substances

Proteins