Sequence Similarity Searching

Curr Protoc Protein Sci. 2019 Feb;95(1):e71. doi: 10.1002/cpps.71. Epub 2018 Aug 13.

Abstract

Sequence similarity searching has become an important part of the daily routine of molecular biologists, bioinformaticians, and biophysicists. With sequence databanks growing rapidly, this computational approach is commonly applied to determine the functions and structures of unannotated sequences, to investigate relationships between sequences, and to construct phylogenetic trees. We introduce the BLAST family of sequence similarity search tools, arguably the most popular of its kind. We explain basic concepts of sequence alignment and demonstrate how to search current databanks using the web versions of BLASTP, PSI-BLAST, and BLASTN. We also describe the standalone BLAST+ tool and discuss the inputs, parameter settings, and outputs of all these tools. Lastly, we cover recent advances in sequence similarity searching, focusing on the fast MMseqs2 method. © 2018 by John Wiley & Sons, Inc.

Keywords: BLAST; BLASTN; BLASTP; MMseqs2; PSSM; alignment; sequence similarity searching.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Databases, Protein*
  • Proteins / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteins