Detecting atypical examples of known domain types by sequence similarity searching: the SBASE domain library approach

Curr Protein Pept Sci. 2010 Nov;11(7):538-49. doi: 10.2174/138920310794109148.

Abstract

SBASE is a project initiated to detect known domain types and to predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992; Pongor et al., Nucl. Acids. Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences, the SBASE domain library, and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful for detecting rare, atypical examples of known domain types, which are sometimes missed even by more sophisticated methodologies. The approach requires neither multiple alignment nor machine learning techniques, and it can be a useful complement to other domain detection methodologies. This article gives an overview of the project history and of the concepts and principles developed within the project.
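
The abstract does not spell out the postprocessing step, so the following is only an illustrative sketch of how per-sequence similarity hits against a domain library might be summarized per domain group. The function names (`summarize_hits`, `call_domains`), the score cutoff, and the coverage threshold are all hypothetical and are not taken from SBASE itself.

```python
from collections import defaultdict


def summarize_hits(hits, group_of, min_score=50.0):
    """Aggregate per-sequence hits into per-domain-group statistics.

    hits      -- iterable of (library_sequence_id, similarity_score)
    group_of  -- dict mapping library_sequence_id -> domain group name
    min_score -- hypothetical score cutoff; weaker hits are ignored

    Returns a dict: group name -> (number of hits, mean score).
    """
    scores = defaultdict(list)
    for seq_id, score in hits:
        if score >= min_score and seq_id in group_of:
            scores[group_of[seq_id]].append(score)
    return {g: (len(s), sum(s) / len(s)) for g, s in scores.items()}


def call_domains(summary, group_sizes, min_fraction=0.5):
    """Keep groups whose hits cover at least min_fraction of their members.

    min_fraction is an assumed threshold chosen for illustration only.
    """
    return sorted(
        g for g, (n_hits, _) in summary.items()
        if n_hits / group_sizes[g] >= min_fraction
    )


# Toy library: three SH3 domain sequences and four kinase domain sequences.
group_of = {
    "sh3_1": "SH3", "sh3_2": "SH3", "sh3_3": "SH3",
    "kin_1": "kinase", "kin_2": "kinase", "kin_3": "kinase", "kin_4": "kinase",
}
group_sizes = {"SH3": 3, "kinase": 4}

# Toy search output for one query: (library sequence, similarity score).
hits = [("sh3_1", 80.0), ("sh3_2", 65.0), ("sh3_3", 40.0), ("kin_1", 55.0)]

summary = summarize_hits(hits, group_of)
print(summary)
print(call_domains(summary, group_sizes))
```

In this toy case the query hits two of the three SH3 library sequences above the cutoff but only one of the four kinase sequences, so only the SH3 group passes the assumed coverage threshold. The point of the sketch is that a decision is made from the pattern of hits across a whole domain group rather than from any single pairwise score.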

Publication types

  • Review

MeSH terms

  • Algorithms
  • Data Mining*
  • Databases, Protein*
  • Humans
  • Neural Networks, Computer
  • Online Systems
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / classification
  • ROC Curve
  • Sequence Homology, Amino Acid

Substances

  • Proteins