Detecting atypical examples of known domain types by sequence similarity searching: the SBASE domain library approach

Somdutta Dhir; Mircea Pacurar; Dino Franklin; Zoltán Gáspári; Attila Kertész-Farkas; András Kocsor; Frank Eisenhaber; Sándor Pongor

doi:10.2174/138920310794109148

Curr Protein Pept Sci. 2010 Nov;11(7):538-49. doi: 10.2174/138920310794109148.

Authors

Somdutta Dhir¹, Mircea Pacurar, Dino Franklin, Zoltán Gáspári, Attila Kertész-Farkas, András Kocsor, Frank Eisenhaber, Sándor Pongor

Affiliation

¹ Protein Structure and Bioinformatics, ICGEB, Trieste, Italy.

PMID: 20887262
DOI: 10.2174/138920310794109148

Abstract

SBASE is a project initiated to detect known domain types and to predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992; Pongor et al., Nucl. Acids Res. 21: 3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful for detecting rare, atypical examples of known domain types, which are sometimes missed even by more sophisticated methodologies. The approach requires neither multiple alignment nor machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.
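
As an illustration only, the general idea of postprocessing similarity search hits with simple per-domain-type statistics might be sketched as follows. This is not the SBASE procedure itself, which the abstract does not specify. The `Hit` record, the `summarize_hits` function, the score cutoff, and the "fraction of group members hit" criterion are all hypothetical choices made for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One hit from a similarity search against a domain library (hypothetical record)."""
    domain_id: str  # identifier of the library sequence that was hit
    group: str      # domain type (group) the library sequence belongs to
    score: float    # similarity score reported by the search program


def summarize_hits(hits, group_sizes, min_score=50.0, min_fraction=0.2):
    """Group hits by domain type and keep types where enough members were hit.

    Assumed criterion (not taken from the source): a domain type is reported
    when at least `min_fraction` of its library members give a hit scoring
    `min_score` or better.
    """
    # group -> {domain_id: best score seen for that library sequence}
    best = defaultdict(dict)
    for h in hits:
        if h.score >= min_score:
            prev = best[h.group].get(h.domain_id, 0.0)
            best[h.group][h.domain_id] = max(prev, h.score)

    results = []
    for group, scores in best.items():
        fraction = len(scores) / group_sizes[group]
        if fraction >= min_fraction:
            mean_score = sum(scores.values()) / len(scores)
            results.append((group, len(scores), fraction, mean_score))

    # Most completely covered domain types first.
    results.sort(key=lambda r: r[2], reverse=True)
    return results


if __name__ == "__main__":
    # Toy data: scores and group sizes are invented for the example.
    hits = [
        Hit("d1", "SH3", 80.0),
        Hit("d2", "SH3", 60.0),
        Hit("d3", "kinase", 55.0),
        Hit("d4", "SH3", 30.0),
    ]
    group_sizes = {"SH3": 4, "kinase": 100}
    for row in summarize_hits(hits, group_sizes):
        print(row)
```

The sketch needs only a list of pairwise hits and the group sizes, which reflects the abstract's point that no multiple alignment or trained model is required.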

Publication types

Review

MeSH terms

Algorithms
Data Mining*
Databases, Protein*
Humans
Neural Networks, Computer
Online Systems
Protein Structure, Tertiary
Proteins / chemistry*
Proteins / classification
ROC Curve
Sequence Homology, Amino Acid

Substances

Proteins