The SBASE domain sequence resource, release 12: prediction of protein domain-architecture using support vector machines

Kristian Vlahovicek; László Kaján; Vilmos Agoston; Sándor Pongor

doi:10.1093/nar/gki112

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D223-5. doi: 10.1093/nar/gki112.

Authors

Kristian Vlahovicek¹, László Kaján, Vilmos Agoston, Sándor Pongor

Affiliation

¹ ICGEB-International Center for Genetic Engineering and Biotechnology, Area Science Park, 34012 Trieste, Italy.

Abstract

SBASE (http://www.icgeb.trieste.it/sbase) is an online resource designed to facilitate the detection of domain homologies based on sequence database search. The present release of the SBASE A library of protein domain sequences contains 972,397 protein sequence segments annotated by structure, function, ligand binding or cellular topology, clustered into 8,547 domain groups. SBASE B contains 169,916 domain sequences clustered into 2,526 less well-characterized groups. Domain prediction evaluates database search results against a 'similarity network' of inter-sequence similarity scores, using support vector machines trained on the similarity search results of known domains.
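The prediction step described in the abstract can be illustrated with a minimal Python sketch. It is not the SBASE implementation: the abstract does not specify the features, the kernel, or the training procedure. The sketch assumes that search hits arrive as (domain group, score) pairs, that the similarity network can be summarized per group by a size and a mean within-group score, and that a linear decision function of the kind a trained support vector machine yields is applied to two invented features. The names GROUP_STATS, WEIGHTS, BIAS, group_features, decision_value and predict_domains, and all numeric values, are hypothetical placeholders; training is not shown.

```python
from collections import defaultdict

# Hypothetical per-group reference statistics standing in for the
# 'similarity network': group size and mean within-group similarity score.
GROUP_STATS = {
    "kinase": {"size": 4, "mean_score": 200.0},
    "sh3": {"size": 5, "mean_score": 80.0},
}

# Placeholder weights and bias. A real system would obtain these by
# training a support vector machine on search results of known domains.
WEIGHTS = (2.0, 2.0)
BIAS = -2.0


def group_features(hits):
    """Map (group, score) search hits to one feature vector per group.

    Assumed features: the fraction of group members hit, and the mean
    hit score relative to the group's mean within-group score.
    """
    by_group = defaultdict(list)
    for group, score in hits:
        by_group[group].append(score)

    features = {}
    for group, scores in by_group.items():
        stats = GROUP_STATS[group]
        coverage = len(scores) / stats["size"]
        relative = (sum(scores) / len(scores)) / stats["mean_score"]
        features[group] = (coverage, relative)
    return features


def decision_value(x):
    """Linear decision function w.x + b; positive means 'domain present'."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def predict_domains(hits):
    """Return the sorted names of groups whose decision value is positive."""
    return sorted(
        group
        for group, x in group_features(hits).items()
        if decision_value(x) > 0
    )


# Invented search result for one query sequence.
HITS = [("kinase", 180.0), ("kinase", 220.0), ("kinase", 200.0), ("sh3", 30.0)]
print(predict_domains(HITS))  # ['kinase']
```

In this toy query, three of the four kinase members are hit at scores typical for the group, so the decision value is positive, while the single weak sh3 hit falls below the threshold. Comparing hits against group-level statistics, rather than applying one fixed score cutoff, is the design point the sketch is meant to convey.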

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Artificial Intelligence*
Databases, Protein*
Protein Structure, Tertiary*
Proteins / chemistry
Sequence Alignment
Sequence Analysis, Protein*

Substances

Proteins