Using CATH-Gene3D to Analyze the Sequence, Structure, and Function of Proteins

Ian Sillitoe; Tony Lewis; Christine Orengo

doi:10.1002/0471250953.bi0128s50

Curr Protoc Bioinformatics. 2015 Jun 19:50:1.28.1-1.28.21. doi: 10.1002/0471250953.bi0128s50.

Authors

Ian Sillitoe¹, Tony Lewis¹, Christine Orengo¹

Affiliation

¹ University College London, London, United Kingdom.

PMID: 26087950

Abstract

The CATH database is a classification of protein structures found in the Protein Data Bank (PDB). Protein structures are chopped into individual structural domains, and these domains are grouped into superfamilies if there is sufficient evidence that they have diverged from a common ancestor during evolution. A sister resource, Gene3D, extends this information by scanning sequence profiles of the CATH domain superfamilies against many millions of known proteins to identify related sequences. Thus, the combined CATH-Gene3D resource provides confident predictions of the likely structural fold, domain organisation, and evolutionary relatives of these proteins. In addition, the resource incorporates annotations from a large number of external databases, including known enzyme active sites, GO molecular functions, physical interactions, and mutations. This unit details how to access and understand the information contained within the CATH-Gene3D Web pages, the downloadable data files, and the remotely accessible Web services.

Keywords: functional family; protein classification; protein domain; protein structure; superfamily.
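
The data model described in the abstract (proteins chopped into domains, domains grouped into superfamilies, and a per-protein domain organisation) can be illustrated with a minimal Python sketch. This is not the CATH-Gene3D file format or API: the `Domain` record, the helper functions, and every identifier in `DOMAINS` (`protA`, `sf_1`, and so on) are hypothetical placeholders chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Domain(NamedTuple):
    """One structural domain assigned to a protein (hypothetical record layout)."""
    protein_id: str   # placeholder protein identifier
    domain_id: str    # placeholder domain identifier
    start: int        # first residue of the domain (1-based, inclusive)
    end: int          # last residue of the domain (inclusive)
    superfamily: str  # placeholder superfamily label


def group_by_superfamily(domains):
    """Map each superfamily label to the domain IDs assigned to it."""
    groups = defaultdict(list)
    for d in domains:
        groups[d.superfamily].append(d.domain_id)
    return dict(groups)


def domain_organisation(domains, protein_id):
    """Return the superfamilies of one protein's domains, ordered along the sequence."""
    own = sorted(
        (d for d in domains if d.protein_id == protein_id),
        key=lambda d: d.start,
    )
    return [d.superfamily for d in own]


# Invented example data: two proteins sharing one superfamily.
DOMAINS = [
    Domain("protA", "protA_d1", 1, 120, "sf_1"),
    Domain("protA", "protA_d2", 121, 260, "sf_2"),
    Domain("protB", "protB_d1", 5, 130, "sf_1"),
]

if __name__ == "__main__":
    print(group_by_superfamily(DOMAINS))
    print(domain_organisation(DOMAINS, "protA"))
```

In this toy data, `protA` and `protB` each carry a domain labelled `sf_1`, which is the kind of shared superfamily membership the abstract describes as evidence of an evolutionary relationship.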

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Amino Acid Sequence
Databases, Protein*
Molecular Sequence Data
Protein Structure, Tertiary
Proteins / chemistry*
Search Engine

Substances

Proteins
