Delineation of modular proteins: domain boundary prediction from sequence information

Brief Bioinform. 2004 Jun;5(2):179-92. doi: 10.1093/bib/5.2.179.

Abstract

Delineating the domain boundaries of a sequence in the absence of a known 3D structure or detectable sequence homology to known domains benefits many areas of protein science, such as protein engineering, protein 3D structure determination and protein structure prediction. With the exponential growth in newly determined sequences, the ability to predict domain boundaries rapidly and accurately from sequence information alone is essential for gene function annotation. Anyone attempting to predict domain boundaries for a single protein sequence is invariably confronted with a plethora of internet-accessible databases containing boundary information and a variety of methods for domain boundary prediction. How are these derived and how well do they work? What definition of 'domain' do they use? We first clarify the different definitions of protein domains and then describe the available public databases with domain boundary information. Finally, we review existing domain boundary prediction methods and discuss their strengths and weaknesses.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence*
  • Databases, Protein*
  • Evolution, Molecular
  • Models, Molecular
  • Protein Structure, Tertiary*
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism

Substances

  • Proteins