Computing motif correlations in proteins

J Comput Chem. 2003 Dec;24(16):2032-43. doi: 10.1002/jcc.10332.

Abstract

Protein motifs are specific, conserved regions identified by comparing multiple protein sequences. These conserved regions generally play an important role in protein function and folding, for example through their binding properties or enzymatic activities. The aim here is to find co-occurrence correlations among protein motifs. Knowledge of which motifs and domains proteins share should shed new light on the biological functions of proteins and offer a basis for analyzing evolution in the human genome and other genomes. The protein sequences used here were obtained from the PIR-NREF database, and the protein motifs were retrieved from the PROSITE database. We apply a data mining approach to discover correlations in the occurrence of motifs in protein sequences. The mined motif correlations can be used in evolutionary analyses and in protein structure prediction; this study discusses the latter. The mined correlations are stored and maintained in a database system. The database is now available at http://bioinfo.csie.ncu.edu.tw/ProMotif/.
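
The abstract does not specify which mining algorithm is used. As an illustration only, the sketch below assumes a simple association-rule style measure (support and confidence) over pairs of motifs. The function name `motif_pair_rules`, the `min_support` parameter, and the motif and protein identifiers are all hypothetical, and the input is toy data, not taken from PIR-NREF or PROSITE.

```python
from collections import Counter
from itertools import combinations


def motif_pair_rules(protein_motifs, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for motif pairs.

    protein_motifs maps a protein ID to the set of motif IDs found in it.
    support    = fraction of proteins that contain both motifs
    confidence = fraction of proteins with the antecedent that also
                 contain the consequent
    This is an assumed, simplified measure, not the paper's method.
    """
    n = len(protein_motifs)
    single = Counter()  # number of proteins containing each motif
    pair = Counter()    # number of proteins containing each motif pair
    for motifs in protein_motifs.values():
        motifs = set(motifs)
        single.update(motifs)
        pair.update(combinations(sorted(motifs), 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        # Emit the rule in both directions; confidence is asymmetric.
        rules.append((a, b, support, count / single[a]))
        rules.append((b, a, support, count / single[b]))
    return sorted(rules)


# Toy input with placeholder identifiers (hypothetical, for illustration).
toy = {
    "protein_1": {"M1", "M2"},
    "protein_2": {"M1", "M2", "M3"},
    "protein_3": {"M1"},
    "protein_4": {"M3"},
}
rules = motif_pair_rules(toy, min_support=0.5)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy input, only the pair M1 and M2 appears together in at least half of the proteins, so only that pair yields rules. The confidence differs by direction because M1 also occurs without M2.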

MeSH terms

  • Algorithms
  • Amino Acid Motifs*
  • Amino Acid Sequence
  • Animals
  • Computational Biology / methods*
  • Computer Graphics
  • Conserved Sequence
  • Databases, Protein
  • Humans
  • Mathematical Computing
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteomics / methods*
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Statistics as Topic
  • Structural Homology, Protein
  • User-Computer Interface

Substances

  • Proteins