FunClust: a web server for the identification of structural motifs in a set of non-homologous protein structures

BMC Bioinformatics. 2008 Mar 26;9 Suppl 2(Suppl 2):S2. doi: 10.1186/1471-2105-9-S2-S2.

Abstract

Background: The occurrence of very similar structural motifs formed by different parts of non-homologous proteins is often indicative of a common function. Indeed, relatively small local structures can mediate binding to a common partner, be it a protein, a nucleic acid, a cofactor or a substrate. Short amino acid or nucleotide sequence motifs are relatively easy to identify in a given set of proteins or genes, and many methods exist for this purpose. Identifying common local substructures is much more challenging, especially if they are formed by residues that are not consecutive in the sequence.

Results: Here we describe a publicly available tool that identifies common structural motifs shared by different non-homologous proteins in an unsupervised mode. The motifs can be as short as three residues and need not be contiguous or even occur in the same order in the sequence. Users can submit a set of protein structures, whether or not they are thought to share a common function (e.g. they bind similar ligands, or share a common epitope). The server finds and lists structural motifs composed of three or more spatially well-conserved residues shared by at least three of the submitted structures. The method uses a local structural comparison algorithm to identify subsets of similar amino acids between each pair of input protein chains, and a clustering procedure to group similarities shared among different structure pairs.
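The clustering step can be illustrated with a minimal sketch. It is not the FunClust implementation: it assumes the pairwise comparison has already produced residue-to-residue matches, and it substitutes a plain union-find grouping for whatever clustering procedure the server uses. The function name `cluster_matches`, its parameters, the chain labels and the residue numbers are all hypothetical.

```python
from collections import defaultdict


def find(parent, node):
    """Return the representative of node's group, compressing the path."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def cluster_matches(pairwise_matches, min_structures=3, min_residues=3):
    """Group pairwise residue matches into motifs shared by several chains.

    pairwise_matches: list of (chain_a, chain_b, [(res_a, res_b), ...]),
    an assumed input format for the output of a pairwise local comparison.
    """
    parent = {}
    # Merge every pair of matched residues into the same aligned position.
    for chain_a, chain_b, pairs in pairwise_matches:
        for res_a, res_b in pairs:
            a, b = (chain_a, res_a), (chain_b, res_b)
            parent.setdefault(a, a)
            parent.setdefault(b, b)
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_a] = root_b

    # Each group is one aligned position: chain -> residues at that position.
    positions = defaultdict(lambda: defaultdict(set))
    for chain, res in parent:
        positions[find(parent, (chain, res))][chain].add(res)

    # Keep positions seen in enough chains, grouped by the chains they span.
    by_chains = defaultdict(list)
    for members in positions.values():
        if len(members) >= min_structures:
            key = tuple(sorted(members))
            by_chains[key].append({c: sorted(r) for c, r in members.items()})

    # A motif needs enough conserved positions over the same set of chains.
    return [
        {"chains": chains, "columns": columns}
        for chains, columns in by_chains.items()
        if len(columns) >= min_residues
    ]


# Made-up matches between three chains, plus one match seen in a single pair.
matches = [
    ("A", "B", [(57, 32), (102, 64), (195, 221), (12, 99)]),
    ("B", "C", [(32, 10), (64, 40), (221, 90)]),
    ("A", "C", [(57, 10), (102, 40), (195, 90)]),
]
motifs = cluster_matches(matches)
for motif in motifs:
    print(motif["chains"], len(motif["columns"]), "positions")
```

The two thresholds mirror the limits stated in the abstract (three or more residues, at least three structures). Grouping by transitive matches is a simplification: it ignores the geometric consistency that a real structural comparison would also have to check.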

Conclusions: FunClust is fast, completely sequence independent, and does not require a priori knowledge of the motif to be found. The output consists of a list of aligned structural matches displayed in both tabular and graphical form. We show examples of its usefulness by searching for the largest common structural motifs in test sets of non-homologous proteins and showing that the identified motifs correspond to a known common functional feature.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Cluster Analysis*
  • Internet*
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods
  • Proteins / chemistry*
  • Proteins / ultrastructure
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Software*

Substances

  • Proteins