Just how versatile are domains?

BMC Evol Biol. 2008 Oct 14;8:285. doi: 10.1186/1471-2148-8-285.

Abstract

Background: Creating new protein domain arrangements is a frequent mechanism of evolutionary innovation. While some domains always form the same combinations, others form many different arrangements. This ability is often referred to as the versatility or promiscuity of domains. We examine it under a random evolutionary model in which a domain's promiscuity is based on its relative frequency.

Results: We show that there is a clear relationship across genomes between the promiscuity of a given domain and its frequency. However, the strength of this relationship varies among domains. We thus redefine domain promiscuity by defining a new index, DVI ("domain versatility index"), which eliminates the effect of domain frequency. We explore links between a domain's versatility, when unlinked from abundance, and its biological properties.
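The abstract does not say how promiscuity or the versatility index is computed. As a rough illustration only, the sketch below assumes that promiscuity is counted as the number of distinct domains found directly adjacent to a domain, and that a frequency-independent score could be taken as the slope of a log-log fit of that count against frequency across genomes. Both choices are assumptions for illustration, not the authors' definition, and the names `domain_stats` and `loglog_slope` are hypothetical.

```python
import math
from collections import defaultdict


def domain_stats(proteins):
    """Return {domain: (frequency, number of distinct neighbours)}.

    `proteins` is a list of domain arrangements, each a tuple of domain
    names in N- to C-terminal order (an assumed input format).
    """
    freq = defaultdict(int)
    neighbours = defaultdict(set)
    for arrangement in proteins:
        for i, dom in enumerate(arrangement):
            freq[dom] += 1
            if i > 0:
                neighbours[dom].add(arrangement[i - 1])
            if i + 1 < len(arrangement):
                neighbours[dom].add(arrangement[i + 1])
    return {d: (freq[d], len(neighbours[d])) for d in freq}


def loglog_slope(points):
    """Least-squares slope of log(neighbours) against log(frequency).

    `points` holds one (frequency, neighbours) pair per genome for a single
    domain. Pairs with zero neighbours are skipped because log(0) is
    undefined. Returns None when a slope cannot be fitted.
    """
    usable = [(f, n) for f, n in points if f > 0 and n > 0]
    if len(usable) < 2:
        return None
    xs = [math.log(f) for f, _ in usable]
    ys = [math.log(n) for _, n in usable]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # all genomes show the same frequency
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data, invented for illustration: one small "genome" of arrangements.
toy_genome = [("A", "B"), ("A", "C"), ("A",), ("B", "A", "B")]
toy_stats = domain_stats(toy_genome)
```

In this sketch a domain that gains neighbours in step with its frequency gets a slope near 1, while one whose neighbour count stays flat as it becomes more abundant gets a slope near 0, which is one way to separate versatility from abundance.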

Conclusion: Our results indicate that domains occurring as single-domain proteins and domains appearing frequently at protein termini have a higher DVI. This is consistent with previous observations that the evolution of domain rearrangements is primarily driven by fusion of pre-existing arrangements and single domains, as well as by loss of domains at protein termini. Furthermore, we studied the link between domain age, defined as the first appearance of a domain in the species tree, and the DVI. Contrary to previous studies based on domain promiscuity, the DVI appears to be independent of age. Finally, we find that, contrary to previously reported findings, versatility is lower in Eukaryotes. In summary, our measure of domain versatility indicates that a random attachment process is sufficient to explain the observed distribution of domain arrangements and that several views on domain promiscuity need to be revised.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Archaea / genetics
  • Bacteria / genetics
  • Computational Biology
  • Databases, Protein
  • Evolution, Molecular*
  • Genome
  • Mice
  • Models, Genetic*
  • Principal Component Analysis
  • Protein Interaction Domains and Motifs / genetics*