Gene team tree: a hierarchical representation of gene teams for all gap lengths

J Comput Biol. 2009 Oct;16(10):1383-98. doi: 10.1089/cmb.2009.0093.

Abstract

The identification of spatially co-located gene clusters is an important step towards understanding genome evolution and function. The gene team is a popular model for conserved gene clusters that constrains the maximum distance between adjacent genes in the same cluster. Existing algorithms for finding gene teams require the specification of the maximum allowed distance, delta. However, determining suitable values of delta is non-trivial, due to varying rates of rearrangement and differences in the distribution of genes across multiple genomes. Instead of trying to determine a single best value of delta, we propose constructing the Gene Team Tree (GTT), a compact representation of gene teams for all values of delta. The teams computed can then be verified or scored using application-specific methods. Our algorithm for computing the GTT extends existing gene team mining algorithms without increasing their time complexity. We compute the GTT for E. coli K-12 and B. subtilis and show that E. coli K-12 operons are modelled by gene teams with different values of delta. We demonstrate the scalability of our method and the trade-off involved when comparing more than two genomes, through a comparative study using five gamma-proteobacteria genomes. Lastly, we describe how to compute the GTT for multi-chromosomal genomes and illustrate by computing the GTT for the human and mouse genomes. An implementation of the algorithms described in this article and the datasets used in the experiments can be downloaded from http://www.comp.nus.edu.sg/~leonghw/GTT .
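To make the role of delta concrete, the following is a minimal sketch of the gene team model for a fixed delta. It is not the algorithm from the article: it is a naive split-and-recheck procedure, and it assumes single-chromosome genomes in which every gene occurs once at a distinct numeric position. The names `split_into_chains` and `gene_teams` and the toy genomes are hypothetical, chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Assumption: a genome maps each gene name to one distinct position.
Genome = Dict[str, float]


def split_into_chains(genes: Iterable[str], positions: Genome, delta: float) -> List[List[str]]:
    """Split a non-empty gene set into maximal runs whose adjacent gaps are <= delta."""
    ordered = sorted(genes, key=lambda g: positions[g])
    chains, current = [], [ordered[0]]
    for prev, gene in zip(ordered, ordered[1:]):
        if positions[gene] - positions[prev] > delta:
            chains.append(current)  # gap too large: close the current run
            current = []
        current.append(gene)
    chains.append(current)
    return chains


def gene_teams(genomes: List[Genome], delta: float) -> List[FrozenSet[str]]:
    """Naive gene teams: keep splitting until no genome breaks a candidate set.

    Expects at least one genome. Singleton sets are reported as (trivial) teams.
    """
    common: Set[str] = set.intersection(*(set(g) for g in genomes))
    teams: List[FrozenSet[str]] = []
    pending = [common] if common else []
    while pending:
        candidate = pending.pop()
        for positions in genomes:
            chains = split_into_chains(candidate, positions, delta)
            if len(chains) > 1:
                # This genome separates the candidate; recheck every piece.
                pending.extend(set(c) for c in chains)
                break
        else:
            # No genome split the candidate, so it is a team for this delta.
            teams.append(frozenset(candidate))
    return teams


# Hypothetical toy genomes (not data from the article).
genome_1 = {"a": 1, "b": 2, "c": 4, "d": 8, "e": 9}
genome_2 = {"a": 2, "b": 1, "c": 5, "d": 9, "e": 6}

for delta in (1, 3, 4):
    teams = sorted("".join(sorted(t)) for t in gene_teams([genome_1, genome_2], delta))
    print(delta, teams)
# 1 ['ab', 'c', 'd', 'e']
# 3 ['abc', 'de']
# 4 ['abcde']
```

In this toy example each team found at a smaller delta is contained in a team found at a larger delta. That nesting holds in general, since a set whose adjacent gaps are at most delta also satisfies any larger bound, and it is the property that lets teams for all values of delta be arranged in a single hierarchy.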

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Bacillus subtilis / genetics
  • Escherichia coli / genetics
  • Evolution, Molecular
  • Gammaproteobacteria / classification
  • Gammaproteobacteria / genetics
  • Genome
  • Humans
  • Mice
  • Models, Genetic*
  • Multigene Family*
  • Operon
  • Phylogeny