Computing the summed adjacency disruption number between two genomes with duplicate genes

J Comput Biol. 2010 Sep;17(9):1243-65. doi: 10.1089/cmb.2010.0098.

Abstract

The increasing number of fully sequenced genomes has led to the study of genome rearrangements. Several approaches have been proposed to solve this problem, but each is either too complex to be solved efficiently or too simple to be applied to genomes of complex organisms. The latest challenge has been handling genomes with duplicate genes, which led to the definition of matching models and similarity measures. The idea is to find a matching between the genes of two genomes that disambiguates duplicate genes, and then to calculate a similarity measure on the matched genes. The problem thus becomes that of finding a matching that best preserves gene order between the two genomes, where gene order is evaluated by a chosen similarity measure. This article presents new algorithms for computing the exact summed adjacency disruption number for two genomes with duplicate genes. Experimental results on a γ-Proteobacteria data set illustrate the approach.
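To make the matching idea concrete, here is a minimal Python sketch. It is not the algorithm of the article, which the abstract does not describe. It assumes the definition of the summed adjacency disruption (SAD) number commonly used in the literature: for every pair of adjacent genes in one genome, take the distance between their positions in the other genome, and sum these distances in both directions. Gene signs are ignored. The function names `sad_number` and `min_sad_balanced` are hypothetical, and the second function assumes a simplified matching model in which every gene family has the same number of copies in both genomes and all copies are matched. It enumerates every such matching, so it is only usable on tiny inputs.

```python
from itertools import permutations, product


def sad_number(g1, g2):
    """SAD number of two duplicate-free genomes over the same gene set.

    Assumed definition: sum, over adjacent genes in each genome, of the
    distance between their positions in the other genome.
    """
    if len(set(g1)) != len(g1) or len(g1) != len(g2) or set(g1) != set(g2):
        raise ValueError("genomes must be duplicate-free and share one gene set")
    pos1 = {gene: i for i, gene in enumerate(g1)}
    pos2 = {gene: i for i, gene in enumerate(g2)}
    d12 = sum(abs(pos2[a] - pos2[b]) for a, b in zip(g1, g1[1:]))
    d21 = sum(abs(pos1[a] - pos1[b]) for a, b in zip(g2, g2[1:]))
    return d12 + d21


def min_sad_balanced(g1, g2):
    """Smallest SAD number over all one-to-one matchings of duplicate copies.

    Assumption: each gene family occurs equally often in g1 and g2.
    Exhaustive search, exponential in the number of duplicates.
    """
    occ1, occ2 = {}, {}
    for i, gene in enumerate(g1):
        occ1.setdefault(gene, []).append(i)
    for i, gene in enumerate(g2):
        occ2.setdefault(gene, []).append(i)
    counts1 = {f: len(p) for f, p in occ1.items()}
    counts2 = {f: len(p) for f, p in occ2.items()}
    if counts1 != counts2:
        raise ValueError("each family must have equal copy numbers in both genomes")

    families = list(occ1)
    best = None
    # One permutation per family fixes which copy in g2 pairs with which copy in g1.
    for choice in product(*(permutations(occ2[f]) for f in families)):
        relabeled2 = [None] * len(g2)
        for family, perm in zip(families, choice):
            for i, j in zip(occ1[family], perm):
                relabeled2[j] = i  # matched pair gets the unique label i
        relabeled1 = list(range(len(g1)))
        score = sad_number(relabeled1, relabeled2)
        if best is None or score < best:
            best = score
    return best
```

For example, `sad_number(["a", "b", "c"], ["b", "a", "c"])` evaluates to 6 under this definition, and `min_sad_balanced(["a", "b", "a"], ["a", "b", "a"])` evaluates to 4, which is reached by matching each copy of `a` to the copy at the same position.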

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computational Biology / methods
  • Gammaproteobacteria / genetics
  • Gene Rearrangement*
  • Genes, Duplicate*
  • Genome*
  • Mathematical Concepts
  • Models, Genetic
  • Molecular Sequence Data