AuberGene--a sensitive genome alignment tool

Bioinformatics. 2006 Jun 15;22(12):1431-6. doi: 10.1093/bioinformatics/btl112. Epub 2006 Apr 10.

Abstract

Motivation: The accumulation of genome sequences will only accelerate in the coming years. We aim to use this abundance of data to improve the quality of genomic alignments and devise a method which is capable of detecting regions evolving under weak or no evolutionary constraints.

Results: We describe a genome alignment program, AuberGene, which explores the idea of transitivity of local alignments. The program was assessed on a 2 Mbp genomic region containing the CFTR gene in 13 species. In this region, we can identify 53% of human sequence sharing common ancestry with mouse, as compared with 44% found using the usual pairwise alignment. Between human and tetraodon, 93 orthologous exons are found, as compared with 77 detected by the pairwise human-tetraodon comparison. AuberGene allows the user to (1) identify distant, previously undetected, conserved orthologous regions such as ORFs or regulatory regions; (2) identify neutrally evolving regions in related species which are often overlooked by other alignment programs; (3) recognize false orthologous genomic regions. The increased sensitivity of the method is not obtained at the cost of reduced specificity. Our results suggest that, over the CFTR region, human shares 10% more sequence with mouse than previously thought (approximately 50%, instead of 40% found with the pairwise alignment).
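
The idea of transitivity of local alignments can be illustrated with a minimal sketch. It is not AuberGene's implementation, which the abstract does not describe. It assumes ungapped alignment blocks and uses hypothetical names (Block, compose): if a segment of species A aligns to species B, and an overlapping segment of B aligns to species C, then an A-to-C alignment can be inferred over the shared part of B, even when direct A-to-C comparison misses it.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """Hypothetical ungapped local alignment block between two sequences."""
    src_start: int  # 0-based start in the source sequence
    dst_start: int  # 0-based start in the target sequence
    length: int     # number of aligned positions


def compose(ab: List[Block], bc: List[Block]) -> List[Block]:
    """Infer A->C blocks from A->B and B->C blocks via their overlap on B."""
    inferred = []
    for x in ab:
        for y in bc:
            # Overlap of the two blocks, measured in B coordinates.
            lo = max(x.dst_start, y.src_start)
            hi = min(x.dst_start + x.length, y.src_start + y.length)
            if lo < hi:
                inferred.append(Block(
                    src_start=x.src_start + (lo - x.dst_start),
                    dst_start=y.dst_start + (lo - y.src_start),
                    length=hi - lo,
                ))
    return sorted(inferred)


# Illustrative coordinates only (not data from the paper).
a_to_b = [Block(src_start=100, dst_start=500, length=50)]
b_to_c = [Block(src_start=520, dst_start=900, length=60)]
a_to_c = compose(a_to_b, b_to_c)
```

Here the two blocks share positions 520 to 549 of the intermediate sequence, so a_to_c holds a single inferred block of length 30 that maps position 120 of A to position 900 of C. A real tool would also have to handle gaps, strand orientation, and conflicting inferences, all of which this sketch ignores.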

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology / methods*
  • Evolution, Molecular
  • Exons
  • Humans
  • Mice
  • Open Reading Frames
  • Phylogeny
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods
  • Software
  • Species Specificity