The UniMarker (UM) method for synteny mapping of large genomes

Bioinformatics. 2004 Nov 22;20(17):3156-65. doi: 10.1093/bioinformatics/bth380. Epub 2004 Jun 24.

Abstract

Motivation: Synteny mapping, or detecting regions that are orthologous between two genomes, is a key step in studies of comparative genomics. For completely sequenced genomes, this is increasingly accomplished by whole-genome sequence alignment. However, such methods are computationally expensive, especially for large genomes, and require rather complicated post-processing procedures to filter out non-orthologous sequence matches.

Results: We have developed a novel method that does not require sequence alignment for synteny mapping of two large genomes, such as those of human and mouse. In this method, the occurrence spectra of genome-wide unique 16mer sequences present in both the human and mouse genomes are used to directly detect orthologous genomic segments. Being sequence alignment-free, the method is very fast and is able to map the two mammalian genomes in one day of computing time on a single Pentium IV personal computer. The resulting human-mouse synteny map was shown to be in excellent agreement with those produced by the Mouse Genome Sequencing Consortium (MGSC) and by the Ensembl team; furthermore, the syntenic relationship of segments found only by our method was supported by BLASTZ sequence alignment.
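
The core idea of using 16mers that are unique within each genome and shared between both can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (unique_kmers, shared_unique_markers, group_markers), the max_gap parameter, and the greedy grouping step are all assumptions made for illustration, and the grouping is a simple stand-in for the occurrence-spectrum analysis described in the abstract. The sketch also ignores the reverse-complement strand and holds everything in memory, which would not scale to mammalian genomes.

```python
from collections import Counter


def unique_kmers(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {
        seq[i:i + k]: i
        for i in range(len(seq) - k + 1)
        if counts[seq[i:i + k]] == 1
    }


def shared_unique_markers(genome_a, genome_b, k=16):
    """Return (pos_a, pos_b) pairs for k-mers unique in both genomes.

    Pairs are sorted by position in genome_a. Forward strand only
    (a simplifying assumption of this sketch).
    """
    ua = unique_kmers(genome_a, k)
    ub = unique_kmers(genome_b, k)
    return sorted((ua[m], ub[m]) for m in ua.keys() & ub.keys())


def group_markers(pairs, max_gap):
    """Greedily group consecutive markers into candidate segments.

    A new segment starts whenever the next marker lies more than max_gap
    away from the previous one in either genome. This is a hypothetical
    heuristic, not the method from the paper.
    """
    segments = []
    current = []
    for pos_a, pos_b in pairs:
        if current:
            prev_a, prev_b = current[-1]
            if pos_a - prev_a > max_gap or abs(pos_b - prev_b) > max_gap:
                segments.append(current)
                current = []
        current.append((pos_a, pos_b))
    if current:
        segments.append(current)
    return segments
```

Because each marker occurs exactly once in each genome, every shared marker gives an unambiguous pair of coordinates, which is why no alignment step is needed to decide which positions correspond.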

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Animals
  • Chromosome Mapping / methods*
  • Chromosomes, Human, Pair 16 / genetics*
  • Conserved Sequence
  • Evolution, Molecular
  • Genome, Human
  • Genomic Islands / genetics
  • Humans
  • Mice
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Software*
  • Species Specificity
  • User-Computer Interface*