Removing noise and ambiguities from comparative maps in rearrangement analysis

IEEE/ACM Trans Comput Biol Bioinform. 2007 Oct-Dec;4(4):515-22. doi: 10.1109/TCBB.2007.1075.

Abstract

Comparison of genomic maps is hampered by errors and ambiguities introduced by mapping technology, incorrectly resolved paralogy, small samples of markers, and extensive genome rearrangement. We design an analysis to remove or resolve most of these problems and to extract corrected data in which markers occur in consecutive strips in both genomes. To do this, we introduce the notion of a pre-strip, an efficient way of generating pre-strips, and a compatibility analysis culminating in a Maximum Weighted Clique (MWC) search. The output can be directly analyzed with genome rearrangement algorithms, allowing the restoration of some of the data not incorporated into the clique solution. We investigate the trade-off between criteria for discarding excessive pre-strips to make MWC feasible, in terms of retaining as many markers as possible in the solution and producing an economical rearrangement analysis. We explore these questions through simulation and through comparison of the rice and sorghum genomes.
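The clique step named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not define pre-strips or the compatibility criterion, so the sketch assumes (hypothetically) that each pre-strip is summarized by the interval it spans in each genome plus a weight equal to its marker count, and that two pre-strips are compatible when their intervals overlap in neither genome. The names `PRE_STRIPS`, `compatible`, and `max_weight_clique` are invented for illustration, and the search is a plain exhaustive branch-and-bound suitable only for small inputs.

```python
# Toy maximum weighted clique over hypothetical pre-strips.
# ASSUMPTION: a pre-strip is (span in genome A, span in genome B, weight);
# the paper's actual definition and compatibility test may differ.

PRE_STRIPS = {
    "P1": ((1, 3), (5, 7), 3),
    "P2": ((2, 4), (1, 2), 2),
    "P3": ((5, 6), (3, 4), 2),
    "P4": ((4, 8), (8, 10), 4),
}


def overlaps(a, b):
    """True if closed intervals a and b share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def compatible(u, v):
    """ASSUMED rule: compatible when spans are disjoint in both genomes."""
    a_u, b_u, _ = PRE_STRIPS[u]
    a_v, b_v, _ = PRE_STRIPS[v]
    return not overlaps(a_u, a_v) and not overlaps(b_u, b_v)


def max_weight_clique(weights, is_compatible):
    """Exhaustive search for the heaviest set of pairwise compatible nodes."""
    nodes = sorted(weights)
    adj = {u: {v for v in nodes if v != u and is_compatible(u, v)} for u in nodes}
    best = (0, frozenset())

    def extend(clique, candidates, total):
        nonlocal best
        if total > best[0]:
            best = (total, clique)
        # Bound: even adding every remaining candidate cannot beat the best.
        if total + sum(weights[c] for c in candidates) <= best[0]:
            return
        for v in sorted(candidates):
            # Only later nodes adjacent to v, so each clique is visited once.
            rest = {c for c in candidates if c > v and c in adj[v]}
            extend(clique | {v}, rest, total + weights[v])

    extend(frozenset(), set(nodes), 0)
    return best


weights = {name: w for name, (_, _, w) in PRE_STRIPS.items()}
best_weight, chosen = max_weight_clique(weights, compatible)
print(best_weight, sorted(chosen))
```

In this toy instance the selected pre-strips are the heaviest mutually compatible subset; candidates left out (here those that overlap a chosen pre-strip) correspond to the data that the abstract says may be partly restored during the later rearrangement analysis. Because maximum weighted clique is NP-hard in general, an exhaustive search like this one scales poorly, which is consistent with the abstract's concern about discarding excess pre-strips to keep the search feasible.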

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Chromosome Mapping
  • Chromosomes, Plant
  • Computational Biology / methods*
  • Computer Simulation
  • Data Interpretation, Statistical
  • Gene Rearrangement
  • Genome*
  • Genome, Plant
  • Models, Genetic
  • Models, Statistical
  • Oryza / genetics
  • Reproducibility of Results
  • Sorghum / genetics
  • Species Specificity