Refinement of optical map assemblies

Bioinformatics. 2006 May 15;22(10):1217-24. doi: 10.1093/bioinformatics/btl063. Epub 2006 Feb 24.

Abstract

Motivation: Genomic mutations and variations provide insight into the functionality of sequence elements and their association with human diseases. Traditionally, variations are identified through analysis of short DNA sequences, usually shorter than 1000 bp per fragment. Optical maps provide a faster and more cost-efficient means of detecting such differences, because a single map can span more than 1 million bp. Optical maps are assembled to cover the whole genome, and the accuracy of the assembly is critical.

Results: We present a computationally efficient model-based method for improving the quality of such assemblies. Our method provides very high accuracy even with moderate coverage (<20×). We use a hidden Markov model to represent the consensus map and the expectation-maximization algorithm to drive the refinement process. We also provide quality scores to assess the quality of the finished map.
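To give a concrete feel for how expectation-maximization can refine a consensus estimate, here is a minimal sketch in Python. It is not the authors' hidden Markov model: it is a deliberately simplified, assumed two-component mixture in which each sizing of a single consensus fragment is either an inlier (Gaussian around the true length) or an outlier (uniform over a fixed range). The function name `em_consensus_length` and the parameters `sigma`, `outlier_span`, and `inlier_prior` are hypothetical and chosen only for illustration.

```python
import math

def em_consensus_length(observed, outlier_span, sigma=1.0, inlier_prior=0.9, iters=50):
    """Toy EM for one consensus fragment length (illustrative model, not the paper's HMM).

    Assumed model: each observation is an inlier ~ Normal(mu, sigma^2) with
    probability w, or an outlier ~ Uniform over a range of width outlier_span.
    """
    mu = sorted(observed)[len(observed) // 2]  # initialize at the (upper) median
    w = inlier_prior
    resp = [1.0] * len(observed)
    norm = sigma * math.sqrt(2.0 * math.pi)
    for _ in range(iters):
        # E-step: posterior probability that each observation is an inlier
        resp = []
        for x in observed:
            p_in = w * math.exp(-0.5 * ((x - mu) / sigma) ** 2) / norm
            p_out = (1.0 - w) / outlier_span
            resp.append(p_in / (p_in + p_out))
        # M-step: re-estimate the fragment length and the inlier fraction
        total = sum(resp)
        mu = sum(r * x for r, x in zip(resp, observed)) / total
        # Keep w strictly inside (0, 1) so both components stay defined
        w = min(max(total / len(observed), 1e-6), 1.0 - 1e-6)
    return mu, w, resp

# Hypothetical sizings (in kb) of one fragment from six aligned maps;
# the last value stands in for a misaligned or mis-sized fragment.
sizes_kb = [10.1, 9.8, 10.3, 9.9, 10.0, 25.0]
mu, w, resp = em_consensus_length(sizes_kb, outlier_span=50.0, sigma=0.5)
print(round(mu, 2), [round(r, 3) for r in resp])
```

In this toy setting the posterior weights in `resp` play a role loosely analogous to a per-measurement confidence: values near zero flag sizings that the fitted model considers unreliable. The quality scores described in the abstract are defined by the authors' own model and may be computed quite differently.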

Availability: Code is available from www.cmb.usc.edu/people/valouev/

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Computer Simulation
  • DNA / chemistry*
  • DNA / genetics*
  • DNA / ultrastructure
  • DNA Mutational Analysis / methods*
  • Image Interpretation, Computer-Assisted / methods*
  • Microfluidic Analytical Techniques / methods
  • Microscopy, Fluorescence / methods*
  • Models, Genetic
  • Optics and Photonics
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*

Substances

  • DNA