Parking strategies for genome sequencing

Genome Res. 2000 Jul;10(7):1020-30. doi: 10.1101/gr.10.7.1020.

Abstract

The parking strategy is an iterative approach to DNA sequencing. Each iteration consists of sequencing a novel portion of target DNA that does not overlap any previously sequenced region. Subject to the constraint of no overlap, each new region is chosen randomly. A parking strategy is often ideal in the early stages of a project for rapidly generating unique data. As a project progresses, parking becomes progressively more expensive and eventually prohibitive. We present a mathematical model with a generalization to allow for overlaps. This model predicts multiple parameters, including progress, costs, and the distribution of gap sizes left by a parking strategy. The highly fragmented nature of the gaps left after an initial parking strategy may make it difficult to finish a project efficiently. Therefore, in addition to our parking model, we model gap closing by walking. Our gap-closing model is generalizable to many other strategies. Our discussion includes modified parking strategies and hybrids with other strategies. A hybrid parking strategy has been employed for portions of the Human Genome Project.
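The basic iteration described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's mathematical model. It assumes a linear target of length `target_len`, fixed-length sequenced regions of length `clone_len`, and uniform random candidate positions that are rejected whenever they overlap an already sequenced region. The function names, parameters, and the use of rejected attempts as a stand-in for cost are all hypothetical choices made for illustration.

```python
import bisect
import random


def simulate_parking(target_len, clone_len, max_attempts, seed=0):
    """Rejection-sampling sketch of a no-overlap parking strategy.

    Assumption (not from the paper): every candidate region has the same
    length and its start is drawn uniformly over the target.
    Returns the sorted start positions of accepted regions and a history
    of (attempt_number, regions_parked_so_far) for each acceptance.
    """
    rng = random.Random(seed)
    starts = []   # sorted start positions of accepted (parked) regions
    history = []  # one entry per accepted region
    for attempt in range(1, max_attempts + 1):
        s = rng.uniform(0.0, target_len - clone_len)
        i = bisect.bisect_left(starts, s)
        # The candidate must clear its nearest neighbour on each side.
        left_ok = i == 0 or starts[i - 1] + clone_len <= s
        right_ok = i == len(starts) or s + clone_len <= starts[i]
        if left_ok and right_ok:
            starts.insert(i, s)
            history.append((attempt, len(starts)))
    return starts, history


def gap_sizes(starts, target_len, clone_len):
    """Lengths of the unsequenced gaps, including both ends of the target."""
    gaps = []
    prev_end = 0.0
    for s in starts:
        gaps.append(s - prev_end)
        prev_end = s + clone_len
    gaps.append(target_len - prev_end)
    return gaps


if __name__ == "__main__":
    target_len, clone_len = 1000.0, 1.0
    starts, history = simulate_parking(target_len, clone_len, max_attempts=20000)
    gaps = gap_sizes(starts, target_len, clone_len)
    coverage = len(starts) * clone_len / target_len
    print(f"regions parked: {len(starts)}, coverage: {coverage:.3f}")
    print(f"gaps: {len(gaps)}, largest gap: {max(gaps):.3f}")
    # Attempts needed for the first and last 10 acceptances, as a rough
    # picture of how the cost of finding a free region grows.
    print("first acceptances:", history[:10])
    print("last acceptances:", history[-10:])
```

In a run of this sketch, the spacing between successive entries of `history` shows how many candidate positions had to be tried per accepted region, and `gap_sizes` returns the fragmented gaps that a later gap-closing step would have to fill.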

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cloning, Molecular
  • Computer Simulation
  • Genome, Human
  • Genomic Library*
  • Humans
  • Models, Genetic*
  • Models, Statistical
  • Sequence Analysis, DNA / economics
  • Sequence Analysis, DNA / methods*
  • Sequence Tagged Sites