On the sequencing of the human genome

Proc Natl Acad Sci U S A. 2002 Mar 19;99(6):3712-6. doi: 10.1073/pnas.042692499. Epub 2002 Mar 5.

Abstract

Two recent papers using different approaches reported draft sequences of the human genome. The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics adopted the whole-genome shotgun (WGS) approach. Here, we analyze whether the latter paper provides a meaningful test of the WGS approach on a mammalian genome. In the Celera paper, the authors did not analyze their own WGS data. Instead, they decomposed the HGP's assembled sequence into a "perfect tiling path", combined it with their WGS data, and assembled the merged data set. To study the implications of this approach, we perform computational analysis and find that a perfect tiling path with 2-fold coverage is sufficient to recover virtually the entirety of a genome assembly. We also examine the manner in which the assembly was anchored to the human genome and conclude that the process primarily depended on the HGP's sequence-tagged site maps, BAC maps, and clone-based sequences. Our analysis indicates that the Celera paper provides neither a meaningful test of the WGS approach nor an independent sequence of the human genome. Our analysis does not imply that a WGS approach could not be successfully applied to assemble a draft sequence of a large mammalian genome, but merely that the Celera paper does not provide such evidence.
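The tiling-path point can be illustrated with a toy simulation. The Python sketch below is only an illustration under stated assumptions, not the analysis performed in the paper or any real assembler: the function names (`shred_tiling`, `assemble`), the 10 kb random "genome", and the 100-base read length are all hypothetical choices. It shreds a known sequence into error-free reads offset by half a read length (roughly 2-fold coverage), shuffles them, and rebuilds the sequence by chaining exact half-read overlaps. A random sequence has essentially no repeats, so the sketch says nothing about how repeats in a real mammalian genome would behave.

```python
import random


def shred_tiling(genome: str, read_len: int) -> list[str]:
    """Cut a known sequence into error-free reads, each offset by half a read.

    Every interior base is then covered by exactly two reads (about 2-fold
    coverage). Assumes read_len is even and len(genome) is a multiple of
    read_len // 2, so the last read ends exactly at the end of the sequence.
    """
    step = read_len // 2
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def assemble(reads: list[str], read_len: int) -> str:
    """Rebuild the sequence by chaining exact half-read overlaps.

    Assumption: every half-read overlap is unique, which holds for the random
    toy sequence used here but not for repeat-rich real genomes.
    """
    step = read_len // 2
    by_prefix = {r[:step]: r for r in reads}   # first half -> read
    suffixes = {r[step:] for r in reads}       # all second halves
    # The leftmost read is the only one whose first half is no read's second half.
    starts = [r for r in reads if r[:step] not in suffixes]
    if len(starts) != 1:
        raise ValueError("ambiguous or missing start read (repeats or gaps?)")
    current = starts[0]
    contig = current
    # Follow the chain: the second half of one read is the first half of the next.
    while current[step:] in by_prefix:
        current = by_prefix[current[step:]]
        contig += current[step:]
    return contig


if __name__ == "__main__":
    rng = random.Random(0)                     # fixed seed for reproducibility
    genome = "".join(rng.choices("ACGT", k=10_000))
    reads = shred_tiling(genome, read_len=100)
    rng.shuffle(reads)                         # discard the original read order
    coverage = len(reads) * 100 / len(genome)
    print(f"reads: {len(reads)}, coverage: {coverage:.2f}x")
    print("recovered original sequence:", assemble(reads, 100) == genome)
```

Because the reads are cut from an already assembled sequence at regular offsets, their overlaps encode the original order directly, which is why so little coverage is enough in this idealized setting.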

MeSH terms

  • Chromosomes, Artificial, Bacterial / genetics
  • Chromosomes, Human, Pair 22 / genetics
  • Cloning, Molecular
  • Computational Biology / methods*
  • Computer Simulation
  • Genome, Human*
  • Genomics / methods
  • Human Genome Project*
  • Humans
  • Models, Genetic
  • Physical Chromosome Mapping / methods*
  • Physical Chromosome Mapping / standards
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*
  • Sequence Tagged Sites