Detection of genomic variation by selection of a 9 Mb DNA region and high throughput sequencing

PLoS One. 2009 Aug 17;4(8):e6659. doi: 10.1371/journal.pone.0006659.

Abstract

Detection of rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in understanding genomic and phenotypic variability. We interrogated 8.9 Mb of repeat-masked sequence on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872), using an optimized method of genomic selection for high-throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15- to 20-fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.
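The concordance figures above depend on the coverage threshold applied before sequence genotypes are compared with HapMap genotypes. The following is a minimal sketch of that kind of comparison, not the authors' pipeline. The record layout, the `concordance` and `normalize` helper names, and the toy genotypes are all assumptions made for illustration and do not come from the study.

```python
from typing import Dict, Tuple

# Hypothetical layout: position -> (sequence-derived genotype, fold coverage).
SeqCalls = Dict[int, Tuple[str, int]]


def normalize(genotype: str) -> str:
    """Sort alleles so that unphased genotypes 'AG' and 'GA' compare equal."""
    return "".join(sorted(genotype.upper()))


def concordance(seq_calls: SeqCalls,
                reference: Dict[int, str],
                min_coverage: int) -> Tuple[int, float]:
    """Return (sites compared, fraction concordant) for sites whose
    coverage is at least min_coverage and that have a reference genotype."""
    compared = 0
    matches = 0
    for pos, (genotype, coverage) in seq_calls.items():
        if coverage < min_coverage or pos not in reference:
            continue
        compared += 1
        if normalize(genotype) == normalize(reference[pos]):
            matches += 1
    fraction = matches / compared if compared else float("nan")
    return compared, fraction


# Toy data (invented for illustration; not from NA12872).
seq_calls: SeqCalls = {
    100: ("AG", 20),
    200: ("CC", 5),   # low-coverage call that disagrees with the reference
    300: ("GT", 16),
    400: ("AA", 3),   # below both thresholds, never compared
    500: ("CT", 4),
}
reference = {100: "GA", 200: "CT", 300: "TG", 400: "AA", 500: "CT"}

for threshold in (4, 15):
    n, frac = concordance(seq_calls, reference, threshold)
    print(f"coverage >= {threshold}: {n} sites, {frac:.1%} concordant")
```

In this toy set, raising the threshold from 4-fold to 15-fold drops the one discordant low-coverage call, which mirrors the direction of the effect reported in the abstract (higher concordance at the cost of fewer assayable sites).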

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Chromosome Mapping
  • Chromosomes, Human, Pair 21
  • DNA / genetics*
  • Genetic Variation*
  • Genome, Human*
  • Humans
  • Polymorphism, Single Nucleotide

Substances

  • DNA