Combining microarray-based genomic selection (MGS) with the Illumina Genome Analyzer platform to sequence diploid target regions

Ann Hum Genet. 2009 Sep;73(Pt 5):502-13. doi: 10.1111/j.1469-1809.2009.00530.x. Epub 2009 Jul 1.

Abstract

Novel methods of targeted sequencing of unique regions from complex eukaryotic genomes have generated a great deal of excitement, but critical demonstrations of these methods' efficacy with respect to diploid genotype calling and experimental variation are lacking. To address this issue, we optimized microarray-based genomic selection (MGS) for use with the Illumina Genome Analyzer (IGA). A set of 202 fragments (304 kb total) contained within a 1.7 Mb genomic region on human chromosome X was MGS/IGA sequenced in ten female HapMap samples, generating a total of 2.4 GB of DNA sequence. At a minimum coverage threshold of 5X, 93.9% of all bases and 94.9% of segregating sites were called, while 57.7% of bases (57.4% of segregating sites) were called at a 50X threshold. Data accuracy at known segregating sites was 98.9% at 5X coverage, rising to 99.6% at 50X coverage. Accuracy at homozygous sites was 98.7% at 5X sequence coverage and 99.5% at 50X coverage. Although accuracy at heterozygous sites was modestly lower, it was still over 92% at 5X coverage and increased to nearly 97% at 50X coverage. These data provide the first demonstration that MGS/IGA sequencing can generate the very high quality sequence data necessary for human genetics research. All sequences generated in this study have been deposited in the NCBI Short Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra, Accession # SRA007913).
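The two headline metrics above, the fraction of sites called at a minimum coverage threshold and the accuracy of those calls against known genotypes, can be illustrated with a minimal sketch. This is not the study's pipeline: the function name, the data layout (dictionaries keyed by sample and position), and the toy values are all assumptions made for illustration, and accuracy is taken here simply as concordance with a known reference genotype.

```python
from typing import Dict, Optional, Tuple

Site = Tuple[str, int]  # hypothetical key: (sample ID, position)


def normalize(genotype: str) -> str:
    """Treat diploid genotypes as unordered, so 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype))


def call_rate_and_accuracy(
    depth: Dict[Site, int],
    calls: Dict[Site, str],
    known: Dict[Site, str],
    min_cov: int,
) -> Tuple[float, Optional[float]]:
    """Return (fraction of sites called, concordance with known genotypes).

    A site counts as called only if its read depth is at least min_cov.
    Accuracy is computed over called sites that also have a known genotype;
    it is None when there are no such sites.
    """
    called = [s for s, d in depth.items() if d >= min_cov and s in calls]
    call_rate = len(called) / len(depth) if depth else 0.0

    compared = [s for s in called if s in known]
    if not compared:
        return call_rate, None
    matches = sum(
        1 for s in compared if normalize(calls[s]) == normalize(known[s])
    )
    return call_rate, matches / len(compared)


# Toy data (invented for illustration, not taken from the study).
depth = {("s1", 100): 60, ("s1", 101): 7, ("s1", 102): 3, ("s1", 103): 55}
calls = {("s1", 100): "AG", ("s1", 101): "CC", ("s1", 102): "TT", ("s1", 103): "GA"}
known = {("s1", 100): "GA", ("s1", 101): "CT", ("s1", 103): "AG"}

for threshold in (5, 50):
    rate, acc = call_rate_and_accuracy(depth, calls, known, threshold)
    print(f"{threshold}X: call rate={rate:.2f}, accuracy={acc}")
```

Raising the threshold in this sketch reduces the number of sites called but removes the low-depth miscall, which mirrors the trade-off between call rate and accuracy reported at 5X versus 50X.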

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Black People / genetics
  • Chromosomes, Human, X / genetics
  • Diploidy
  • Female
  • Genome, Human*
  • Humans
  • Oligonucleotide Array Sequence Analysis / methods*
  • Sequence Analysis, DNA / methods*
  • White People / genetics