Imputation reliability on DNA biallelic markers for drug metabolism studies

BMC Bioinformatics. 2012;13(Suppl 14):S7. doi: 10.1186/1471-2105-13-S14-S7. Epub 2012 Sep 7.

Abstract

Background: Imputation is a statistical process used to predict genotypes at loci that were not directly assayed in a sample of individuals. Our goal is to measure how well imputation predicts the genotypes of the best-known gene polymorphisms involved in drug metabolism, starting from a common SNP array genotyping platform widely used in genome-wide association studies.

Methods: Thirty-nine individuals were genotyped with both the Affymetrix Genome Wide Human SNP 6.0 (AFFY) and Affymetrix DMET Plus (DMET) platforms, which contain nearly 900,000 and 1,931 markers, respectively. We used a 1000 Genomes Pilot + HapMap 3 reference panel. Imputation was performed with the program Impute, version 2. SNPs contained in DMET but not imputed were analysed by studying markers in the surrounding chromosome regions. The efficacy of the imputation was measured by evaluating the number of successfully imputed SNPs (SSNPs).

Results: The imputation predicted the genotypes of 654 SNPs not present in the AFFY array but contained in the DMET array. Approximately 1000 SNPs were not annotated in the reference panel and therefore could not be directly imputed. After testing three different imputed genotype calling thresholds (IGCT), we observed that imputation performs best at an IGCT of 50%, with a rate of SSNPs (MAF > 0.05) of 85%.
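
The role of the IGCT can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: each imputed genotype is taken to be a triple of posterior probabilities for 0, 1 or 2 copies of the coded allele, a genotype is called only when its highest probability reaches the threshold, and a call counts as correct when it matches the directly assayed genotype. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence

Probs = Sequence[float]  # assumed layout: P(0 copies), P(1 copy), P(2 copies)


def call_genotype(probs: Probs, threshold: float) -> Optional[int]:
    """Return the most probable genotype, or None if it falls below the threshold."""
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """Minor allele frequency from allele counts, ignoring missing genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def concordance(imputed: Sequence[Probs],
                observed: Sequence[Optional[int]],
                threshold: float) -> Optional[float]:
    """Fraction of called imputed genotypes that match the observed ones."""
    calls = [call_genotype(p, threshold) for p in imputed]
    pairs = [(c, o) for c, o in zip(calls, observed)
             if c is not None and o is not None]
    if not pairs:
        return None  # nothing was called at this threshold
    return sum(c == o for c, o in pairs) / len(pairs)


# Toy data for one SNP in four individuals (hypothetical values).
imputed = [(0.9, 0.1, 0.0), (0.2, 0.7, 0.1), (0.4, 0.35, 0.25), (0.05, 0.15, 0.8)]
observed = [0, 1, 1, 2]

for igct in (0.3, 0.5):
    print(igct, concordance(imputed, observed, igct))
print("MAF:", minor_allele_frequency(observed))
```

In this toy example, raising the threshold drops the one uncertain call instead of keeping it as an error, which is the trade-off between call rate and accuracy that a calling threshold controls.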

Conclusions: Polymorphisms in most of the genes involved in drug metabolism can be imputed with high efficacy using standard genome-wide genotyping platforms and imputation procedures.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Economics, Pharmaceutical
  • Genetic Markers
  • Genome, Human
  • Genome-Wide Association Study*
  • HapMap Project
  • Humans
  • Leukemia, Myeloid, Acute / drug therapy*
  • Leukemia, Myeloid, Acute / genetics*
  • Pharmacogenetics / methods*
  • Polymorphism, Single Nucleotide*
  • Reproducibility of Results
  • Software
  • Statistics as Topic / methods*

Substances

  • Genetic Markers