Markov Models for inferring copy number variations from genotype data on Illumina platforms

Hui Wang; Jan H Veldink; Hylke Blauw; Leonard H van den Berg; Roel A Ophoff; Chiara Sabatti

doi:10.1159/000210445

Hum Hered. 2009;68(1):1-22. doi: 10.1159/000210445. Epub 2009 Apr 1.

Authors

Hui Wang¹, Jan H Veldink, Hylke Blauw, Leonard H van den Berg, Roel A Ophoff, Chiara Sabatti

Affiliation

¹ Department of Biostatistics, University of California at Berkeley, Berkeley, CA 94720-7358, USA. hwangui@berkeley.edu

Abstract

Background/aims: Illumina genotyping arrays provide information on DNA copy number. Current methodology for their analysis assumes linkage equilibrium across adjacent markers. This assumption is unrealistic, given the markers' high density, and can result in reduced specificity. Another limitation of current methods is that they cannot be directly applied to the analysis of multiple samples with the goal of detecting copy number polymorphisms and their association with traits of interest.

Methods: We propose a new Hidden Markov Model for Illumina genotype data that takes into account linkage disequilibrium between adjacent loci. Our framework also allows for location-specific deletion/duplication rates. When multiple samples are available, we describe a methodology for their analysis that simultaneously reconstructs the copy number states in each sample and identifies genomic locations with increased variability in copy number in the population. This approach can be extended to test association between copy number variants and a disease trait.
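For readers unfamiliar with how a Hidden Markov Model reconstructs copy number states along a chromosome, the following is a minimal generic sketch of Viterbi decoding. It is not the model proposed in this paper: it ignores genotypes and linkage disequilibrium, uses a single intensity signal per marker, and applies the same transition probabilities at every location. The state set, emission means, standard deviation, and transition and initial probabilities are all illustrative assumptions.

```python
import math

# Hypothetical copy-number states and Gaussian emission parameters for a
# log-R-ratio-like intensity signal. All values are illustrative assumptions,
# not estimates from the paper.
STATES = [1, 2, 3]                  # single-copy deletion, normal, duplication
MEANS = {1: -0.6, 2: 0.0, 3: 0.4}
SD = 0.2
STAY = 0.98                         # probability of remaining in the same state
INIT = {1: 0.01, 2: 0.98, 3: 0.01}  # initial state distribution (assumption)


def log_gauss(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_trans(prev, cur):
    """Log transition probability; switches are spread evenly over other states."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))


def viterbi(signal):
    """Return the most probable copy-number path for a list of intensities."""
    if not signal:
        return []
    score = {s: math.log(INIT[s]) + log_gauss(signal[0], MEANS[s], SD) for s in STATES}
    back = []  # back[t][cur] = best predecessor of state cur at marker t + 1
    for x in signal[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            pointers[cur] = best_prev
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_gauss(x, MEANS[cur], SD))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Five normal markers, five markers with reduced intensity, five normal.
    print(viterbi([0.0] * 5 + [-0.6] * 5 + [0.0] * 5))
```

Under these assumed parameters, a run of five low-intensity markers is called as a deletion, while a single low-intensity marker is treated as noise because the cost of switching states outweighs the gain in emission likelihood. A model in the spirit of the abstract would additionally make emissions depend on the genotype at the previous marker and let the transition probabilities vary by location.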

Results and conclusions: We show that taking into account linkage disequilibrium between adjacent markers can increase the specificity of an HMM in reconstructing copy number variants, especially single-copy deletions. Our multisample approach is computationally practical and can increase the power of association studies.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Gene Deletion
Gene Dosage*
Genotype
Humans
Linkage Disequilibrium
Markov Chains*
Polymorphism, Genetic*
