Controlling Population Structure in Human Genetic Association Studies with Samples of Unrelated Individuals

Stat Interface. 2011;4(3):317-326. doi: 10.4310/sii.2011.v4.n3.a6.

Abstract

In genetic studies, associations between genotypes and phenotypes may be confounded by unrecognized population structure and/or admixture. Studies have shown that even in European populations, which are thought to be relatively homogeneous, population stratification exists and can affect the validity of association studies. A number of methods have been proposed in recent years to address this issue. Among them, the mixed-model-based approach and the principal component-based approach have several advantages over other methods. However, these approaches have not been thoroughly evaluated on large human datasets. The objectives of this study are to (1) evaluate and compare the performance of the mixed-model approach and the principal component-based approach for genetic association mapping using human data consisting of unrelated individuals, and (2) understand the relationship between these two approaches. To achieve these goals, we simulate datasets based on the HapMap data under various scenarios. Our results indicate that the mixed-model approach performs well in controlling for population structure/admixture. Its performance is similar to that of the principal component-based approach. However, the approach that combines the mixed model with principal component analysis does not perform as well as either method alone.
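
To make the confounding problem concrete, the following is a minimal sketch in Python with NumPy, not the simulation design used in the study. It assumes a toy setting of two hypothetical subpopulations with different allele frequencies and a trait that depends only on subpopulation membership, so every SNP is null. It compares the genomic inflation factor (median squared test statistic divided by the median of a 1-d.f. chi-square distribution) with and without the top principal component as a covariate. It also checks a standard linear-algebra link between the two approaches: the top principal component of the standardized genotype matrix points in the same direction as the leading eigenvector of the marker-based kinship matrix that a mixed model would use as its covariance structure. No mixed model is fitted here. All function names, sample sizes, and effect sizes are illustrative assumptions.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.

def simulate(n_per_pop=200, n_snps=1000, pop_effect=1.0, seed=0):
    """Toy data (assumption): two subpopulations, trait depends only on ancestry."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.2, 0.8, n_snps)
    shift = rng.uniform(-0.2, 0.2, n_snps)
    p1 = np.clip(base + shift, 0.05, 0.95)  # allele frequencies, subpopulation 1
    p2 = np.clip(base - shift, 0.05, 0.95)  # allele frequencies, subpopulation 2
    G = np.vstack([rng.binomial(2, p1, (n_per_pop, n_snps)),
                   rng.binomial(2, p2, (n_per_pop, n_snps))]).astype(float)
    pop = np.repeat([0.0, 1.0], n_per_pop)
    y = pop_effect * pop + rng.normal(size=2 * n_per_pop)  # no SNP is causal
    return G, y

def standardize(G):
    """Center and scale each SNP column, dropping monomorphic SNPs."""
    Z = G - G.mean(axis=0)
    sd = Z.std(axis=0)
    keep = sd > 0
    return Z[:, keep] / sd[keep]

def top_pcs(Z, k):
    """Scores on the first k principal components, via SVD."""
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * S[:k]

def t_stat(y, g, covars=None):
    """OLS t-statistic for SNP g, with an intercept and optional covariates."""
    cols = [np.ones_like(y), g]
    if covars is not None:
        cols.extend(covars.T)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

def inflation(y, G, covars=None):
    """Genomic inflation factor over the SNP columns of G."""
    t2 = [t_stat(y, G[:, j], covars) ** 2 for j in range(G.shape[1])]
    return float(np.median(t2) / CHI2_1DF_MEDIAN)

def run_demo(seed=0):
    G, y = simulate(seed=seed)
    Z = standardize(G)
    pcs = top_pcs(Z, k=1)
    K = Z @ Z.T / Z.shape[1]                 # marker-based kinship matrix
    top_vec = np.linalg.eigh(K)[1][:, -1]    # eigenvector of largest eigenvalue
    cosine = abs(pcs[:, 0] @ top_vec) / np.linalg.norm(pcs[:, 0])
    test_snps = G[:, :300]                   # subset of null SNPs to test
    return {"lambda_unadjusted": inflation(y, test_snps),
            "lambda_pc_adjusted": inflation(y, test_snps, pcs),
            "pc1_vs_kinship_eigvec": float(cosine)}

if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the unadjusted inflation factor should be well above 1, and adding the top principal component should bring it close to 1. The exact values depend on the seed and on the assumed parameters.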