We tested the use of Generalized Linear Mixed Models (GLMMs) to detect associations between genetic loci and environmental variables while taking into account the population structure of the sampled individuals. We used a simulation approach to generate datasets under demographically and selectively explicit models. These datasets were used to analyze and optimize the capacity of GLMMs to detect associations between markers and selective coefficients, treated as environmental data, in terms of false and true positive rates. We tested different sampling strategies that maximized the number of populations sampled, the number of sites sampled per population, or the number of individuals sampled per site, and determined the effect of different selective intensities on the efficiency of the method. Finally, we applied these models to an Arabidopsis thaliana SNP dataset from different accessions to search for loci associated with spring minimum temperature. We identified 25 regions that exhibit unusual correlations with the climatic variable and contain genes with functions related to temperature stress.
Keywords: Adaptation; GLMM; Genome scan; SNP.
Copyright © 2014 Elsevier Inc. All rights reserved.