The use and misuse of regression models in landscape genetic analyses

Mol Ecol. 2021 Jan;30(1):37-47. doi: 10.1111/mec.15716. Epub 2020 Dec 2.

Abstract

The field of landscape genetics has been rapidly evolving, adopting and adapting analytical frameworks to address research questions. Current studies are increasingly using regression-based frameworks to infer the individual contributions of landscape and habitat variables to genetic differentiation. This paper outlines appropriate and inappropriate uses of multiple regression for these purposes, and demonstrates through simulation the limitations of different analytical frameworks for making correct inferences. Of particular concern are recent studies seeking to explain genetic differences by fitting regression models with effective distance variables calculated independently on separate landscape resistance surfaces. When moving across the landscape, organisms cannot respond independently and uniquely to habitat and landscape features. Analyses seeking to understand how landscape features affect gene flow should model a single conductance or resistance surface as a parameterized function of relevant spatial covariates, and estimate the values of these parameters by linking a single set of resistance distances to observed genetic dissimilarity via a loss function. While this loss function may involve a regression-like step, the associated nuisance parameters are not interpretable in terms of organismal movement and should not be conflated with what is actually of interest: the mapping between spatial covariates and conductance/resistance. The growth and evolution of landscape genetics as a field have been rapid and exciting. It is the goal of this paper to highlight past missteps and demonstrate limitations of current approaches to ensure that future use of regression models will appropriately consider the process being modeled, and thereby clarify model interpretation.
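
The recommended workflow (one conductance surface, parameterized by spatial covariates, fitted through a loss function) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its archived data: the 8 x 8 grid, the single simulated covariate, the log-linear link conductance = exp(beta * covariate), the grid search over beta, and the function names resistance_distances and loss. For brevity the loss is an ordinary least-squares residual sum of squares, which ignores the non-independence of pairwise distances; it stands in for, and is not, the maximum likelihood population-effects model named in the keywords.

```python
import numpy as np


def resistance_distances(conductance, sites):
    """Pairwise resistance distances between sites on a 4-neighbour grid."""
    nrow, ncol = conductance.shape
    n = nrow * ncol
    idx = np.arange(n).reshape(nrow, ncol)
    lap = np.zeros((n, n))
    # Horizontal and vertical neighbour pairs; edge conductance is the
    # mean of the two cell conductances (an illustrative choice).
    pairs = [
        (idx[:, :-1], idx[:, 1:], conductance[:, :-1], conductance[:, 1:]),
        (idx[:-1, :], idx[1:, :], conductance[:-1, :], conductance[1:, :]),
    ]
    for a, b, ca, cb in pairs:
        a, b = a.ravel(), b.ravel()
        w = 0.5 * (ca.ravel() + cb.ravel())
        np.add.at(lap, (a, b), -w)
        np.add.at(lap, (b, a), -w)
        np.add.at(lap, (a, a), w)
        np.add.at(lap, (b, b), w)
    # Resistance distance from the pseudoinverse of the graph Laplacian.
    lap_pinv = np.linalg.pinv(lap)
    sub = lap_pinv[np.ix_(sites, sites)]
    d = np.diag(sub)
    return d[:, None] + d[None, :] - 2.0 * sub


def loss(beta, covariate, sites, genetic):
    """Residual sum of squares after regressing genetic on resistance distance.

    The intercept and slope are nuisance parameters; only beta describes
    how the covariate maps to conductance.
    """
    r = resistance_distances(np.exp(beta * covariate), sites)
    iu = np.triu_indices(len(sites), k=1)
    design = np.column_stack([np.ones(iu[0].size), r[iu]])
    coef, *_ = np.linalg.lstsq(design, genetic[iu], rcond=None)
    resid = genetic[iu] - design @ coef
    return float(resid @ resid)


# Simulated toy data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
covariate = rng.normal(size=(8, 8))
sites = rng.choice(covariate.size, size=12, replace=False)
true_beta = 1.0
r_true = resistance_distances(np.exp(true_beta * covariate), sites)
noise = np.triu(rng.normal(scale=0.01 * r_true.std(), size=r_true.shape), k=1)
genetic = 0.1 + 0.5 * r_true + noise + noise.T

# Estimate beta by minimising the loss over a coarse grid.
betas = np.linspace(-2.0, 2.0, 41)
losses = np.array([loss(b, covariate, sites, genetic) for b in betas])
beta_hat = float(betas[np.argmin(losses)])
print(f"true beta = {true_beta}, estimated beta = {beta_hat:.2f}")
```

The point of the design is that a single set of resistance distances is recomputed for each candidate beta, so the covariate's effect is expressed in the surface itself. The regression intercept and slope only rescale those distances to the genetic measure and carry no interpretation in terms of movement.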

Keywords: landscape genetics; landscape genomics; landscape resistance; maximum likelihood population-effects; multiple regression on distance matrices; simulation.

MeSH terms

  • Ecosystem
  • Gene Flow
  • Genetic Drift
  • Genetics, Population*
  • Models, Genetic*

Associated data

  • figshare/10.6084/m9.figshare.12844394.v1