Reclaiming independence in spatial-clustering datasets: A series of data-driven spatial weights matrices

Stat Med. 2022 Jul 10;41(15):2939-2956. doi: 10.1002/sim.9395. Epub 2022 Mar 28.

Abstract

Most spatial models include a spatial weights matrix (W) derived from the first law of geography to adjust for spatial dependence so that the independence assumption is fulfilled. In various fields, such as epidemiological and environmental studies, the spatial dependence often shows clustering (or geographic discontinuity) due to natural or social factors. In such cases, adjustment using the first-law-of-geography-based W might be inappropriate and lead to inaccurate estimates and loss of statistical power. In this work, we propose a series of data-driven Ws (DDWs) built following the spatial pattern identified by the scan statistic, which can be easily carried out using existing tools such as the SaTScan software. The DDWs take both the clustering (or discontinuous) spatial dependence and the intuitive first-law-of-geography-based spatial dependence into consideration. Aiming at two common purposes in epidemiological studies (i.e., estimating the effect of an explanatory variable X and estimating the risk of each spatial unit in disease mapping), the common spatial autoregressive models and the Leroux-prior-based conditional autoregressive (CAR) models, respectively, were selected to evaluate the performance of the DDWs. Both simulation and case studies show that our DDWs achieve considerably better performance than the classic W in datasets with clustering (or discontinuous) spatial dependence. Furthermore, the most recently published density-based spatial clustering models, which aim to deal with such clustering (or discontinuous) spatial dependence in disease mapping, were also compared as references. The DDWs, incorporated into the CAR models, still show a considerable advantage, especially in datasets for common diseases.
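
To make the general idea concrete, the following is a minimal sketch of one plausible way to combine a contiguity-based W with cluster membership. It is not the construction defined in the paper, which the abstract does not specify. The sketch assumes a regular 3 x 3 grid of spatial units, a binary rook-contiguity W, and a hypothetical vector `labels` standing in for cluster membership exported from a scan-statistic tool such as SaTScan. The function names `rook_weights`, `cluster_aware_weights`, and `row_standardize` are illustrative only.

```python
import numpy as np


def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity W for a regular grid (units sharing an edge)."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:  # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < n_rows:  # neighbour below
                W[i, i + n_cols] = W[i + n_cols, i] = 1.0
    return W


def cluster_aware_weights(W, labels):
    """Keep a contiguity link only if both units share a cluster label.

    Assumption for illustration: links crossing a cluster boundary are
    cut entirely. Other rules (e.g. down-weighting) are equally possible.
    """
    labels = np.asarray(labels)
    same_cluster = labels[:, None] == labels[None, :]
    return W * same_cluster


def row_standardize(W):
    """Scale each row to sum to 1; rows without neighbours stay all zero."""
    sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, sums, out=np.zeros_like(W), where=sums > 0)


# Hypothetical cluster labels: units 0, 1, 3, 4 (top-left 2 x 2 block)
# form one detected cluster (label 1); the rest are background (label 0).
labels = [1, 1, 0,
          1, 1, 0,
          0, 0, 0]

W = rook_weights(3, 3)                  # classic first-law-based W
D = cluster_aware_weights(W, labels)    # cluster-aware variant
D_std = row_standardize(D)
```

In this toy setting, the cluster-aware matrix keeps neighbour links inside the detected cluster and inside the background, while links that cross the cluster boundary are dropped, so the discontinuity is reflected in the weights passed to a spatial autoregressive or CAR model.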

Keywords: clustering spatial dependence; conditional autoregressive model; disease mapping; spatial autoregressive model; spatial weights matrix.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Computer Simulation
  • Geography
  • Humans
  • Software*
  • Spatial Analysis