A Hypothesis Test for Detecting Distance-Specific Clustering and Dispersion in Areal Data

Spat Stat. 2023 Jun:55:100757. doi: 10.1016/j.spasta.2023.100757. Epub 2023 May 19.

Abstract

Spatial clustering detection has applications in diverse fields, including detecting infectious disease outbreaks, pinpointing crime hotspots, and locating clusters of neurons in brain imaging. Ripley's K-function is a popular method for detecting clustering (or dispersion) in point process data at specific distances. It measures the expected number of additional points within a given distance of a typical observed point, scaled by the intensity of the process. Clustering can be assessed by comparing the observed value of Ripley's K-function to its expected value under complete spatial randomness. Although spatial clustering analysis is most often performed on point process data, areal data arise frequently in practice and call for equally accurate assessment. Inspired by Ripley's K-function, we develop the positive area proportion function (PAPF) and use it to construct a hypothesis testing procedure for detecting spatial clustering and dispersion at specific distances in areal data. We compare the performance of the proposed PAPF hypothesis test to that of the global Moran's I statistic, the Getis-Ord general G statistic, and the spatial scan statistic in extensive simulation studies. We then evaluate the real-world performance of our method by using it to detect spatial clustering among land parcels containing conservation easements and among US counties with high pediatric overweight/obesity rates.
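
For readers unfamiliar with the point-process comparison described above, the following is a minimal sketch of it, not the PAPF procedure itself, which the abstract does not specify. It assumes a planar study region of known area, uses the naive estimator of Ripley's K-function with no edge correction, and compares it with the value expected under complete spatial randomness in two dimensions, which is pi times the squared distance. The function names `ripley_k` and `csr_k` and the two toy point patterns are hypothetical and chosen only for illustration.

```python
import numpy as np


def ripley_k(points, r, area):
    """Naive estimate of Ripley's K at distance r (no edge correction)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Count ordered pairs (i, j) with i != j that lie within distance r;
    # the n zero self-distances on the diagonal are subtracted.
    close_pairs = (dist <= r).sum() - n
    return area * close_pairs / (n * (n - 1))


def csr_k(r):
    """Value of K under complete spatial randomness in the plane."""
    return np.pi * r ** 2


# Two toy patterns in the unit square (illustrative data only).
clustered = np.array([[0.50, 0.50], [0.51, 0.50], [0.50, 0.51], [0.51, 0.51]])
dispersed = np.array([[0.1, 0.1], [0.9, 0.1], [0.1, 0.9], [0.9, 0.9]])

r = 0.05
k_clustered = ripley_k(clustered, r, area=1.0)
k_dispersed = ripley_k(dispersed, r, area=1.0)
k_random = csr_k(r)
```

An estimate above `csr_k(r)` suggests clustering at distance `r`, and an estimate below it suggests dispersion. In practice the comparison would be backed by edge correction and a Monte Carlo envelope rather than a single threshold.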

Keywords: Ripley’s K-function; areal data; cluster detection; clustering.