Optimizing the maximum reported cluster size for the multinomial-based spatial scan statistic

Int J Health Geogr. 2023 Nov 8;22(1):30. doi: 10.1186/s12942-023-00353-4.

Abstract

Background: Correctly identifying spatial disease clusters is a fundamental concern in public health and epidemiology. The spatial scan statistic is widely used for detecting spatial disease clusters in spatial epidemiology and disease surveillance. Many studies default to a maximum reported cluster size (MRCS) set at 50% of the total population when searching for spatial clusters. However, this default setting can sometimes report clusters that are larger than the true clusters and that include less relevant regions. For the Poisson, Bernoulli, ordinal, normal, and exponential models, a Gini coefficient has been developed to optimize the MRCS. Yet no such measure is available for the multinomial model.

Results: We propose two versions of a spatial cluster information criterion (SCIC) for selecting the optimal MRCS value for the multinomial-based spatial scan statistic. Our simulation study suggests that SCIC improves the accuracy of reporting true clusters. Analysis of the Korea Community Health Survey (KCHS) data further demonstrates that our method identifies more meaningful small clusters than the default setting does.

Conclusions: Our method focuses on improving the performance of the spatial scan statistic by optimizing the MRCS value when using the multinomial model. In public health and disease surveillance, the proposed method can be used to provide more accurate and meaningful spatial cluster detection for multinomial data, such as disease subtypes.
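As an illustration of how the MRCS interacts with the multinomial scan statistic, the following is a minimal sketch and not the authors' implementation. It assumes the standard multinomial log-likelihood ratio for a candidate zone and treats the "population" of a zone as its total number of observations across categories. The function names `multinomial_llr` and `reportable`, and the toy data in the usage note, are hypothetical. The SCIC itself is not defined in this abstract and is not reproduced here.

```python
import math


def multinomial_llr(inside, total):
    """Log-likelihood ratio of one candidate zone under the multinomial model.

    inside: counts per category inside the zone
    total:  counts per category over the whole study area
    """
    n_in = sum(inside)
    n_all = sum(total)
    n_out = n_all - n_in
    if n_in == 0 or n_out == 0:
        return 0.0  # degenerate zone: nothing to compare

    def term(c, n):
        # c * log(c / n), with the convention 0 * log(0) = 0
        return c * math.log(c / n) if c > 0 else 0.0

    # Alternative: category proportions differ inside and outside the zone.
    alt = sum(term(c, n_in) + term(t - c, n_out) for c, t in zip(inside, total))
    # Null: the same category proportions everywhere.
    null = sum(term(t, n_all) for t in total)
    return alt - null


def reportable(candidates, total, mrcs=0.5):
    """Keep zones whose share of all observations is at most `mrcs`,
    ranked by decreasing log-likelihood ratio (assumed filtering rule)."""
    n_all = sum(total)
    kept = [
        (name, multinomial_llr(inside, total))
        for name, inside in candidates
        if sum(inside) <= mrcs * n_all
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

For example, with `total = [100, 100]` and candidates `[("A", [20, 0]), ("B", [90, 60])]`, calling `reportable(candidates, total, mrcs=0.5)` keeps only zone A, because zone B covers 75% of all observations. Lowering `mrcs` restricts reporting to progressively smaller zones, which is the setting the proposed criterion is meant to tune.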

Keywords: Gini coefficient; Information criterion; Maximum scanning window size; SaTScan; Spatial cluster detection.

MeSH terms

  • Cluster Analysis
  • Computer Simulation
  • Disease Outbreaks*
  • Humans
  • Models, Statistical*
  • Public Health