Big problems in spatio-temporal disease mapping: Methods and software

Comput Methods Programs Biomed. 2023 Apr:231:107403. doi: 10.1016/j.cmpb.2023.107403. Epub 2023 Feb 3.

Abstract

Background and objective: Fitting spatio-temporal models for areal data is crucial in many fields such as cancer epidemiology. However, when data sets are very large, many issues arise. The main objective of this paper is to propose a general procedure to analyze high-dimensional spatio-temporal areal data, with special emphasis on mortality/incidence relative risk estimation.

Methods: We present a simple, pragmatic idea that permits hierarchical spatio-temporal models to be fitted when the number of small areas is very large. Model fitting is carried out using integrated nested Laplace approximations over a partition of the spatial domain. We also use parallel and distributed strategies to speed up computations in a setting where Bayesian model fitting is generally prohibitively time-consuming or even infeasible.
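
The partition-and-parallelize idea described in the Methods can be illustrated with a minimal Python sketch. It is not the bigDM API (bigDM is an R package) and it does not use integrated nested Laplace approximations. As a deliberately simple stand-in for a local hierarchical model, each partition shrinks its areas' standardized mortality ratios toward the partition's pooled ratio, using the posterior mean of a Poisson-Gamma model. The names fit_partition, fit_by_partition, and prior_strength, the "region" label, and the example counts are all hypothetical, and the sketch ignores any dependence across partition boundaries.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_partition(areas, prior_strength=10.0):
    """Shrink each area's SMR toward its partition's pooled SMR.

    Stand-in for a local hierarchical model (NOT INLA): the posterior mean
    of a Poisson-Gamma model whose Gamma prior has mean equal to the pooled
    SMR and rate `prior_strength` (a hypothetical tuning parameter).
    Assumes the partition's total expected count is positive.
    """
    total_obs = sum(a["observed"] for a in areas)
    total_exp = sum(a["expected"] for a in areas)
    pooled = total_obs / total_exp
    return {
        a["id"]: (a["observed"] + pooled * prior_strength)
        / (a["expected"] + prior_strength)
        for a in areas
    }


def fit_by_partition(areas, max_workers=4):
    """Divide the areas by region, fit each subset, and merge the results."""
    # 1. Partition the spatial domain (here: by a precomputed region label).
    partitions = defaultdict(list)
    for a in areas:
        partitions[a["region"]].append(a)

    # 2. Fit one local model per partition concurrently.
    #    Threads keep the sketch simple; a real workload would more likely
    #    use separate processes or machines.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        local_fits = list(pool.map(fit_partition, partitions.values()))

    # 3. Merge the local estimates into one map: area id -> relative risk.
    risks = {}
    for fit in local_fits:
        risks.update(fit)
    return risks


# Made-up example data: observed and expected case counts per small area.
areas = [
    {"id": "a1", "region": "north", "observed": 12, "expected": 10.0},
    {"id": "a2", "region": "north", "observed": 8, "expected": 10.0},
    {"id": "b1", "region": "south", "observed": 30, "expected": 20.0},
    {"id": "b2", "region": "south", "observed": 6, "expected": 10.0},
]
risks = fit_by_partition(areas)
```

Because each partition is fitted independently, the cost of any single fit depends on the size of the partition rather than on the total number of areas, which is what makes the approach amenable to parallel and distributed execution.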

Results: Using simulated and real data, we show that our method outperforms classical global models. The methods and algorithms we develop are implemented in the open-source R package bigDM, which includes vignettes to help non-expert users apply the methodology.

Conclusions: The proposed scalable methodology provides reliable risk estimates when fitting Bayesian hierarchical spatio-temporal models to high-dimensional data.

Keywords: Cancer epidemiology; Laplace approximations; Massive data; Non-stationary models; Scalable modelling.

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Incidence
  • Software*
  • Spatio-Temporal Analysis