Multivariate Spatial Prediction of Air Pollutant Concentrations with INLA

Environ Res Commun. 2021 Oct;3(10):101002. doi: 10.1088/2515-7620/ac2f92. Epub 2021 Oct 27.

Abstract

Estimates of daily air pollution concentrations with complete spatial and temporal coverage are important for supporting epidemiologic studies and health impact assessments. While numerous approaches have been developed for modeling air pollution, they typically consider each pollutant separately. We describe a spatial multipollutant data fusion model that combines monitoring measurements with chemical transport model simulations and leverages dependence between pollutants to improve spatial prediction. For the contiguous United States, we created a data product of daily concentrations for 12 pollutants (CO, NOx, NO2, SO2, O3, PM10, and PM2.5 species EC, OC, NO3, NH4, SO4) for the period 2005 to 2014. Out-of-sample prediction showed good performance, particularly for the daily PM2.5 species EC (R² = 0.64), OC (R² = 0.75), NH4 (R² = 0.84), NO3 (R² = 0.73), and SO4 (R² = 0.80). Because we employ the integrated nested Laplace approximation (INLA) for Bayesian inference, our approach also provides model-based prediction error estimates. The daily data product at 12 km spatial resolution will be publicly available immediately upon publication. To our knowledge, this is the first publicly available data product for major PM2.5 species and several gases at this spatial and temporal resolution.
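As a rough illustration of how a linear model of coregionalization (listed in the keywords) can induce dependence between pollutants, the sketch below builds each pollutant's spatial field as a linear combination of independent latent fields and evaluates the implied cross-covariance at a given distance. It is a generic textbook construction, not the authors' INLA implementation: the mixing matrix `A`, the exponential correlation function, the range values, and the function name `lmc_cross_cov` are all assumptions chosen for illustration.

```python
import math

# Hypothetical lower-triangular mixing (coregionalization) matrix for two
# pollutants: row i holds the weights pollutant i places on each latent field.
A = [[1.0, 0.0],
     [0.6, 0.8]]

# Assumed correlation ranges (km) of the two independent latent fields.
RANGES_KM = [100.0, 300.0]


def lmc_cross_cov(i, j, dist_km):
    """Covariance between pollutant i at one site and pollutant j at a site
    dist_km away, under an LMC with exponential latent correlations:
    C_ij(d) = sum_k A[i][k] * A[j][k] * exp(-d / range_k)."""
    return sum(
        A[i][k] * A[j][k] * math.exp(-dist_km / RANGES_KM[k])
        for k in range(len(RANGES_KM))
    )


# Cross-covariance between the two pollutants at the same site and 50 km apart.
same_site = lmc_cross_cov(0, 1, 0.0)
apart_50km = lmc_cross_cov(0, 1, 50.0)
```

With these illustrative values, the off-diagonal entry of `A` is what couples the two pollutants; setting it to zero would make them independent, which corresponds to modeling each pollutant separately.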

Keywords: CMAQ; Geostatistical model; INLA; Linear model of coregionalization; air pollution.