The influence of spatial patterning on modeling PM2.5 constituents in Eastern Massachusetts

Sci Total Environ. 2019 Sep 10:682:247-258. doi: 10.1016/j.scitotenv.2019.05.012. Epub 2019 May 5.

Abstract

Geostatistical exposure methods for air pollution have inherent uncertainties, which result in varying levels of exposure misclassification. In this study, we propose that areas representing clusters of PM2.5 elements are potential predictor variables for spatial models of particle composition, and that including these clusters may minimize exposure misclassification. We evaluated the influence of spatial patterning on the modeling of 10 components of ambient PM2.5: Al, Cu, Fe, K, Ni, Pb, S, Ti, V, and Zn. The study was performed in three stages. First, we applied a hybrid approach (a combination of Empirical Bayesian Kriging and land use regression) to estimate the spatial variability of each of the 10 components. In this stage, we accounted for numerous predictors representing land use, transportation, demographic, and geographical characteristics. In the second stage, we applied the same hybrid approach, adding clusters of each PM2.5 component to the set of predictor variables. The clusters were estimated by a multivariate clustering approach based on k-means. In the last stage, we compared the estimates obtained from the model without clusters (first stage) with those from the model with clusters (second stage). Overall, our findings suggest a significant influence of spatial clusters on the modeling of some PM2.5 components. The clusters may affect the prediction error and, especially, the proportion of explained variance for most of the PM2.5 constituents evaluated in this study. The model with clusters performed better for all PM2.5 components except Pb, for which the R2 value decreased by 8.51% when clusters were included, and V, for which the R2 value did not change. Models for Cu and Fe explained the largest share of concentration variance: the R2 value for the model without clusters was 0.55 for both pollutants, and accounting for clusters increased it by 13% for Cu (R2 = 0.62) and by 7% for Fe (R2 = 0.59). The models for K and S had the lowest performance both with and without clusters, although including clusters substantially improved their R2 values. Better knowledge of the influence of spatial patterns on air pollution modeling should be of interest to policy makers devising future strategies to improve the assessment of human exposure to air particulates while controlling for spatial patterns in the elemental concentrations of ambient PM2.5.
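
The three-stage comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it replaces the hybrid Empirical Bayesian Kriging and land use regression model with ordinary least squares, uses synthetic data, and clusters sites on hypothetical coordinates with a plain k-means. All names (`kmeans`, `r_squared`, `pct_change`, `land_use`, `conc`) are assumptions made for illustration. Because R2 is computed in-sample here, it cannot decrease when cluster indicators are added, so the sketch cannot reproduce the decrease reported for Pb.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def r_squared(X, y):
    """In-sample R2 of an OLS fit; an intercept column is added here."""
    X = [[1.0] + list(row) for row in X]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    pred = [sum(bi * xi for bi, xi in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def pct_change(before, after):
    """Relative change in percent, as used for the R2 comparisons."""
    return 100.0 * (after - before) / before


# Synthetic monitoring sites in three spatial groups (hypothetical data).
rng = random.Random(42)
group_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
group_effect = [0.0, 2.0, -1.5]
coords, land_use, conc = [], [], []
for i in range(150):
    g = i % 3
    cx, cy = group_centers[g]
    coords.append((cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)))
    x = rng.uniform(0, 1)
    land_use.append(x)
    conc.append(1.0 + 3.0 * x + group_effect[g] + rng.gauss(0, 0.5))

# Stage 1: model without clusters (single land-use predictor).
r2_without = r_squared([[x] for x in land_use], conc)

# Stage 2: same model plus cluster indicators (one level dropped as baseline).
labels = kmeans(coords, k=3)
levels = sorted(set(labels))[1:]
X_clusters = [
    [x] + [1.0 if lab == lev else 0.0 for lev in levels]
    for x, lab in zip(land_use, labels)
]
r2_with = r_squared(X_clusters, conc)

# Stage 3: compare the two models.
print(f"R2 without clusters: {r2_without:.2f}")
print(f"R2 with clusters:    {r2_with:.2f}")
print(f"Relative change:     {pct_change(r2_without, r2_with):.1f}%")
```

The same `pct_change` helper reproduces the rounding in the abstract: a rise from 0.55 to 0.62 is about 13%, and from 0.55 to 0.59 about 7%. A cross-validated R2, rather than the in-sample value used here, would be needed to observe a model getting worse after clusters are added.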

Keywords: Air pollution; Cluster analysis; Geostatistical interpolation; PM(2.5) components.