[A Comparison Study on Multiple Modeling Approaches for Air Pollutant Geographic Model Development in Shanghai]

Huan Jing Ke Xue. 2023 Oct 8;44(10):5370-5381. doi: 10.13227/j.hjkx.202211045.
[Article in Chinese]

Abstract

Geostatistical models are widely used to assess exposure to ambient air pollutants. However, few studies have compared modeling approaches and their prediction results. Here, we collected NO2 and PM2.5 monitoring data from 55 sites in Shanghai from 2016 to 2019, together with geographic variables such as the road network, points of interest for emission locations, and satellite data. We used partial least squares regression (PLS), supervised linear regression (SLR), and random forest (RF) algorithms to develop spatial models and used ordinary kriging (OK) to develop two-step models. We evaluated the models using 5-fold cross-validation and, for each modeling approach, selected the better structure between the one-step model (without OK) and the two-step model (with OK). The best NO2 models were the RF-OK (MSE-based R2 of 0.70-0.82) and PLS-OK (MSE-based R2 of 0.78-0.84) models; for PM2.5, the PLS model (MSE-based R2 of 0.62-0.71) outperformed the other models. We used the best models to predict annual exposures in Shanghai at a 1 km spatial scale and analyzed the correlations among their predictions. The NO2 predictions had higher correlation coefficients (r of 0.82-0.91) than the PM2.5 predictions (r of 0.66-0.96). Based on the 2019 exposures predicted by the three models, we evaluated the cumulative population exposure concentrations for NO2 and PM2.5 in Shanghai.
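The evaluation described above rests on two ingredients: a 5-fold split of the monitoring sites and an MSE-based R2 computed on held-out predictions. The abstract does not define the statistic, so the sketch below assumes the common form 1 - MSE / Var(observed), which measures agreement with the 1:1 line rather than with a fitted regression line. The function names `mse_r2` and `k_fold_indices` and the random site-to-fold assignment are illustrative assumptions, not the authors' implementation.

```python
import random
from statistics import fmean


def mse_r2(observed, predicted):
    """MSE-based R2, assumed here to be 1 - MSE / variance of the observations.

    Unlike the squared correlation, this penalizes bias: predictions must lie
    near the 1:1 line, not merely be linearly related to the observations.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    mse = fmean((o - p) ** 2 for o, p in zip(observed, predicted))
    var_obs = fmean((o - mean_obs) ** 2 for o in observed)
    if var_obs == 0:
        raise ValueError("observations have zero variance")
    return 1.0 - mse / var_obs


def k_fold_indices(n_sites, k=5, seed=0):
    """Randomly assign site indices to k folds of near-equal size (assumed scheme)."""
    indices = list(range(n_sites))
    random.Random(seed).shuffle(indices)
    # Take every k-th shuffled index so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


# Example: 55 sites, as in the abstract, split into 5 folds of 11 sites each.
folds = k_fold_indices(55, k=5)
```

In use, each fold would be held out in turn, the model (PLS, SLR, or RF, with or without the kriging step) fitted on the remaining sites, and the held-out predictions pooled before calling `mse_r2` on the pooled observed and predicted values.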

Keywords: NO2; PM2.5; geostatistical model; partial least squares regression (PLS); random forest (RF).

Publication types

  • English Abstract