An interpretable self-adaptive deep neural network for estimating daily spatially-continuous PM2.5 concentrations across China

Sci Total Environ. 2021 May 10:768:144724. doi: 10.1016/j.scitotenv.2020.144724. Epub 2021 Jan 5.

Abstract

Accurate estimation of daily spatially-continuous PM2.5 (fine particulate matter) concentrations is a prerequisite for addressing environmental public health issues, and satellite-based aerosol optical depth (AOD) products have been widely used to estimate PM2.5 concentrations with statistical or machine learning models. However, statistical models oversimplify the AOD-PM2.5 relationship, whereas complex machine learning models ignore the spatiotemporal heterogeneity of the predictors and lack interpretability. In addition, the large gaps in AOD data, which bias PM2.5 estimates, have seldom been imputed in previous studies, especially at national scales. To fill these research gaps, this study presents a feasible methodology for estimating daily spatially-continuous PM2.5 concentrations in China. The AOD data gaps across China were first imputed with a random forest (RF) model. Then, an interpretable self-adaptive deep neural network (SADNN) model, incorporating AOD, meteorological and other auxiliary predictors, was developed to estimate daily spatially-continuous PM2.5 concentrations from 2017 to 2018. Five-fold sample-based (site-based) cross-validation showed that the SADNN model was highly accurate, with a coefficient of determination of 0.86 (0.84) and a root mean square error of 13.07 (14.30) μg/m3, outperforming both the standard DNN and the RF model. Furthermore, the SADNN model identified the spatiotemporal patterns of predictor importance and showed that boundary layer height, elevation and AOD were the most important predictors both spatially and temporally. Predictor importance in the Qinghai-Tibet Plateau differed from that in the rest of China. These results enhance our understanding of AOD-PM2.5 relationships and indicate that the estimated PM2.5 datasets, which have complete coverage, are applicable to related air pollution studies and epidemiological cohort studies. Moreover, given its capacity for nonlinear modeling and its interpretability, the SADNN model is useful not only for PM2.5 estimation but also for other earth data and scenarios.
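
The difference between the sample-based and site-based cross-validation reported above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "sample-based" means individual site-day records are split at random into folds, and that "site-based" means all records from a monitoring site are held out together. The function names (`sample_based_folds`, `site_based_folds`, `r2_score`, `rmse`) and the toy site identifiers are hypothetical.

```python
import math
import random


def r2_score(obs, pred):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean square error, in the units of the inputs (e.g. ug/m3)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def sample_based_folds(n_samples, k=5, seed=0):
    """Assign each record to a fold at random, ignoring which site it came from."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [0] * n_samples
    for position, index in enumerate(order):
        folds[index] = position % k
    return folds


def site_based_folds(site_ids, k=5, seed=0):
    """Assign whole sites to folds, so a held-out site is never seen in training."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    fold_of_site = {site: position % k for position, site in enumerate(sites)}
    return [fold_of_site[site] for site in site_ids]


# Toy example (hypothetical): 10 sites with 3 daily records each.
toy_site_ids = [f"site_{s}" for s in range(10) for _ in range(3)]
toy_sample_folds = sample_based_folds(len(toy_site_ids), k=5)
toy_site_folds = site_based_folds(toy_site_ids, k=5)
```

Under these assumptions, site-based validation tests prediction at locations that have no monitor, while sample-based validation can train on other days from the same site. That would be consistent with the site-based scores being slightly lower.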

Keywords: Aerosol optical depth (AOD); Attention module; Deep learning; Gap-filling; Particulate matter; Predictor importance.