Constructing a spatiotemporally coherent long-term PM2.5 concentration dataset over China during 1980-2019 using a machine learning approach

Huimin Li; Yang Yang; Hailong Wang; Baojie Li; Pinya Wang; Jiandong Li; Hong Liao

doi:10.1016/j.scitotenv.2020.144263

Sci Total Environ. 2021 Apr 15:765:144263. doi: 10.1016/j.scitotenv.2020.144263. Epub 2020 Dec 24.

Authors

Huimin Li¹, Yang Yang², Hailong Wang³, Baojie Li¹, Pinya Wang¹, Jiandong Li¹, Hong Liao¹

Affiliations

¹ Jiangsu Key Laboratory of Atmospheric Environment Monitoring and Pollution Control, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China.
² Jiangsu Key Laboratory of Atmospheric Environment Monitoring and Pollution Control, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China. Electronic address: yang.yang@nuist.edu.cn.
³ Atmospheric Sciences and Global Change Division, Pacific Northwest National Laboratory, Richland, WA, USA.

PMID: 33385811

Abstract

Long-term observations and satellite retrievals of health-damaging fine particulate matter are lacking in China, which creates a need for estimates of historical PM2.5 (particulate matter less than 2.5 μm in diameter) concentrations. This study constructs a gridded near-surface PM2.5 concentration dataset across China covering 1980-2019, using a space-time random forest model with atmospheric visibility observations and other auxiliary data. The modeled daily PM2.5 concentrations are in excellent agreement with ground measurements, with a coefficient of determination of 0.95 and a mean relative error of 12%. Besides atmospheric visibility, which accounts for 30% of the total variable importance in the model, emissions and meteorological conditions are also key factors affecting PM2.5 predictions. From 1980 to 2014, the model-predicted PM2.5 concentrations increased continuously, with a maximum growth rate of 5-10 μg/m³/decade over eastern China. Owing to the clean air actions, PM2.5 concentrations decreased during 2014-2019 at a rate of more than 50 μg/m³/decade in the North China Plain and 20-50 μg/m³/decade over many regions of China. The newly generated dataset of 1-degree gridded PM2.5 concentrations for the past 40 years across China provides a useful means for investigating interannual and decadal environmental and climate impacts related to aerosols.

Keywords: Atmospheric visibility; Clean air actions; Fine particulate matter; Space-time random forest model; Spatial and temporal variation.
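
The abstract reports three kinds of quantities: a coefficient of determination, a mean relative error, and concentration trends in μg/m³/decade. The sketch below shows one common way such quantities are computed. It is an illustration only, not the authors' code. The function names are hypothetical, the exact metric definitions used in the paper are assumed rather than confirmed by the abstract (for example, R² is taken here as 1 - SS_res/SS_tot, and the trend as an ordinary least-squares slope), and the numbers in the usage section are synthetic.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / observed (assumed definition).

    Returns a fraction; multiply by 100 for a percentage.
    Observed values must be non-zero.
    """
    return mean(abs(p - o) / o for o, p in zip(observed, predicted))


def trend_per_decade(years, values):
    """Ordinary least-squares slope of values against years, per decade."""
    year_mean = mean(years)
    value_mean = mean(values)
    cov = sum((y - year_mean) * (v - value_mean) for y, v in zip(years, values))
    var = sum((y - year_mean) ** 2 for y in years)
    return 10.0 * cov / var


if __name__ == "__main__":
    # Synthetic daily PM2.5 values in μg/m³, for illustration only.
    observed = [35.0, 60.0, 82.0, 47.0, 110.0]
    predicted = [38.0, 57.0, 85.0, 44.0, 104.0]
    print("R2:", round(r_squared(observed, predicted), 3))
    print("MRE (%):", round(100 * mean_relative_error(observed, predicted), 1))

    # Synthetic annual means for one grid cell, for illustration only.
    years = [2014, 2015, 2016, 2017, 2018, 2019]
    annual_pm25 = [80.0, 77.0, 74.0, 71.0, 68.0, 65.0]
    print("Trend (μg/m³/decade):", trend_per_decade(years, annual_pm25))
```

A negative value from `trend_per_decade` corresponds to a decrease, so a stated decrease of 20-50 μg/m³/decade would appear as a slope between -50 and -20 under this convention.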