Machine learning and deep learning modeling and simulation for predicting PM2.5 concentrations

Chemosphere. 2022 Dec;308(Pt 1):136353. doi: 10.1016/j.chemosphere.2022.136353. Epub 2022 Sep 6.

Abstract

Particulate matter (PM) pollution greatly endangers human physical and mental health, so accurate prediction of PM concentrations is of great practical significance. This study independently collected one year of monitoring data on six main meteorological parameters and PM2.5 concentrations at two monitoring sites in Hunan Province, central China. These datasets were then used to train, validate, and evaluate two proposed models: an extreme gradient boosting (XGBoost) machine learning model and a fully connected neural network deep learning model. The performance of the two models was compared and analyzed, and both were optimized through model parameter tuning. The XGBoost model had the better prediction ability, with R² higher than 0.761 on the complete test dataset. When the complete dataset was stratified into daytime and nighttime subsets, R² increased to 0.856 on the nighttime test dataset. The feature importance of the meteorological variables and the mechanisms by which they influence PM2.5 concentrations were analyzed and discussed.
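The R² comparison between the complete test dataset and the daytime and nighttime subsets can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 06:00 to 18:00 daytime window, and the sample records are all assumptions made for illustration, and the abstract does not state how daytime and nighttime were delimited.

```python
from collections import defaultdict


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def is_daytime(hour, start=6, end=18):
    """Assumed daytime window [start, end); not taken from the paper."""
    return start <= hour < end


def r_squared_by_period(records):
    """Compute R² separately for daytime and nighttime records.

    Each record is (hour_of_day, observed_pm25, predicted_pm25).
    """
    groups = defaultdict(lambda: ([], []))
    for hour, obs, pred in records:
        key = "daytime" if is_daytime(hour) else "nighttime"
        groups[key][0].append(obs)
        groups[key][1].append(pred)
    return {key: r_squared(obs, pred) for key, (obs, pred) in groups.items()}


# Made-up values for illustration only; not data from the study.
sample_records = [
    (2, 10.0, 10.0),
    (3, 20.0, 20.0),
    (12, 10.0, 15.0),
    (13, 20.0, 15.0),
]
scores = r_squared_by_period(sample_records)
```

In this toy input the nighttime predictions match the observations exactly while the daytime predictions equal the daytime mean, so the two subsets score differently even though they come from one set of predictions. This is the kind of gap a stratified evaluation makes visible.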

Keywords: Deep learning; Machine learning; PM2.5; XGBoost.

MeSH terms

  • Air Pollutants* / analysis
  • Deep Learning*
  • Environmental Monitoring
  • Humans
  • Machine Learning
  • Particulate Matter / analysis

Substances

  • Air Pollutants
  • Particulate Matter