VAR-tree model based spatio-temporal characterization and prediction of O3 concentration in China

Ecotoxicol Environ Saf. 2023 Jun 1:257:114960. doi: 10.1016/j.ecoenv.2023.114960. Epub 2023 Apr 26.

Abstract

Atmospheric ozone (O3) pollution is worsening in many cities. To improve the accuracy of O3 prediction and to obtain the spatial distribution of O3 concentration over a continuous period of time, this paper proposes a VAR-XGBoost model based on vector autoregression (VAR), the Kriging method, and XGBoost (Extreme Gradient Boosting). China is used as an example, and the spatial distribution of its O3 is simulated. O3 concentration data from monitoring sites in China are first obtained, a spatial prediction method for O3 mass concentration based on the VAR-XGBoost model is then established, and finally the influencing factors are analyzed. The paper concludes that O3 has the highest correlation with PM2.5 and the lowest correlation with SO2. Among the measurement factors, wind speed and temperature are the most important factors affecting O3 pollution, and both are positively correlated with it. In addition, precipitation is negatively correlated with the 8-hour ozone concentration. The performance of the VAR-XGBoost model is evaluated using sample-based, site-based, and time-based ten-fold cross-validation, and is compared with that of the XGBoost, CatBoost (categorical boosting), ExtraTrees, GBDT (gradient boosting decision tree), AdaBoost (adaptive boosting), RF (random forest), decision tree, and LightGBM (light gradient boosting machine) models. The results show that the prediction accuracy of the VAR-XGBoost model is higher than that of the other models. The seasonal average R2 reaches 0.94 (spring), 0.93 (summer), 0.92 (autumn), and 0.93 (winter), and the average R2 for 2016 to 2021 reaches 0.95. These results indicate that the VAR-XGBoost model is well suited to simulating the spatial distribution of O3 concentrations in China.
The spatial distribution of O3 concentrations in China is clearly higher in the east and lower in the west, and it is strongly influenced by topographical factors. Seasonally, the mean concentration is clearly low in winter and high in summer. The results of this study can provide a scientific basis for the prevention and control of regional O3 pollution in China, and can also offer new ideas for acquiring data on the spatial distribution of O3 concentrations within cities.
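
The abstract does not describe how the sample-, site-, and time-based ten-fold cross-validation splits were built, so the following is only a minimal sketch of one common way to construct them, not the authors' procedure. It assumes each record carries a site identifier and a day index; the names `grouped_kfold`, `r_squared`, and the synthetic `records` list are hypothetical and introduced here for illustration. Grouping by site holds out whole stations (testing spatial generalization), grouping by day holds out whole dates (testing temporal generalization), and giving every record its own group reduces to ordinary sample-based k-fold.

```python
import random


def grouped_kfold(groups, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k folds.

    All records that share a group label are placed in the same fold,
    so a held-out group never appears in the training indices.
    """
    labels = sorted(set(groups))
    random.Random(seed).shuffle(labels)          # reproducible shuffle of group labels
    fold_of = {g: i % k for i, g in enumerate(labels)}
    for fold in range(k):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical records: 20 sites x 30 days, stored as (site_id, day_index).
records = [(site, day) for site in range(20) for day in range(30)]

sample_groups = list(range(len(records)))        # every record is its own group
site_groups = [site for site, _ in records]      # hold out whole sites
time_groups = [day for _, day in records]        # hold out whole days

site_folds = list(grouped_kfold(site_groups, k=10))
time_folds = list(grouped_kfold(time_groups, k=10))
sample_folds = list(grouped_kfold(sample_groups, k=10))
```

In an evaluation loop, a model would be fitted on the `train` indices of each fold and `r_squared` computed on the `test` indices; the three grouping schemes typically give different scores because they answer different questions about generalization.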

Keywords: O3; Spatio-temporal characteristics; Tree model; VAR; XGBoost.

MeSH terms

  • Air Pollutants* / analysis
  • Air Pollution* / analysis
  • China
  • Cities
  • Environmental Monitoring
  • Ozone* / analysis
  • Particulate Matter / analysis
  • Seasons

Substances

  • Air Pollutants
  • Ozone
  • Particulate Matter