The explainable potential of coupling hybridized metaheuristics, XGBoost, and SHAP in revealing toluene behavior in the atmosphere

Sci Total Environ. 2024 Apr 15:929:172195. doi: 10.1016/j.scitotenv.2024.172195. Online ahead of print.

Abstract

Toluene is a neurotoxic aromatic hydrocarbon and one of the major representatives of volatile organic compounds, known for its abundance, adverse health effects, and role in the formation of other atmospheric pollutants such as ozone. This research introduces an enhanced version of the reptile search metaheuristic algorithm, which was used to tune extreme gradient boosting hyperparameters in order to investigate toluene's atmospheric behavior patterns and its interactions with other polluting species under defined environmental conditions. The study is based on a two-year database of hourly concentrations of inorganic gaseous contaminants (NO, NO2, NOx, and O3), particulate matter fractions (PM1, PM2.5, and PM10), m,p-xylene, toluene, benzene, and total non-methane hydrocarbons, together with meteorological data. The experimental outcomes were validated against the results of extreme gradient boosting models optimized by seven other recent, powerful metaheuristic algorithms. The best-performing model was interpreted using the Shapley additive explanations method. The study focuses on the relationship between toluene and benzene, its most important predictor, and provides a detailed description of the environmental conditions that governed their interactions.
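
The idea behind Shapley additive explanations can be illustrated with a minimal sketch. It does not reproduce the study's model or data: the toy predictor, its coefficients, and the input and baseline values are all hypothetical, and the exact enumeration over feature subsets shown here is only practical for a handful of features (tree-based SHAP implementations use faster algorithms). Each feature's attribution is its weighted average marginal contribution when it is switched from a baseline value to its observed value.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take their observed value; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return model(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy predictor of toluene from [benzene, NOx, temperature].
# The coefficients are invented for illustration only.
def toy_model(z):
    benzene, nox, temp = z
    return 2.0 * benzene + 0.5 * nox + 0.1 * benzene * temp


x = [1.5, 40.0, 20.0]          # hypothetical observation
baseline = [1.0, 30.0, 10.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["benzene", "NOx", "temperature"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the prediction at the observation and the prediction at the baseline, and the benzene-temperature interaction term is split between those two features. This is the kind of decomposition that lets a predictor's influence be read alongside the conditions under which it acts.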

Keywords: Explainable AI; Metaheuristics; Swarm intelligence.