Predicting particulate matter, nitrogen dioxide, and ozone across Great Britain with high spatiotemporal resolution based on random forest models

Sci Total Environ. 2024 May 20:926:171831. doi: 10.1016/j.scitotenv.2024.171831. Epub 2024 Mar 21.

Abstract

In Great Britain, few studies have used machine learning methods to predict air pollution, especially ozone (O3), at high spatiotemporal resolution. This study aimed to address this gap by developing random forest models for four key pollutants (fine and inhalable particulate matter [PM2.5 and PM10], nitrogen dioxide [NO2], and O3) that integrate predictors from multiple sources at a daily level and 1-km resolution. For 2006-2013, the out-of-bag R2 (root mean squared error, RMSE) between model predictions and monitoring-station measurements at the daily level was 0.85 (3.63 μg/m3) for PM2.5, 0.77 (6.00 μg/m3) for PM10, 0.85 (9.71 μg/m3) for NO2, and 0.85 (9.39 μg/m3) for maximum daily 8-h average (MDA8) O3; predictive accuracy was higher at the monthly and annual levels. The high-resolution predictions captured the characteristic spatiotemporal patterns of the four pollutants. Concentrations of PM2.5, PM10, and NO2 were higher in the densely populated southern regions of Great Britain, whereas O3 generally showed the inverse spatial pattern; monitoring stations alone could not fully depict these patterns. The predictions produced in this study could therefore improve exposure assessment in future epidemiological studies of air pollution across Great Britain by reducing exposure misclassification and allowing flexible exposure windows.
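
The out-of-bag (OOB) R2 and RMSE reported above can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name as the authors' software, and it uses synthetic data in place of the real predictors and station measurements. The function name `oob_metrics`, the number of trees, and the simulated variables are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score


def oob_metrics(X, y, n_trees=200, seed=0):
    """Fit a random forest and score it on its out-of-bag predictions.

    Each tree is trained on a bootstrap sample, so every observation is
    left out of roughly one third of the trees. Averaging the predictions
    of only those trees gives an estimate of out-of-sample accuracy
    without a separate hold-out set.
    """
    model = RandomForestRegressor(
        n_estimators=n_trees,
        bootstrap=True,
        oob_score=True,  # required for oob_prediction_ to be populated
        random_state=seed,
    )
    model.fit(X, y)
    oob_pred = model.oob_prediction_
    r2 = r2_score(y, oob_pred)
    rmse = float(np.sqrt(mean_squared_error(y, oob_pred)))
    return r2, rmse, oob_pred


# Synthetic stand-in data (assumption): 500 station-days, 4 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 10 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=500)

r2, rmse, oob_pred = oob_metrics(X, y)
print(f"OOB R2 = {r2:.2f}, OOB RMSE = {rmse:.2f}")
```

The RMSE is in the same units as the target, which is why the abstract reports it in μg/m3 alongside the unitless R2. Monthly or annual accuracy could be obtained in the same way by averaging the daily OOB predictions and the measurements per station and period before scoring.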

Keywords: Machine learning; Nitrogen dioxide; Ozone; Particulate matter; Random forest.