Prophet forecasting model: a machine learning approach to predict the concentration of air pollutants (PM2.5, PM10, O3, NO2, SO2, CO) in Seoul, South Korea

PeerJ. 2020 Sep 15:8:e9961. doi: 10.7717/peerj.9961. eCollection 2020.

Abstract

Amidst recent industrialization in South Korea, Seoul has experienced high levels of air pollution, an issue that is magnified by a lack of effective air pollution prediction techniques. In this study, the Prophet forecasting model (PFM) was used to predict both short-term and long-term air pollution in Seoul. The air pollutants forecasted in this study were PM2.5, PM10, O3, NO2, SO2, and CO, which are responsible for numerous health conditions upon long-term exposure. Current chemical models for predicting air pollution require complex source lists, making them difficult to use. Machine learning models have also been implemented; however, their requirement for meteorological parameters renders them ineffective, as additional models and infrastructure need to be in place to model meteorology. To address this, a model needs to be created that can accurately predict pollution based on time alone. A dataset containing three years' worth of hourly air quality measurements in Seoul was sourced from the Seoul Open Data Plaza. PFM provides the following parameters for optimizing the model: model type, changepoints, seasonality, holidays, and error. Cross-validation was performed on the 2017-18 data; then, the model predicted 2019 values. To compare the predicted and actual values and determine the accuracy of the model, the following statistical indicators were used: mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), and coverage. PFM predicted PM2.5 and PM10 with MAE values of 12.6 µg/m3 and 19.6 µg/m3, respectively. PFM also predicted SO2 and CO with MAE values of 0.00124 ppm and 0.207 ppm, respectively. PFM's MAE for PM2.5 and PM10 was approximately two times and four times lower, respectively, than that of comparable models. PFM's MAE for SO2 and CO was approximately five times and 50 times lower, respectively, than that of comparable models.
In most cases, PFM forecast the concentration of air pollutants in Seoul up to one year in advance more accurately than similar models proposed in the literature. This study addresses the limitations of the two prior PFM studies by expanding the modelled air pollutants from three to six while increasing the prediction time from 3 days to 1 year. This is also the first research to use PFM in Seoul, Korea. To achieve more accurate results, PFM needs to be used with a larger air pollution dataset. In the future, PFM should be used to predict and model air pollution in other regions, especially those without advanced infrastructure to model meteorology alongside air pollution. Seoul's government can use PFM to accurately predict air pollution concentrations and plan accordingly.
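
The statistical indicators named in the abstract (MSE, MAE, RMSE, and coverage) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample values are hypothetical, and coverage is assumed here to mean the fraction of observed values that fall inside the forecast's uncertainty interval.

```python
from math import sqrt


def forecast_errors(actual, predicted):
    """Return MSE, MAE and RMSE for paired observations and forecasts."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [a - p for a, p in zip(actual, predicted)]
    mse = sum(r * r for r in residuals) / len(residuals)
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return {"mse": mse, "mae": mae, "rmse": sqrt(mse)}


def coverage(actual, lower, upper):
    """Fraction of observations inside the forecast interval (assumed definition)."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)


# Made-up PM2.5 values in µg/m3, for illustration only (not study data).
observed = [20.0, 35.0, 50.0, 15.0]
forecast = [25.0, 30.0, 40.0, 15.0]
lower = [10.0, 20.0, 25.0, 5.0]
upper = [30.0, 40.0, 45.0, 25.0]

errors = forecast_errors(observed, forecast)
interval_coverage = coverage(observed, lower, upper)
print(errors, interval_coverage)
```

MAE is reported in the same units as the pollutant concentration (µg/m3 or ppm), which is why the abstract uses it to compare PFM against other models; MSE and RMSE weight large misses more heavily.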

Keywords: Air pollution; Carbon monoxide; Nitrogen dioxide; Particulate matter; Prediction model; Prophet forecasting model; Seoul; South Korea; Sulfur dioxide; Tropospheric ozone.

Associated data

  • figshare/10.6084/m9.figshare.12805643.v2

Grants and funding

The authors received no funding for this work.