Using Harris hawk optimization towards support vector regression to ozone prediction

Stoch Environ Res Risk Assess. 2022;36(2):429-449. doi: 10.1007/s00477-022-02178-2. Epub 2022 Jan 30.

Abstract

JABODETABEK (Jakarta, Bogor, Depok, Tangerang, and Bekasi) experiences air pollution, in particular ozone concentrations that often exceed the threshold for healthy air, and the region seeks to prevent and control pollution and to restore air quality. This study therefore aims to build a predictive model of ozone concentration using Harris hawks optimization-support vector regression (HHO-SVR) in 14 sub-districts in JABODETABEK. Data on ozone concentration (the response variable) and meteorological factors (the predictor variables) were collected from a website that provides these data. Time and the significant lags of ozone concentration, detected with the partial autocorrelation function, were also used as predictor variables. The variables were then selected with recursive feature elimination-support vector regression (RFE-SVR) to obtain the predictor variables that significantly affect ozone concentration. Next, the prediction model was built with the HHO-SVR method, that is, support vector regression (SVR) whose parameter values are optimized with the Harris hawks optimization (HHO) algorithm. The models were evaluated, and the best model determined, using mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), coefficient of determination (R2), variance ratio (VR), and the Diebold-Mariano test. The results indicate that RFE-SVR selected lag 1, lag 2, air temperature, humidity, and UV index as significant predictor variables for most sub-districts. In general, the HHO process takes longer than other metaheuristic algorithms. On average, the HHO-SVR model yielded the best predictions in 7 of the 14 sub-districts, with MAE below 10, RMSE and MAPE below 20, R2 around 0.97, and VR around 0.98. The Diebold-Mariano test also shows that the prediction accuracy and performance stability of the HHO-SVR model are better, especially for the Ciputat and South Bekasi sub-districts, which indicates that HHO-SVR is particularly well suited to predicting ozone concentrations in these two sub-districts.
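
The point-error metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and VR is assumed here to be the ratio of the variance of the predictions to the variance of the observations, since the abstract does not give its definition. MAPE is expressed in percent and assumes no observed value is zero. The Diebold-Mariano test is not covered.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (%), R2 and VR for two equal-length sequences.

    VR is ASSUMED to be var(predictions) / var(observations); the source
    abstract does not define it.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE divides by the observed value, so observations must be non-zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)    # spread of predictions

    r2 = 1.0 - ss_res / ss_tot
    vr = ss_pred / ss_tot  # assumed definition: variance ratio (same n cancels)

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2, "VR": vr}


# Hypothetical ozone observations and predictions, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

Under this assumed definition, a VR close to 1 means the predictions reproduce the variability of the observations, which is how the reported value of about 0.98 would be read.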

Keywords: HHO; JABODETABEK; Ozone; RFE; SVR.