The prediction model for haze pollution based on stacking framework and feature extraction of time series images

Hui Wang; Guizhi Wang

doi:10.1016/j.scitotenv.2022.156003

Sci Total Environ. 2022 Sep 15:839:156003. doi: 10.1016/j.scitotenv.2022.156003. Epub 2022 May 18.

Authors

Hui Wang¹, Guizhi Wang²

Affiliations

¹ Department of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, PR China.
² Department of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, PR China. Electronic address: wgz@nuist.edu.cn.

PMID: 35595147
DOI: 10.1016/j.scitotenv.2022.156003

Abstract

In this paper, we propose a new model, the "image-feature-stacking prediction model", for predicting univariate time series data. Its main idea is to convert the univariate time series into corresponding images and then use an optimized Inception-v1 network to extract hidden features from those images. These features serve as the input variables of a two-layer stacking ensemble learning framework, which outputs the final predicted values. The main contribution of the proposed model is that it converts one-dimensional time series data into two-dimensional images and extracts features from the images automatically. This approach mines the intrinsic relationships in the data rather than relying on descriptive statistical features as a substitute for the original time series, thereby improving prediction performance. We apply the model to the prediction of daily PM2.5 concentration. For one-step prediction, in a comparison with three other time series prediction models, the proposed model attains a mean absolute percentage error (MAPE) of 19.204% and a mean absolute scaled error (MASE) of 1.242; these values are 76.607% and 77.004% lower, respectively, than the largest MAPE and MASE among the four models. We also make two-step and three-step predictions, for which the proposed model likewise shows encouraging performance.
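
The two error measures reported above can be illustrated with a minimal sketch. It assumes the common textbook definitions: the mean absolute percentage error as the mean of |y - ŷ| / |y| expressed in percent, and the mean absolute scaled error as the forecast's mean absolute error divided by the in-sample mean absolute error of a naive one-step forecast. The abstract does not state which variant the authors used, and the function names and toy numbers below are hypothetical, not taken from the paper.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true))


def mase(y_true, y_pred, y_train, m=1):
    """Mean absolute scaled error.

    The scale is the in-sample mean absolute error of a naive forecast
    that repeats the value observed m steps earlier (m=1: previous day).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    naive_mae = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y_true - y_pred)) / naive_mae


# Toy values for illustration only; these are not data from the study.
train = [10.0, 12.0, 11.0, 15.0]   # history used to scale the MASE
actual = [20.0, 25.0]              # observed values in the test period
forecast = [18.0, 30.0]            # model predictions for the test period

mape_value = mape(actual, forecast)
mase_value = mase(actual, forecast, train)
print(f"MAPE = {mape_value:.3f}%, MASE = {mase_value:.3f}")
```

Under these definitions, a MASE above 1 means the forecast's mean absolute error exceeds that of the naive in-sample forecast, so the scale depends on the training series and on the chosen lag m.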

Keywords: Feature extraction; Image conversion; Prediction model; Stacking framework; Time series data.

MeSH terms

Environmental Pollution*
Time Factors