A Variational Bayesian Deep Network with Data Self-Screening Layer for Massive Time-Series Data Forecasting

Entropy (Basel). 2022 Feb 25;24(3):335. doi: 10.3390/e24030335.

Abstract

Compared with mechanism-based modeling methods, data-driven modeling based on big data has become a popular research field in recent years because of its applicability. However, in practical applications, more data does not always yield a better forecasting model. Because big time-series data contain noise, conflicts, redundancy, and inconsistency, adding data may instead reduce forecasting accuracy. This paper proposes a deep network that improves performance by selecting and understanding data. First, a data self-screening layer (DSSL) with a maximal information distance coefficient (MIDC) is designed to select input data with high correlation and low redundancy; then, a variational Bayesian gated recurrent unit (VBGRU) is used to improve the anti-noise ability and robustness of the model. A verification experiment on 24 h PM2.5 concentration forecasting is conducted using Beijing's air quality and meteorological data, showing that the proposed model is superior to other models in accuracy.

Keywords: data self-screening layer; gated recurrent unit; maximal information distance coefficient; time-series data forecast; variational inference.