A densely connected causal convolutional network separating past and future data for filling missing PM2.5 time series data

Heliyon. 2024 Jan 17;10(2):e24738. doi: 10.1016/j.heliyon.2024.e24738. eCollection 2024 Jan 30.

Abstract

Air pollution poses a significant threat to human health and the environment worldwide. Accurate analysis and prediction of pollutant concentrations are essential for monitoring and managing air quality, but both depend on complete, high-quality data, which are often compromised by gaps introduced during collection. Conventional methods for handling missing data do not solve this problem adequately. Missing air quality data are commonly systematic: all data points are missing for extended periods, which makes it difficult to establish correlations and fill the gaps accurately. To address this problem, we propose a Densely Connected Causal Convolutional Network Separating Past and Future Data (DCCN-SPF), a deep learning-based model that fills in continuous runs of missing PM2.5 concentration data in the original dataset. It extracts features from the past and future data separately using densely connected causal convolutional networks, and combines linear interpolation with deep learning structures to improve prediction accuracy. Using air quality monitoring data for Beijing from the China Environmental Monitoring Station between 2017 and 2021, we compare the proposed model with baseline models and find that it outperforms them in predicting PM2.5 concentrations. Measured by mean absolute error (MAE) and root mean square error (RMSE), DCCN-SPF reduces MAE by 8.7-21.6 % and RMSE by 7.1-23.5 % relative to the baselines.

Keywords: Air quality; Deep learning; Densely connected causal convolutional network; Missing data filling; PM2.5.
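
The ingredients named in the abstract (causal convolution over separate past and future windows, linear interpolation across a continuous gap, and MAE/RMSE scoring) can be illustrated with a minimal NumPy sketch. This is not the DCCN-SPF architecture, whose layers and hyperparameters are not given in the abstract. The function names, the toy values, the fixed averaging kernel, and the choice to reverse the future window so that a causal filter can read it are all assumptions made for illustration.

```python
import numpy as np

def causal_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    x = np.asarray(x, dtype=float)
    pad = (len(kernel) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so no later sample leaks in
    out = np.zeros(len(x))
    for j, w in enumerate(kernel):
        # kernel[j] weights the sample j * dilation steps before t
        start = pad - j * dilation
        out += w * xp[start:start + len(x)]
    return out

def fill_gap_linear(past, future, gap_len):
    """Linear interpolation between the last past value and the first future value."""
    return np.linspace(past[-1], future[0], gap_len + 2)[1:-1]

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# Toy PM2.5 values (hypothetical) on either side of a 3-step gap.
past = np.array([10.0, 20.0])
future = np.array([50.0, 60.0])
truth = np.array([30.0, 35.0, 40.0])

# Past and future are processed separately. The future window is reversed
# (an assumption of this sketch) so the same causal filter summarizes the
# samples closest to the gap without mixing the two sides.
kernel = np.array([0.5, 0.5])
past_feature = causal_conv1d(past, kernel)[-1]
future_feature = causal_conv1d(future[::-1], kernel)[-1]

# Linear interpolation gives a simple estimate for the gap, which is then scored.
filled = fill_gap_linear(past, future, gap_len=3)
print(past_feature, future_feature, filled, mae(truth, filled), rmse(truth, filled))
```

In a learned model the kernel weights would be trained and the layers stacked with dense connections. The fixed averaging kernel here only shows how each output stays restricted to samples on one side of the gap.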