Encoder-Decoder Full Residual Deep Networks for Robust Regression and Spatiotemporal Estimation

IEEE Trans Neural Netw Learn Syst. 2021 Sep;32(9):4217-4230. doi: 10.1109/TNNLS.2020.3017200. Epub 2021 Aug 31.

Abstract

Although adding hidden layers can improve a neural network's ability to model complex nonlinear relationships, deep layers may degrade accuracy because of vanishing gradients. This degradation limits the application of deep neural networks to predicting continuous variables with a small sample size and/or weak or little invariance to translations. Inspired by the residual convolutional neural network in computer vision, we developed an encoder-decoder full residual deep network to robustly regress and predict complex spatiotemporal variables. We embedded full shortcuts from each encoding layer to its corresponding decoding layer in a systematic encoder-decoder architecture for efficient residual mapping and error signal propagation. We demonstrated, theoretically and experimentally, that the proposed network structure with full residual connections can successfully boost the backpropagation of signals and improve learning outcomes. This novel method has been extensively evaluated and compared with four commonly used methods (i.e., plain neural network, cascaded residual autoencoder, generalized additive model, and XGBoost) across different testing cases for continuous variable predictions. For model evaluation, we focused on spatiotemporal imputation of satellite aerosol optical depth with massive nonrandom missingness and spatiotemporal estimation of atmospheric fine particulate matter (PM2.5). Compared with the other approaches, our method achieved state-of-the-art accuracy, had less bias in predicting extreme values, and generated more realistic spatial surfaces. This encoder-decoder full residual deep network can be an efficient and powerful tool in a variety of applications that involve complex nonlinear relationships of continuous variables, varying sample sizes, and spatiotemporal data with weak or little invariance to translation.
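
The shortcut pattern described in the abstract, in which each encoding layer feeds its corresponding decoding layer, can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The layer widths, the ReLU activation, the additive form of the shortcut, the He-style initialization, and the names `init_params` and `forward` are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def init_params(n_in, enc_widths, n_latent, n_out, seed=0):
    """Random weights for a mirrored encoder-decoder MLP (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Decoder widths mirror the encoder widths so shortcuts can be added elementwise.
    widths = [n_in, *enc_widths, n_latent, *reversed(enc_widths), n_out]
    return [(rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
            for a, b in zip(widths[:-1], widths[1:])]

def forward(params, x, n_enc, use_shortcuts=True):
    """Forward pass; n_enc is the number of encoding layers."""
    relu = lambda z: np.maximum(z, 0.0)
    h, skips = x, []
    # Encoder: keep each layer's activation for its matching decoding layer.
    for W, b in params[:n_enc]:
        h = relu(h @ W + b)
        skips.append(h)
    # Bottleneck (latent) layer.
    W, b = params[n_enc]
    h = relu(h @ W + b)
    # Decoder: the i-th decoding layer receives the activation of the
    # encoding layer with the same width (assumed additive shortcut).
    for W, b in params[n_enc + 1 : 2 * n_enc + 1]:
        h = h @ W + b
        if use_shortcuts:
            h = h + skips.pop()
        h = relu(h)
    # Linear output layer for regression of a continuous variable.
    W, b = params[-1]
    return h @ W + b

params = init_params(n_in=8, enc_widths=[16, 8, 4], n_latent=2, n_out=1)
x = np.random.default_rng(1).normal(size=(5, 8))   # 5 samples, 8 covariates
y_hat = forward(params, x, n_enc=3)
print(y_hat.shape)  # (5, 1)
```

Setting `use_shortcuts=False` gives a plain encoder-decoder network with the same weights. Because each shortcut is an identity addition, it also provides a direct path for the error signal from a decoding layer back to its encoding layer during backpropagation, which is the mechanism the abstract credits for easing vanishing gradients.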