Multi-Memory Convolutional Neural Network for Video Super-Resolution

IEEE Trans Image Process. 2018 Dec 17. doi: 10.1109/TIP.2018.2887017. Online ahead of print.

Abstract

Video super-resolution (SR) aims to reconstruct high-resolution (HR) frames from consecutive low-resolution (LR) frames. Most previous video SR methods based on convolutional neural networks (CNNs) use a direct connection and a single-memory module within the network, and thus fail to make full use of the spatio-temporal complementary information in the observed LR frames. To fully exploit spatio-temporal correlations between adjacent LR frames and recover more realistic details, this paper proposes a multi-memory convolutional neural network (MMCNN) for video SR, cascading an optical flow network and an image-reconstruction network. A series of residual blocks designed to exploit intra-frame spatial correlations is proposed for feature extraction and reconstruction. In particular, instead of using a single-memory module, we embed convolutional long short-term memory (ConvLSTM) into the residual block, forming a multi-memory residual block that progressively extracts and retains inter-frame temporal correlations between consecutive LR frames. We conduct extensive experiments on numerous test datasets with respect to different scaling factors. Our proposed MMCNN shows superiority over state-of-the-art methods in terms of PSNR and visual quality, surpassing the best competing method by up to 1 dB. The code and datasets are available at https://github.com/psychopa4/MMCNN.