Real-Time Video Super-Resolution with Spatio-Temporal Modeling and Redundancy-Aware Inference

Sensors (Basel). 2023 Sep 14;23(18):7880. doi: 10.3390/s23187880.

Abstract

Video super-resolution aims to generate high-resolution frames from their low-resolution counterparts. It can be regarded as a specialized application of image super-resolution and serves a range of purposes, such as video display and surveillance. This paper proposes a novel method for real-time video super-resolution that exploits spatial information through an image super-resolution model while also leveraging the temporal information inherent in videos. Specifically, the method adopts a pre-trained image super-resolution network as its backbone, building on existing expertise in single-image super-resolution. A fast temporal information aggregation module is presented to aggregate temporal cues across frames. By using deformable convolution to align the features of neighboring frames, this module exploits inter-frame dependencies. In addition, it employs hierarchical fast spatial offset feature extraction and channel-attention-based temporal fusion. A redundancy-aware inference algorithm is developed to reduce computational redundancy by reusing intermediate features, achieving real-time inference speed. Extensive experiments on several benchmarks demonstrate that the proposed method reconstructs satisfactory results with strong quantitative performance and visual quality. Its real-time inference capability makes it suitable for real-world deployment.
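To make the aggregation step concrete, the sketch below illustrates, in PyTorch (the abstract does not specify a framework), how deformable-convolution alignment and channel-attention fusion could be combined for a reference/neighbor feature pair. The module name AlignFuse, the channel counts, and the squeeze-and-excitation style attention are illustrative assumptions, not the authors' implementation.

    # Minimal sketch (not the paper's code): deformable alignment of a
    # neighboring frame's features to the reference, followed by
    # channel-attention fusion. All names and sizes are assumptions.
    import torch
    import torch.nn as nn
    from torchvision.ops import DeformConv2d

    class AlignFuse(nn.Module):
        def __init__(self, channels: int = 64, kernel_size: int = 3):
            super().__init__()
            pad = kernel_size // 2
            # Offsets are predicted from the concatenated reference/neighbor
            # features: 2*k*k offset channels for a k x k deformable kernel.
            self.offset_conv = nn.Conv2d(2 * channels,
                                         2 * kernel_size * kernel_size,
                                         kernel_size, padding=pad)
            self.deform = DeformConv2d(channels, channels,
                                       kernel_size, padding=pad)
            # Squeeze-and-excitation style channel attention over the pair.
            self.attn = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(2 * channels, channels // 4, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // 4, 2 * channels, 1),
                nn.Sigmoid(),
            )
            self.fuse = nn.Conv2d(2 * channels, channels, 1)

        def forward(self, ref: torch.Tensor, nbr: torch.Tensor) -> torch.Tensor:
            offset = self.offset_conv(torch.cat([ref, nbr], dim=1))
            aligned = self.deform(nbr, offset)    # neighbor aligned to reference
            pair = torch.cat([ref, aligned], dim=1)
            pair = pair * self.attn(pair)         # reweight channels before fusing
            return self.fuse(pair)

    # Usage: feature maps from the reference frame and one neighboring frame.
    ref = torch.randn(1, 64, 64, 64)
    nbr = torch.randn(1, 64, 64, 64)
    fused = AlignFuse()(ref, nbr)                 # -> (1, 64, 64, 64)

In a multi-frame setting, one such fusion would be applied per neighboring frame before the aggregated features are passed to the super-resolution backbone; the paper's hierarchical offset extraction and redundancy-aware feature reuse are not reproduced in this simplified sketch.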

Keywords: deep learning; deformable convolution; redundancy-aware inference; temporal aggregation; video super-resolution.