Exploring Long- and Short-Range Temporal Information for Learned Video Compression

IEEE Trans Image Process. 2024;33:780-792. doi: 10.1109/TIP.2024.3349859. Epub 2024 Jan 15.

Abstract

Learned video compression methods have attracted considerable interest in the video coding community. Most existing algorithms focus on exploring short-range temporal information and developing strong motion compensation, but neglecting long-range temporal information constrains their compression potential. In this paper, we exploit both long- and short-range temporal information to enhance video compression performance. Specifically, to explore long-range temporal information, we propose a temporal prior that is continuously supplemented and updated during compression within a group of pictures (GOP). With this updating scheme, the temporal prior provides richer mutual information between the overall prior and the current frame for the entropy model, thus facilitating Gaussian parameter prediction. As for short-range temporal information, we propose a progressive guided motion compensation scheme to achieve robust and accurate compensation. In particular, we design a hierarchical structure to perform multi-scale compensation and, with optical flow guidance, generate pixel offsets as motion information at each scale. Additionally, the compensation result at each scale guides the compensation at the next scale, forming a flow-to-kernel, scale-by-scale guiding strategy. Extensive experimental results demonstrate that our method achieves superior rate-distortion performance, in terms of both PSNR and MS-SSIM, compared with state-of-the-art learned video compression approaches and the latest standard reference software. The code is publicly available at: https://github.com/Huairui/LSTVC.
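The GOP-level updating of the temporal prior can be pictured as a running state carried across frames. The following is a minimal NumPy sketch under our own simplifying assumptions: the fusion is a fixed running blend with a hypothetical weight `alpha`, whereas the paper's actual prior update and entropy model are learned networks; frames here stand in for feature maps.

```python
import numpy as np

def compress_gop(frames, alpha=0.8):
    """Sketch: carry a temporal prior across a GOP, updating it after
    each frame so the entropy model sees progressively richer context.

    `alpha` is a hypothetical blending weight (old prior vs. new frame);
    the real method uses a learned update, not this fixed blend.
    Returns (frame, prior_seen_by_entropy_model) pairs.
    """
    prior = np.zeros_like(frames[0])  # empty prior at the GOP start
    coded = []
    for f in frames:
        # In the real codec, the entropy model would predict Gaussian
        # parameters (mean/scale) for f's latents from `prior` here.
        coded.append((f, prior.copy()))
        # Supplement/update the prior with the just-coded frame.
        prior = alpha * prior + (1 - alpha) * f
    return coded

# Toy GOP of four constant "feature maps" with values 0, 1, 2, 3.
gop = [np.full((2, 2), float(i)) for i in range(4)]
out = compress_gop(gop)
```

The point of the sketch is only the data flow: each frame is coded against the prior accumulated so far, and the prior is refreshed immediately afterwards, so later frames in the GOP benefit from all earlier ones.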