Optimizing the hierarchical prediction and coding in HEVC for surveillance and conference videos with background modeling

Xianguo Zhang; Yonghong Tian; Tiejun Huang; Siwei Dong; Wen Gao

doi:10.1109/TIP.2014.2352036

IEEE Trans Image Process. 2014 Oct;23(10):4511-26. doi: 10.1109/TIP.2014.2352036. Epub 2014 Aug 26.

PMID: 25167551
DOI: 10.1109/TIP.2014.2352036

Abstract

For real-time, low-delay video surveillance and teleconferencing applications, the new video coding standard HEVC achieves much higher coding efficiency than H.264/AVC. However, we argue that the hierarchical prediction structure in the HEVC low-delay encoder does not fully exploit the special characteristics of surveillance and conference videos, which are usually captured by stationary cameras. In this case, the background picture (G-picture), which is modeled from the original input frames, can be used to further improve HEVC low-delay coding efficiency while reducing complexity. Therefore, we propose an optimization method for the hierarchical prediction and coding in HEVC for these videos with background modeling. First, several experimental and theoretical analyses are conducted on how to use the G-picture to optimize the hierarchical prediction structure and hierarchical quantization. Following these results, we propose encoding the G-picture as the long-term reference frame to improve background prediction, and then present a G-picture-based bit-allocation algorithm to increase coding efficiency. In addition, an adaptive speed-up algorithm classifies each coding unit (CU) into one of several categories according to its proportions of background and foreground pixels, and then applies a different speed-up strategy to each category to reduce encoding complexity. To evaluate performance, extensive experiments are performed on the HEVC test model. Results show that our method saves 39.09% of bits and reduces encoding complexity by 43.63% on average for surveillance videos; the corresponding figures for conference videos are 5.27% and 43.68%.
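The CU classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the background model, the thresholds, or the category names, so the per-pixel mean model, the values of `diff_threshold`, `low`, and `high`, the labels "background", "hybrid", and "foreground", and all function names below are assumptions chosen only for illustration. The speed-up strategies applied to each category are not shown.

```python
from typing import List

Frame = List[List[float]]  # grayscale picture stored as rows of pixel values


def model_background(frames: List[Frame]) -> Frame:
    """Build a background picture as the per-pixel mean of the input frames.

    Assumption: a plain mean stands in for whatever background model is used.
    """
    height, width = len(frames[0]), len(frames[0][0])
    count = len(frames)
    return [
        [sum(f[y][x] for f in frames) / count for x in range(width)]
        for y in range(height)
    ]


def foreground_ratio(frame: Frame, background: Frame, x0: int, y0: int,
                     size: int, diff_threshold: float = 10.0) -> float:
    """Fraction of pixels in the size x size block at (x0, y0) whose absolute
    difference from the background exceeds diff_threshold (hypothetical value).
    """
    foreground = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            if abs(frame[y][x] - background[y][x]) > diff_threshold:
                foreground += 1
    return foreground / (size * size)


def classify_cu(ratio: float, low: float = 0.05, high: float = 0.5) -> str:
    """Map a foreground ratio to a category label (hypothetical thresholds)."""
    if ratio <= low:
        return "background"
    if ratio >= high:
        return "foreground"
    return "hybrid"


# Toy usage: a static 4x4 scene, then a frame with a bright 2x2 object.
static = [[100.0] * 4 for _ in range(4)]
background = model_background([static, static, static])
current = [row[:] for row in static]
for y in range(2):
    for x in range(2):
        current[y][x] = 200.0

print(classify_cu(foreground_ratio(current, background, 0, 0, 2)))  # foreground
print(classify_cu(foreground_ratio(current, background, 2, 2, 2)))  # background
print(classify_cu(foreground_ratio(current, background, 0, 0, 4)))  # hybrid
```

In an encoder, such a label would presumably decide how much of the CU partitioning and mode search to run for each block, with mostly-background blocks being the natural candidates for early termination; that mapping is left out here because the abstract does not describe it.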

Publication types

Research Support, Non-U.S. Gov't