TSA-SCC: Text Semantic-Aware Screen Content Coding With Ultra Low Bitrate

IEEE Trans Image Process. 2022;31:2463-2477. doi: 10.1109/TIP.2022.3152003. Epub 2022 Mar 18.

Abstract

Due to the rapid growth of web conferences, remote screen sharing, and online games, screen content has become an important type of internet media information, and over 90% of online media interactions are screen based. Meanwhile, as the main component of screen content, textual information takes up on average over 40% of the whole image on various commonly used screen content datasets. However, textual information is difficult to compress with traditional coding schemes such as HEVC, which assume strong spatial and temporal correlations within the image/video. State-of-the-art screen content coding (SCC) standards such as HEVC-SCC still adopt a block-based coding framework and do not consider text semantics for compression, thus inevitably blurring text at lower bitrates. In this paper, we propose a general text semantic-aware screen content coding scheme (TSA-SCC) for the ultra low bitrate setting. This method detects the abrupt picture in a screen content video (or image), recognizes the textual information (including word, position, font type, font size, and font color) in the abrupt picture with neural networks, and encodes the text with text coding tools. The other pictures, as well as the background image obtained by removing text from the abrupt picture via inpainting, are encoded with HEVC-SCC. Compared with HEVC-SCC, the proposed TSA-SCC reduces bitrate by up to 3× at a similar compression quality. Moreover, TSA-SCC achieves much better visual quality with less bitrate consumption when encoding screen content video/images at ultra low bitrates.
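
The abstract does not say how abrupt pictures are detected or which text coding tools are used, so the following is only a minimal Python sketch of two of the described steps under stated assumptions. It assumes an abrupt picture can be flagged by a mean absolute difference against the previous frame exceeding a threshold, and it stands in JSON plus zlib for the unspecified text coding tools. The names `TextItem`, `find_abrupt_pictures`, `encode_text`, and `decode_text`, and the threshold value, are hypothetical and not taken from the paper. Text recognition, inpainting, and HEVC-SCC encoding are omitted.

```python
import json
import zlib
from dataclasses import dataclass, asdict
from typing import List, Sequence

# A grayscale picture as rows of pixel values in 0..255 (assumed representation).
Frame = Sequence[Sequence[int]]


@dataclass
class TextItem:
    """One recognized text element with the attributes listed in the abstract."""
    word: str
    x: int
    y: int
    font_type: str
    font_size: int
    font_color: str


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def find_abrupt_pictures(frames: List[Frame], threshold: float = 30.0) -> List[int]:
    """Return indices of frames that differ sharply from their predecessor.

    Frame 0 is always included because it has no predecessor.
    The threshold is an illustrative value, not one from the paper.
    """
    abrupt = [0] if frames else []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            abrupt.append(i)
    return abrupt


def encode_text(items: List[TextItem]) -> bytes:
    """Serialize text items compactly (JSON + zlib as a stand-in text coder)."""
    payload = json.dumps([asdict(t) for t in items], separators=(",", ":"))
    return zlib.compress(payload.encode("utf-8"), 9)


def decode_text(blob: bytes) -> List[TextItem]:
    """Inverse of encode_text: recover the list of text items."""
    records = json.loads(zlib.decompress(blob).decode("utf-8"))
    return [TextItem(**r) for r in records]
```

In this sketch the text is carried as symbols and layout attributes rather than pixels, which is the idea the abstract credits for keeping text sharp at ultra low bitrates: the decoder can re-render the words instead of reconstructing them from heavily quantized blocks.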