Learning Methodologies to Generate Kernel-Learning-Based Image Downscaler for Arbitrary Scaling Factors

IEEE Trans Image Process. 2021;30:4526-4539. doi: 10.1109/TIP.2021.3073316. Epub 2021 Apr 27.

Abstract

Displays and content come in various resolutions and aspect ratios, so an image downscaler is required to adaptively reduce image resolution. However, downscaling has received less research attention than upscaling, including super-resolution. In practical display systems, simple interpolation such as the bicubic filter, which cannot preserve image details well, is still widely used for image downscaling instead of frame optimization-based or learning-based methods, for the following reasons: frame optimization-based methods preserve image details effectively after downscaling but are difficult to implement because of their hardware cost, and learning-based methods have not been developed because defining a target downscaled image for training is difficult and training every downscaling factor is impossible. We propose a novel kernel-learning-based image downscaler that improves detail-preservation quality while supporting arbitrary downscaling factors through simple linear mapping. To this end, we first propose a method to produce an ideal target downscaling result that accounts for aliasing artifacts and detail preservation after downscaling. We then propose a training technique that uses the positional relationship between input and output pixels together with a hierarchical region analysis to reproduce the target images through simple kernel-based linear mapping. Lastly, a kernel-sharing technique is proposed to generate downscaling results for arbitrary downscaling factors using a minimum number of trained kernels. In the simulation results, the proposed method demonstrated excellent edge preservation, improving the recall, precision, and F1 score, which measure the edge consistency between the input and downscaled images, by up to 0.141, 0.079, and 0.053, respectively, compared with benchmark methods. In a paired-comparison-based user study, the proposed method obtained the highest preference among the benchmark methods while requiring only simple operations.
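
As an illustration of the kind of kernel-based linear mapping described in the abstract, the following Python sketch downscales a grayscale image by computing each output pixel as a weighted sum of the input pixels in a local window. It is a minimal, hypothetical example, not the authors' implementation: the fixed Gaussian-like kernel, the window size, and the function name downscale_linear are assumptions for illustration, and the paper's kernel selection by sub-pixel position and hierarchical region analysis is omitted.

    # Illustrative sketch only; a single fixed kernel stands in for the
    # learned, position- and region-dependent kernels of the paper.
    import numpy as np

    def downscale_linear(img, scale, kernel):
        """img: 2-D grayscale array; scale: downscaling factor (> 1);
        kernel: (k, k) weight array assumed to sum to 1."""
        h, w = img.shape
        out_h, out_w = int(h / scale), int(w / scale)
        k = kernel.shape[0]
        r = k // 2
        padded = np.pad(img, r, mode="edge")
        out = np.empty((out_h, out_w), dtype=np.float64)
        for y in range(out_h):
            for x in range(out_w):
                # Map the output pixel centre back to input coordinates.
                cy = int(round((y + 0.5) * scale - 0.5))
                cx = int(round((x + 0.5) * scale - 0.5))
                # Window centred on (cy, cx); linear mapping = weighted sum.
                window = padded[cy:cy + k, cx:cx + k]
                out[y, x] = np.sum(window * kernel)
        return out

    # Example usage with a 5x5 Gaussian-like placeholder kernel and a
    # non-integer downscaling factor of 1.7.
    g = np.outer(*(2 * [np.array([1, 4, 6, 4, 1], dtype=np.float64)]))
    g /= g.sum()
    small = downscale_linear(np.random.rand(128, 128), scale=1.7, kernel=g)

In the paper's method, the weights applied to each window would instead come from trained kernels shared across downscaling factors, rather than the single fixed kernel used here.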