State-of-the-art image and video quality assessment with a metric based on an intrinsically non-linear neural summation model

Front Neurosci. 2023 Jul 25;17:1222815. doi: 10.3389/fnins.2023.1222815. eCollection 2023.

Abstract

The development of automatic methods for image and video quality assessment that correlate well with the perception of human observers is a challenging open problem in vision science, with numerous practical applications in disciplines such as image processing and computer vision, as well as in the media industry. Over the past two decades, the goal of image quality research has been to improve upon classical metrics by developing models that emulate aspects of the visual system, and while progress has been considerable, state-of-the-art quality assessment methods still share a number of shortcomings: their performance drops considerably when they are tested on a database quite different from the one used to train them, and they have significant limitations in predicting observer scores for high-frame-rate videos. In this work we propose a novel objective method for image and video quality assessment based on the recently introduced Intrinsically Non-linear Receptive Field (INRF) formulation, a neural summation model that has been shown to predict neural activity and visual perception phenomena better than the classical linear receptive field. We first optimize, on a classic image quality database, the four parameters of a very simple INRF-based metric, and then test this metric on three other databases, showing that its performance equals or surpasses that of state-of-the-art methods, some of which have millions of parameters. Next, we extend this INRF image quality metric to the temporal domain and test it on several popular video quality datasets; again, the results of our proposed INRF-based video quality metric prove very competitive.
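To make the idea concrete, the sketch below illustrates the general INRF structure described in the abstract: a response built from a linear summation term minus a second summation term in which a pointwise nonlinearity is applied inside the summation, which is what distinguishes the INRF from a classical linear receptive field. Everything specific here is an assumption for illustration only: the Gaussian kernel shapes, the sign-preserving power nonlinearity, the parameter names (`sigma_m`, `sigma_w`, `lam`, `p`, a stand-in for "four parameters"), and the RMSE-of-responses quality score are not taken from the paper, whose actual metric this abstract does not specify.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with edge replication (an arbitrary
    boundary choice for this sketch)."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    conv = lambda r: np.convolve(np.pad(r, pad, mode="edge"), k, mode="valid")
    out = np.apply_along_axis(conv, 0, img)
    return np.apply_along_axis(conv, 1, out)

def inrf_response(img, sigma_m=1.0, sigma_w=3.0, lam=1.0, p=0.7):
    """Illustrative INRF-style response: a linear center term minus a
    lambda-weighted summation of a nonlinearity applied *inside* the
    surround summation. Kernels and nonlinearity are assumptions."""
    linear = blur(img, sigma_m)                    # linear term: m * I
    diff = img - blur(img, sigma_w)                # local signal vs. surround
    sigma_nl = np.sign(diff) * np.abs(diff) ** p   # pointwise nonlinearity
    nonlinear = blur(sigma_nl, sigma_w)            # w-weighted sum of sigma(.)
    return linear - lam * nonlinear

def inrf_quality(ref, dist, **params):
    """Toy full-reference score: RMSE between the two INRF responses
    (0 for identical inputs; larger means more perceptual difference)."""
    r_ref = inrf_response(ref.astype(float), **params)
    r_dist = inrf_response(dist.astype(float), **params)
    return float(np.sqrt(np.mean((r_ref - r_dist) ** 2)))
```

In a study like the one summarized above, the four free parameters would be fitted so that scores of this kind correlate with human opinion scores on a training database, and then frozen when evaluating on other databases.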

Keywords: INRF; computational modeling; high frame rate videos; image quality assessment; receptive field; video quality assessment; visual neuroscience; visual perception.

Grants and funding

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 952027 (project AdMiRe) and the Spanish Ministry of Science, grant reference PID2021-127373NB-I00. RL was supported by a Juan de la Cierva-Formación fellowship (FJC2020-044084-I) funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.