Bio-Inspired Representation Learning for Visual Attention Prediction

IEEE Trans Cybern. 2021 Jul;51(7):3562-3575. doi: 10.1109/TCYB.2019.2931735. Epub 2021 Jun 23.

Abstract

Visual attention prediction (VAP) is a significant and imperative issue in the field of computer vision. Most of the existing VAP methods are based on deep learning. However, they do not fully take advantage of the low-level contrast features while generating the visual attention map. In this article, a novel VAP method is proposed to generate the visual attention map via bio-inspired representation learning. The bio-inspired representation learning combines both low-level contrast and high-level semantic features simultaneously, which are developed by the fact that the human eye is sensitive to the patches with high contrast and objects with high semantics. The proposed method is composed of three main steps: 1) feature extraction; 2) bio-inspired representation learning; and 3) visual attention map generation. First, the high-level semantic feature is extracted from the refined VGG16, while the low-level contrast feature is extracted by the proposed contrast feature extraction block in a deep network. Second, during bio-inspired representation learning, both the extracted low-level contrast and high-level semantic features are combined by the designed densely connected block, which is proposed to concatenate various features scale by scale. Finally, the weighted-fusion layer is exploited to generate the ultimate visual attention map based on the obtained representations after bio-inspired representation learning. Extensive experiments are performed to demonstrate the effectiveness of the proposed method.