Position Fusing and Refining for Clear Salient Object Detection

IEEE Trans Neural Netw Learn Syst. 2022 Nov 4:PP. doi: 10.1109/TNNLS.2022.3213557. Online ahead of print.

Abstract

Multilevel feature fusion plays a pivotal role in salient object detection (SOD). High-level features carry rich semantic information but lack object position information, whereas low-level features contain object position information but are mixed with noise such as background clutter. Appropriately bridging the gap between low- and high-level features is therefore important for SOD. In this article, we first propose a global position embedding attention (GPEA) module to minimize the discrepancy between multilevel features. We extract position information by utilizing the semantic information in high-level features to resist the noise in low-level features. An object refine attention (ORA) module is then introduced to further refine the features used to predict saliency maps, without any additional supervision, and to heighten discriminative regions near the salient object, such as its boundaries. Moreover, we find that the saliency maps generated by previous methods contain blurry regions, and we design a pixel value (PV) loss to help the model generate saliency maps with improved clarity. Experimental results on five commonly used SOD datasets demonstrate that the proposed method is effective and outperforms state-of-the-art approaches on multiple metrics.
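
The abstract does not detail the internals of the GPEA module. The sketch below is a minimal illustration, assuming a common cross-level attention pattern in which a spatial attention map derived from high-level semantic features gates the low-level features before fusion; the class and parameter names (CrossLevelPositionGate, low_channels, high_channels, out_channels) are hypothetical and not taken from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class CrossLevelPositionGate(nn.Module):
        """Hypothetical sketch of high-level-guided fusion (not the paper's GPEA).

        A one-channel spatial attention map is computed from high-level semantic
        features and used to suppress background noise in low-level features
        before the two levels are fused.
        """

        def __init__(self, low_channels: int, high_channels: int, out_channels: int):
            super().__init__()
            # Project both levels to a common channel width.
            self.low_proj = nn.Conv2d(low_channels, out_channels, kernel_size=1)
            self.high_proj = nn.Conv2d(high_channels, out_channels, kernel_size=1)
            # Spatial (position) attention derived from high-level semantics.
            self.pos_attn = nn.Conv2d(out_channels, 1, kernel_size=3, padding=1)

        def forward(self, low_feat: torch.Tensor, high_feat: torch.Tensor) -> torch.Tensor:
            low = self.low_proj(low_feat)
            high = self.high_proj(high_feat)
            # Upsample high-level features to the low-level spatial resolution.
            high = F.interpolate(high, size=low.shape[2:], mode="bilinear", align_corners=False)
            # Position attention: where the high-level features indicate the object is.
            attn = torch.sigmoid(self.pos_attn(high))
            # Gate the noisy low-level features, then fuse by addition.
            return low * attn + high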
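
The exact formulation of the PV loss is likewise not given in the abstract. One plausible reading, sketched below under that assumption, is a polarization penalty that pushes predicted saliency values toward 0 or 1, so that it is largest for blurry mid-gray pixels and vanishes for confident predictions. The function name and the weighting factor lambda_pv are illustrative only.

    import torch


    def pixel_polarization_loss(pred: torch.Tensor) -> torch.Tensor:
        """Hypothetical clarity penalty (not necessarily the paper's PV loss).

        pred: predicted saliency map with values in [0, 1].
        Returns the mean of p * (1 - p), which peaks at p = 0.5 (blurry pixels)
        and is zero for confident 0/1 predictions.
        """
        return (pred * (1.0 - pred)).mean()


    # Usage sketch: combine with the usual supervised term, e.g. BCE against
    # the ground-truth mask.
    # total_loss = bce_loss + lambda_pv * pixel_polarization_loss(pred)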