Saliency as Pseudo-Pixel Supervision for Weakly and Semi-Supervised Semantic Segmentation

IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12341-12357. doi: 10.1109/TPAMI.2023.3273592. Epub 2023 Sep 5.

Abstract

Existing studies on semantic segmentation using image-level weak supervision have several limitations, including sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, EPS++, an improved version of Explicit Pseudo-pixel Supervision, which learns from pixel-level feedback by combining two types of weak supervision. Specifically, the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich object boundaries. We devise a joint training strategy to fully exploit the complementary relationship between these disparate sources of information. Notably, we introduce an Inconsistent Region Drop (IRD) strategy, which effectively handles errors in saliency maps while using fewer hyper-parameters than EPS. Our method obtains accurate object boundaries and discards co-occurring pixels, significantly improving the quality of pseudo-masks. Experimental results show that EPS++ effectively resolves the key challenges of semantic segmentation using weak supervision, establishing new state-of-the-art performance on three benchmark datasets in the weakly supervised semantic segmentation setting. Furthermore, we show that the proposed method can be extended to solve the semi-supervised semantic segmentation problem using image-level weak supervision. Surprisingly, the proposed model also achieves new state-of-the-art performance on two popular benchmark datasets.
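The abstract only outlines how the localization map and the saliency map are fused into pixel-level pseudo-supervision. The snippet below is a minimal illustrative sketch, not the authors' EPS++ implementation: it composes a pseudo-mask from a class activation map and a saliency map, and ignores pixels where the two sources disagree, in the spirit of the Inconsistent Region Drop idea. The function name, thresholds, and the 255 ignore-label convention are assumptions.

```python
import numpy as np

def make_pseudo_mask(cam, saliency, present_classes,
                     fg_thresh=0.5, sal_thresh=0.5):
    """Hypothetical sketch of pseudo-mask generation.

    cam:             (C, H, W) per-class localization maps in [0, 1]
    saliency:        (H, W) saliency map in [0, 1]
    present_classes: class indices given by the image-level label
    Returns an (H, W) pseudo-mask: 0 = background, class index + 1 = foreground,
    255 = inconsistent pixels dropped from training (IRD-like rule).
    """
    H, W = saliency.shape
    # Keep only the classes known to be present from the image-level label.
    cam = cam[present_classes]                     # (K, H, W)
    fg_score = cam.max(axis=0)                     # strongest class response per pixel
    fg_label = cam.argmax(axis=0)                  # winning class per pixel

    cam_fg = fg_score > fg_thresh                  # foreground according to the CAM
    sal_fg = saliency > sal_thresh                 # foreground according to saliency

    mask = np.zeros((H, W), dtype=np.int64)        # background by default
    agree_fg = cam_fg & sal_fg                     # both sources say foreground
    mask[agree_fg] = np.asarray(present_classes)[fg_label[agree_fg]] + 1

    # Pixels where the two sources disagree are ignored rather than trusted.
    inconsistent = cam_fg ^ sal_fg
    mask[inconsistent] = 255
    return mask
```

In this sketch the saliency map supplies the object boundary, the localization map supplies the class identity, and disagreement regions are excluded from the pseudo-pixel loss instead of being forced to either label.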