Delving Deeper Into Pixel Prior for Box-Supervised Semantic Segmentation

IEEE Trans Image Process. 2022;31:1406-1417. doi: 10.1109/TIP.2022.3141878. Epub 2022 Jan 25.

Abstract

Weakly supervised semantic segmentation (WSSS) based on bounding box annotations has attracted considerable recent attention and has achieved promising performance. However, most existing methods focus on generating high-quality pseudo labels for segmented objects using box indicators, but fail to fully explore and exploit the priors contained in bounding box annotations, which limits the performance of WSSS methods, especially on fine parts and object boundaries. To overcome these issues, this paper proposes a novel Pixel-as-Instance Prior (PIP) for WSSS methods by delving deeper into the pixel-level priors offered by bounding box annotations. Specifically, the proposed PIP is built on two key observations about pixels around bounding boxes. First, since objects are usually irregular in shape and fit tightly within their bounding boxes (dubbed the irregular-filling prior), each row or column of a bounding box generally contains at least one foreground pixel and at least one background pixel. Second, pixels near the bounding box tend to be highly ambiguous and more difficult to classify (dubbed the label-ambiguity prior). To implement our PIP, a constrained loss akin to multiple instance learning (MIL) and a labeling-balance loss are developed to jointly train WSSS models; they regard each pixel as a weighted positive or negative instance while exploiting the irregular-filling and label-ambiguity priors in an efficient way. Note that our PIP can be flexibly integrated with various WSSS methods, clearly improving their performance with negligible computational overhead in the training stage. Experiments are conducted on the widely used PASCAL VOC 2012 and Cityscapes benchmarks, and the results show that our PIP consistently improves the performance of various WSSS methods while achieving highly competitive results.
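The abstract describes the two priors and the MIL-like constrained loss only at a high level. As an illustration of the general idea, the sketch below shows one plausible PyTorch realization: per-row and per-column "bags" inside a box are pushed to contain at least one foreground and one background pixel (irregular-filling prior), and per-pixel weights decay toward the box edges (label-ambiguity prior). The function names (pip_mil_loss, ambiguity_weights), the log-likelihood form of the bag losses, and the linear distance-based weighting are assumptions made for illustration, not the paper's actual formulation.

```python
import torch

def pip_mil_loss(fg_prob, box, eps=1e-6):
    """MIL-style constrained loss for a single box (illustrative sketch only).

    fg_prob : (H, W) tensor of predicted foreground probabilities in [0, 1].
    box     : (x1, y1, x2, y2) integer corners of a tight bounding box.
    """
    x1, y1, x2, y2 = box
    region = fg_prob[y1:y2, x1:x2]

    # Irregular-filling prior: a tight box around an irregularly shaped object
    # implies each row/column of the box should contain at least one
    # foreground pixel (max prob -> 1) and one background pixel (min prob -> 0).
    row_max = region.max(dim=1).values
    col_max = region.max(dim=0).values
    row_min = region.min(dim=1).values
    col_min = region.min(dim=0).values

    pos_bags = torch.cat([row_max, col_max])  # should be classified foreground
    neg_bags = torch.cat([row_min, col_min])  # should be classified background
    return -(torch.log(pos_bags + eps).mean()
             + torch.log(1.0 - neg_bags + eps).mean())


def ambiguity_weights(h, w, box):
    """Per-pixel weights inside a box, decreasing toward its edges (assumed form).

    Encodes the label-ambiguity prior: pixels near the box boundary are
    ambiguous, so their contribution as positive/negative instances is reduced.
    """
    x1, y1, x2, y2 = box
    ys = torch.arange(y1, y2).float()
    xs = torch.arange(x1, x2).float()
    dist_y = torch.minimum(ys - y1, (y2 - 1) - ys)  # distance to top/bottom edge
    dist_x = torch.minimum(xs - x1, (x2 - 1) - xs)  # distance to left/right edge
    dist = torch.minimum(dist_y[:, None], dist_x[None, :])
    weights = torch.zeros(h, w)
    weights[y1:y2, x1:x2] = dist / dist.max().clamp(min=1.0)
    return weights
```

In this reading, the row/column bag loss supplies the MIL-like constraint, while the weight map would modulate a per-pixel term (e.g., the labeling-balance loss mentioned in the abstract); the exact coupling of the two losses is not specified here and would follow the paper's Section on method details.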