Segmenting Objects From Relational Visual Data

IEEE Trans Pattern Anal Mach Intell. 2022 Nov;44(11):7885-7897. doi: 10.1109/TPAMI.2021.3115815. Epub 2022 Oct 4.

Abstract

In this article, we model a set of pixel-wise object segmentation tasks - automatic video segmentation (AVS), image co-segmentation (ICS), and few-shot semantic segmentation (FSS) - under a unified view of segmenting objects from relational visual data. To this end, we propose an attentive graph neural network (AGNN) that addresses these tasks in a holistic fashion, by formulating them as a process of iterative information fusion over data graphs. It builds a fully-connected graph to efficiently represent visual data as nodes and relations between data instances as edges. The underlying relations are described by a differentiable attention mechanism, which thoroughly examines fine-grained semantic similarities between all possible location pairs in two data instances. Through parametric message passing, AGNN is able to capture knowledge from the relational visual data, enabling more accurate object discovery and segmentation. Experiments show that AGNN can automatically highlight primary foreground objects from video sequences (i.e., automatic video segmentation) and extract common objects from noisy collections of semantically related images (i.e., image co-segmentation). AGNN can even generalize to segment new categories with little annotated data (i.e., few-shot semantic segmentation). Taken together, our results demonstrate that AGNN provides a powerful tool that is applicable to a wide range of pixel-wise object pattern understanding tasks with relational visual data. Our algorithm implementations have been made publicly available at https://github.com/carrierlxk/AGNN.
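The abstract describes a fully-connected graph whose nodes are per-instance feature maps, whose edges carry attention computed over all location pairs, and whose node states are refined by iterative message passing. The following PyTorch sketch illustrates that pattern under stated assumptions; all names (AttentiveMessagePassing, n_iters, the 1x1-conv update) are illustrative placeholders, not the authors' actual API, and the paper's learned recurrent update and segmentation readout head are simplified away.

```python
# A minimal sketch of attention-based message passing over a
# fully-connected graph of feature maps, loosely following the abstract.
# Hypothetical names throughout; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveMessagePassing(nn.Module):
    def __init__(self, channels: int, n_iters: int = 3):
        super().__init__()
        self.n_iters = n_iters
        # Bilinear weight for the differentiable attention between
        # all location pairs of two node feature maps.
        self.W = nn.Parameter(torch.eye(channels))
        # Node-state update; a 1x1 conv stands in for the paper's
        # learned recurrent update, for brevity.
        self.update = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def message(self, h_i, h_j):
        # h_i, h_j: (C, H, W) node embeddings. Flatten spatial dims and
        # compute pairwise affinities e = f_i^T W f_j over all locations.
        C, H, W = h_i.shape
        fi = h_i.reshape(C, H * W)            # (C, N)
        fj = h_j.reshape(C, H * W)            # (C, N)
        e = fi.t() @ self.W @ fj              # (N, N) affinity matrix
        attn = F.softmax(e, dim=1)            # attend over j's locations
        m = (attn @ fj.t()).t().reshape(C, H, W)
        return m

    def forward(self, nodes):
        # nodes: list of (C, H, W) feature maps, one per frame/image.
        for _ in range(self.n_iters):
            new_nodes = []
            for i, h_i in enumerate(nodes):
                # Aggregate messages from all other nodes (fully connected).
                msgs = [self.message(h_i, h_j)
                        for j, h_j in enumerate(nodes) if j != i]
                m = torch.stack(msgs).mean(dim=0)
                h = self.update(torch.cat([h_i, m], dim=0).unsqueeze(0))
                new_nodes.append(h.squeeze(0))
            nodes = new_nodes
        return nodes

# Usage sketch: refine embeddings for, e.g., four video frames; the
# refined node states would then feed a segmentation readout head.
mp = AttentiveMessagePassing(channels=256)
feats = [torch.randn(256, 30, 30) for _ in range(4)]
refined = mp(feats)
```

The mean over incoming messages is one simple aggregation choice; any permutation-invariant reduction over the fully-connected neighborhood would fit the same message-passing template.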

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Neural Networks, Computer*