Meta-VOS: Learning to Adapt Online Target-Specific Segmentation

IEEE Trans Image Process. 2021;30:4760-4772. doi: 10.1109/TIP.2021.3075086. Epub 2021 May 5.

Abstract

Video object segmentation (VOS) is a fundamental but challenging problem in computer vision. To deal with large variations in target objects and background clutter, we propose an online adaptive VOS framework, named Meta-VOS, that learns to adapt target-specific segmentation. Meta-VOS builds an online adaptive learning process by exploiting expertise accumulated while searching for confidence patterns across different videos and frames, and then dynamically improves model learning in two respects: the Meta-seg learner (i.e., module updating) and the Meta-seg criterion (i.e., rule of expertise). Because our goal is to rapidly determine which patterns best represent the essential characteristics of specific targets in a video, the Meta-seg learner is introduced to adaptively learn to update the parameters and hyperparameters of the segmentation network in very few gradient descent steps. Furthermore, the Meta-seg criterion of learned expertise, which is constructed to evaluate the Meta-seg learner for the online adaptation of the segmentation network, can confidently update positive/negative patterns online under the guidance of motion cues, object appearances, and learned knowledge. Comprehensive evaluations on several benchmark datasets demonstrate the superiority of the proposed Meta-VOS over other state-of-the-art VOS methods.
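
The idea of adapting a segmentation model in very few gradient descent steps can be illustrated with a toy sketch. The abstract does not give the actual update rule, so everything below is an assumption made for illustration: a linear per-pixel classifier stands in for the segmentation network, and a vector of per-parameter step sizes stands in for the learned hyperparameters. In this sketch the initialization and step sizes are fixed constants rather than meta-learned, and the names `adapt`, `step_sizes`, `feats`, and `mask` are hypothetical, not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, feats, mask):
    """Mean binary cross-entropy of a linear per-pixel classifier."""
    p = np.clip(sigmoid(feats @ w), 1e-7, 1 - 1e-7)
    return float(-np.mean(mask * np.log(p) + (1 - mask) * np.log(1 - p)))

def bce_grad(w, feats, mask):
    """Gradient of bce_loss with respect to the weights w."""
    return feats.T @ (sigmoid(feats @ w) - mask) / len(mask)

def adapt(w_init, step_sizes, feats, mask, n_steps=3):
    """Few-step adaptation: w <- w - step_sizes * grad (element-wise)."""
    w = w_init.copy()
    for _ in range(n_steps):
        w = w - step_sizes * bce_grad(w, feats, mask)
    return w

# Synthetic stand-ins (assumptions): 200 "pixels" with 4 features each,
# and a binary foreground mask playing the role of a first-frame annotation.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 4))
true_w = np.array([2.0, -1.0, 0.5, 0.0])
mask = (feats @ true_w > 0).astype(float)

# In a meta-learning setup these two would be learned across many videos;
# here they are plain constants.
w_init = np.zeros(4)
step_sizes = np.full(4, 1.0)

w_adapted = adapt(w_init, step_sizes, feats, mask, n_steps=3)
loss_before = bce_loss(w_init, feats, mask)
loss_after = bce_loss(w_adapted, feats, mask)
print(f"loss before: {loss_before:.3f}, after 3 steps: {loss_after:.3f}")
```

Keeping a separate step size per parameter is one simple way to make an update rule itself learnable; whether Meta-VOS uses this particular form is not stated in the abstract.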