Adaptive Feature Learning for Unbiased Scene Graph Generation

IEEE Trans Image Process. 2024:33:2252-2265. doi: 10.1109/TIP.2024.3374644. Epub 2024 Mar 25.

Abstract

Scene Graph Generation (SGG) aims to detect all objects in a scene and identify their pairwise relationships. Recently, tremendous progress has been made in exploring better contextual relationship representations. Previous work mainly focuses on contextual information aggregation and applies sample-level de-biasing strategies to eliminate the preference for head predicates. However, challenges caused by indeterminate feature training remain. Overlooking the label confusion problem in feature training easily results in a messy feature distribution among the confused categories, which in turn degrades predicate prediction. To alleviate this problem, we focus on enhancing predicate representation learning. First, we propose a novel Adaptive Message Passing (AMP) network that dynamically conducts information propagation among neighbors. AMP provides discriminating representations for neighbor nodes from the perspective of de-noising and adaptive aggregation. Furthermore, we construct a feature-assisted training paradigm alongside the predicate classification branch, guiding predicate feature learning toward the corresponding feature space. Moreover, to alleviate biased prediction caused by the long-tailed class distribution and the interference of confused labels, we design a Bi-level Curriculum learning scheme (BiC). BiC treats training separately at the feature learning level and the de-biasing level, preserving discriminating representations of different predicates while resisting biased predictions. Results on multiple SGG datasets show that our proposed method, AMP-BiC, achieves superior overall performance, demonstrating its effectiveness.