Generative Reasoning Integrated Label Noise Robust Deep Image Representation Learning

Gencer Sumbul; Begum Demir

doi:10.1109/TIP.2023.3293776

IEEE Trans Image Process. 2023:32:4529-4542. doi: 10.1109/TIP.2023.3293776. Epub 2023 Aug 10.

PMID: 37440393

Abstract

Deep learning based image representation learning (IRL) methods have attracted great attention for various image understanding problems. Most of these methods require a large set of high-quality annotated training images, which can be time-consuming, complex and costly to gather. To reduce labeling costs, crowdsourced data, automatic labeling procedures or citizen science projects can be considered. However, such approaches increase the risk of including label noise in the training data. When discriminative reasoning is employed, as in most existing methods, this may result in overfitting on noisy labels. This leads to sub-optimal learning procedures, and thus to inaccurate characterization of images. To address this issue, we introduce a generative reasoning integrated label noise robust deep representation learning (GRID) approach. GRID aims to model the complementary characteristics of discriminative and generative reasoning for IRL under noisy labels. To this end, we first integrate generative reasoning into discriminative reasoning through a supervised variational autoencoder. This allows GRID to automatically detect training samples with noisy labels. Then, through our label noise robust hybrid representation learning strategy, GRID adjusts the whole learning procedure so that IRL for these samples is driven by generative reasoning and IRL for the other samples by discriminative reasoning. Our approach learns discriminative image representations while preventing interference of noisy labels during training, independently of the IRL method selected. Thus, unlike existing label noise robust methods, GRID does not depend on the type of annotation, label noise, neural network architecture, loss function or learning task, and can be directly utilized for various image understanding problems.
Experimental results show the effectiveness of the proposed GRID approach compared to the state-of-the-art methods. The code of the proposed approach is publicly available at https://github.com/gencersumbul/GRID.
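
As a rough illustration of the hybrid strategy described in the abstract, the sketch below routes each training sample to either a generative or a discriminative loss term depending on whether it has been flagged as noisy. It is not taken from the GRID code base: the function names, the per-sample noise score and the fixed threshold are all hypothetical, and the abstract does not specify how the supervised variational autoencoder detects noisy samples or how the two losses are defined.

```python
from typing import List, Sequence


def flag_noisy(noise_scores: Sequence[float], threshold: float) -> List[bool]:
    """Flag samples whose noise score exceeds a threshold.

    Assumption: some per-sample score is available (for instance from a
    supervised VAE). The thresholding rule is a placeholder, not the
    detection rule used by GRID.
    """
    return [score > threshold for score in noise_scores]


def hybrid_loss(
    disc_losses: Sequence[float],
    gen_losses: Sequence[float],
    noisy_flags: Sequence[bool],
) -> float:
    """Mean per-sample loss with per-sample routing.

    Flagged samples contribute their generative loss; all other samples
    contribute their discriminative loss.
    """
    if not (len(disc_losses) == len(gen_losses) == len(noisy_flags)):
        raise ValueError("all inputs must have the same length")
    if len(disc_losses) == 0:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for disc, gen, noisy in zip(disc_losses, gen_losses, noisy_flags):
        total += gen if noisy else disc
    return total / len(disc_losses)


# Hypothetical per-sample values for a batch of three images.
flags = flag_noisy([0.1, 0.9, 0.4], threshold=0.5)
batch_loss = hybrid_loss([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], flags)
```

In this toy batch only the second sample is flagged, so its discriminative loss is replaced by its generative loss before averaging. Because the routing acts on per-sample loss values, it does not assume a particular network architecture or loss function, which is in the spirit of the independence claimed in the abstract.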