Sample self-selection using dual teacher networks for pathological image classification with noisy labels

Comput Biol Med. 2024 May:174:108489. doi: 10.1016/j.compbiomed.2024.108489. Epub 2024 Apr 16.

Abstract

Deep neural networks (DNNs) enable advanced image processing but depend on large quantities of high-quality labeled data. The presence of noisy data significantly degrades DNN performance. In the medical field, where model accuracy is crucial and labels for pathological images are scarce and expensive to obtain, the need to handle noisy data is even more urgent. Deep networks exhibit a memorization effect: they tend to fit cleanly labeled samples first and memorize noisy labels only later in training. Early stopping is therefore highly effective for learning with noisy labels. Previous research has often concentrated on developing robust loss functions or imposing training constraints to mitigate the impact of noisy labels; however, such approaches frequently result in underfitting. Rather than trying to shield late-stage training from noisy labels, we propose using knowledge distillation to slow the learning process of the target network. In this paper, we introduce a sample self-selection strategy based on early stopping to filter out most of the noisy data. Additionally, we employ distillation training with dual teacher networks to ensure steady learning of the student network. The experimental results show that our method outperforms current state-of-the-art methods for handling noisy labels on both synthetic and real-world noisy datasets. In particular, on the real-world pathological image dataset Chaoyang, the best classification accuracy improved by 2.39%. Our method uses the model's prediction history during training to select a cleaner subset of the data and then retrains the model on that subset, significantly mitigating the impact of noisy labels on model performance.
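
The two ideas named in the abstract, selecting samples from prediction history and distilling from two teachers, can be illustrated with a minimal NumPy sketch. The abstract does not give the selection rule or the loss, so everything here is an assumption made for illustration: the function names, the agreement threshold `min_agreement`, the equal-weight averaging of the two teachers' softened outputs, and the `alpha` and `temperature` parameters are hypothetical and may differ from the paper's actual method.

```python
import numpy as np


def select_clean_samples(pred_history, noisy_labels, min_agreement=0.5):
    """Return a boolean mask of samples treated as cleanly labeled.

    pred_history: (n_epochs, n_samples) predicted class indices recorded
        during early training epochs (assumed input format).
    noisy_labels: (n_samples,) given, possibly noisy, class indices.
    A sample is kept when the fraction of epochs in which the prediction
    matched its given label is at least min_agreement (assumed rule).
    """
    pred_history = np.asarray(pred_history)
    noisy_labels = np.asarray(noisy_labels)
    agreement = (pred_history == noisy_labels[None, :]).mean(axis=0)
    return agreement >= min_agreement


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def dual_teacher_distillation_loss(student_logits, teacher_a_logits,
                                   teacher_b_logits, labels,
                                   alpha=0.5, temperature=2.0):
    """Hard-label cross-entropy plus a soft-target term from two teachers.

    The soft target is the plain average of the two teachers' softened
    distributions (an assumption, not taken from the paper).
    """
    labels = np.asarray(labels)
    soft_targets = 0.5 * (softmax(teacher_a_logits, temperature)
                          + softmax(teacher_b_logits, temperature))
    log_student_soft = np.log(softmax(student_logits, temperature) + 1e-12)
    distill = -(soft_targets * log_student_soft).sum(axis=-1).mean()

    log_student = np.log(softmax(student_logits) + 1e-12)
    hard = -log_student[np.arange(len(labels)), labels].mean()
    return alpha * hard + (1.0 - alpha) * distill


# Toy example: 3 recorded epochs, 4 samples, 3 classes.
history = np.array([[0, 1, 2, 1],
                    [0, 1, 0, 1],
                    [0, 2, 0, 1]])
given_labels = np.array([0, 1, 2, 0])
mask = select_clean_samples(history, given_labels)
print(mask.tolist())  # -> [True, True, False, False]
```

In this toy run, the first two samples agree with their given labels in at least half of the recorded epochs and are kept; the last two are filtered out. A student network would then be trained only on the kept samples, with the distillation term pulling its outputs toward the teachers' averaged predictions.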

Keywords: Deep learning; Early stopping; Knowledge distillation; Noisy label learning; Pathological image classification.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Deep Learning
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Neural Networks, Computer*