Data augmentation with improved regularisation and sampling for imbalanced blood cell image classification

Sci Rep. 2022 Oct 27;12(1):18101. doi: 10.1038/s41598-022-22882-x.

Abstract

Human blood cells undergo morphological changes as the cell cycle progresses or as storage duration increases, and classifying these changes is important for correct and effective clinical decisions. Automated classification systems help avoid subjective outcomes and are more efficient. Deep learning, and Convolutional Neural Networks in particular, has achieved state-of-the-art performance on various biomedical image classification problems. However, real-world data often suffers from class imbalance, which biases the trained classifier towards the majority classes and degrades its performance on the minority classes. This study presents an imbalanced blood cell classification method that uses Wasserstein divergence GAN, mixup and a novel nonlinear mixup for data augmentation to oversample the minority classes. We also present a minority-class-focussed sampling strategy, which allows effective representation of minority class samples produced by all three data augmentation techniques and contributes to the classification performance. The method was evaluated on two publicly available datasets of immortalised human T-lymphocyte cells and Red Blood Cells. Classification performance, evaluated using F1-score, shows that our proposed approach outperforms existing methods on the same datasets.
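
As a rough illustration of the mixup-based oversampling idea mentioned in the abstract, the sketch below applies standard (linear) mixup to random pairs of minority-class images. It is not the authors' implementation: the function name `mixup_oversample`, the Beta parameter `alpha`, and the array shapes are assumptions made for illustration, and the nonlinear mixup, Wasserstein divergence GAN and sampling strategy described in the paper are not shown.

```python
import numpy as np

def mixup_oversample(images, n_new, alpha=0.4, seed=0):
    """Create n_new synthetic samples by linearly mixing random pairs of
    images from a single minority class (standard mixup).

    Because both images in each pair share the same class, the synthetic
    sample keeps that class label. `alpha` is a hypothetical default.
    """
    rng = np.random.default_rng(seed)
    images = np.asarray(images, dtype=np.float32)
    n = images.shape[0]

    # Pick two random images for every synthetic sample.
    i = rng.integers(0, n, size=n_new)
    j = rng.integers(0, n, size=n_new)

    # Mixing coefficient lambda ~ Beta(alpha, alpha), one per new sample,
    # reshaped so it broadcasts over the image dimensions.
    lam = rng.beta(alpha, alpha, size=n_new).astype(np.float32)
    lam = lam.reshape((n_new,) + (1,) * (images.ndim - 1))

    # Convex combination of the two images.
    return lam * images[i] + (1.0 - lam) * images[j]

# Toy data standing in for 5 minority-class images of size 8x8.
minority = np.random.default_rng(1).random((5, 8, 8)).astype(np.float32)
synthetic = mixup_oversample(minority, n_new=20)
```

Each synthetic image is a convex combination of two real ones, so its pixel values stay within the range of the originals. In a real pipeline the synthetic samples would be appended to the minority class before training.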

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Blood Cells*
  • Humans
  • Neural Networks, Computer*