Childhood Leukemia Classification via Information Bottleneck Enhanced Hierarchical Multi-Instance Learning

IEEE Trans Med Imaging. 2023 Aug;42(8):2348-2359. doi: 10.1109/TMI.2023.3248559. Epub 2023 Aug 1.

Abstract

Leukemia classification relies on a detailed cytomorphological examination of Bone Marrow (BM) smears. However, existing deep-learning methods for this task face two significant limitations. First, they require large-scale datasets with expert cell-level annotations to achieve good results, and they typically generalize poorly. Second, they treat BM cytomorphological examination simply as a multi-class cell classification task and thus fail to exploit the correlations among leukemia subtypes across different hierarchies. As a result, BM cytomorphological examination, a time-consuming and repetitive process, still has to be performed manually by experienced cytologists. Recently, Multi-Instance Learning (MIL) has made considerable progress in data-efficient medical image processing, since it requires only patient-level labels, which can be extracted from clinical reports. In this paper, we propose a hierarchical MIL framework and equip it with an Information Bottleneck (IB) to tackle the above limitations. First, to handle patient-level labels, our hierarchical MIL framework uses attention-based learning to identify cells with high diagnostic value for leukemia classification at different hierarchies. Then, following the information bottleneck principle, we propose a hierarchical IB that constrains and refines the representations of the different hierarchies for better accuracy and generalization. By applying our framework to a large-scale childhood acute leukemia dataset with corresponding BM smear images and clinical reports, we show that it identifies diagnostically relevant cells without cell-level annotations and outperforms the compared methods. Furthermore, an evaluation on an independent test cohort demonstrates the high generalizability of our framework.
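To make the attention-based MIL idea in the abstract concrete, the following is a minimal sketch of generic attention pooling over one patient's bag of cell embeddings. It is not the authors' implementation: the function name `attention_mil_pool`, the parameters `V` and `w`, the tanh scoring function, and all dimensions are assumptions chosen for illustration, and neither the hierarchical structure nor the information bottleneck term is modeled here.

```python
import numpy as np


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings into one bag embedding.

    instances: (n_cells, d) array, one embedding per cell (assumed input).
    V:         (d, h) projection matrix (hypothetical learned parameter).
    w:         (h,) scoring vector (hypothetical learned parameter).
    Returns the (d,) bag embedding and the (n_cells,) attention weights.
    """
    # One unnormalized attention score per cell.
    scores = np.tanh(instances @ V) @ w
    # Numerically stable softmax over the cells in the bag.
    scores = scores - scores.max()
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()
    # Bag embedding is the attention-weighted average of cell embeddings.
    bag = weights @ instances
    return bag, weights


# Toy bag: 5 cells with 8-dimensional embeddings and random parameters.
rng = np.random.default_rng(0)
cells = rng.normal(size=(5, 8))
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)

bag_embedding, attention = attention_mil_pool(cells, V, w)
```

In a setup like this, only the bag embedding would be passed to a patient-level classifier, so training needs no cell-level labels, while the attention weights give a per-cell relevance score that can be inspected afterwards.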

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child
  • Deep Learning*
  • Humans
  • Image Processing, Computer-Assisted
  • Leukemia* / diagnostic imaging
  • Machine Learning