Deep ensemble learning enables highly accurate classification of stored red blood cell morphology

Sci Rep. 2023 Feb 23;13(1):3152. doi: 10.1038/s41598-023-30214-w.

Abstract

Changes in red blood cell (RBC) morphology distribution have emerged as a quantitative biomarker for the degradation of RBC functional properties during hypothermic storage. Previously published automated methods for classifying the morphology of stored RBCs often had insufficient accuracy and relied on proprietary code and datasets, making them difficult to use in many research and clinical applications. Here we describe the development and validation of a highly accurate open-source RBC morphology classification pipeline based on ensemble deep learning (DL). The DL-enabled pipeline utilized adaptive thresholding or semantic segmentation for RBC identification, a deep ensemble of four convolutional neural networks (CNNs) to classify RBC morphology, and Kalman filtering with Hungarian assignment for tracking changes in the morphology of individual RBCs over time. The ensembled CNNs were trained and evaluated on thousands of individual RBCs from two open-access datasets previously collected to quantify the morphological heterogeneity and washing-induced shape recovery of stored RBCs. Confusion matrices and reliability diagrams demonstrated under-confidence of the constituent models and an accuracy of about 98% for the deep ensemble. This high accuracy allowed the CNN ensemble to uncover insights that our previously published studies had missed. Re-analysis of the datasets yielded much more accurate distributions of the effective diameters of stored RBCs at each stage of morphological degradation (discocyte: 7.821 ± 0.429 µm, echinocyte 1: 7.800 ± 0.581 µm, echinocyte 2: 7.304 ± 0.567 µm, echinocyte 3: 6.433 ± 0.490 µm, sphero-echinocyte: 5.963 ± 0.348 µm, spherocyte: 5.904 ± 0.292 µm, stomatocyte: 7.080 ± 0.522 µm). The effective diameter distributions were significantly different across all morphologies, with considerable effect sizes for non-neighboring classes.
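As an illustration of how a deep ensemble of four classifiers might combine its members' outputs, the following minimal sketch averages per-model softmax probabilities (soft voting) over the seven morphology classes named above. The abstract does not state how the CNN outputs are combined, so soft voting is an assumption here, and the names `softmax`, `ensemble_predict`, and the example logits are hypothetical.

```python
import math

# The seven morphology classes listed in the abstract.
CLASSES = [
    "discocyte", "echinocyte 1", "echinocyte 2", "echinocyte 3",
    "sphero-echinocyte", "spherocyte", "stomatocyte",
]

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(member_logits):
    """Average the members' class probabilities and return (label, mean_probs).

    Assumption: soft voting; the published pipeline may combine models differently.
    """
    probs = [softmax(logits) for logits in member_logits]
    n_models = len(probs)
    mean = [sum(p[k] for p in probs) / n_models for k in range(len(CLASSES))]
    best = max(range(len(mean)), key=mean.__getitem__)
    return CLASSES[best], mean

# Hypothetical logits from four models for one cell image.
example_logits = [
    [0, 0, 3, 1, 0, 0, 0],
    [0, 0, 2, 2.5, 0, 0, 0],
    [0, 0, 4, 0, 0, 0, 0],
    [0, 0, 1, 1.2, 0, 0, 0],
]
label, mean_probs = ensemble_predict(example_logits)
print(label)
```

Averaging probabilities rather than taking a majority vote keeps each member's confidence in play, which is one reason ensembles tend to be better calibrated than their constituent models.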
A combination of morphology classification with cell tracking enabled the discovery of a relatively rare and previously overlooked shape recovery of some sphero-echinocytes to early-stage echinocytes after washing with 1% human serum albumin solution. Finally, the datasets and code have been made freely available online to enable replication, further improvement, and adaptation of our work for other applications.
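The frame-to-frame matching step behind such cell tracking can be sketched as a minimum-cost assignment between predicted track positions and newly detected cell centroids. This is a simplified illustration, not the authors' code: it assumes the predicted positions have already been produced (for example by a Kalman filter's predict step), uses Euclidean distance as the cost, and finds the optimal matching by exhaustive search, which gives the same minimum-cost result the Hungarian algorithm would for small inputs but scales far worse. The function `match_cells` and the gating threshold `max_dist` are hypothetical.

```python
import math
from itertools import permutations

def match_cells(predicted, detected, max_dist=5.0):
    """Match predicted track positions to detected centroids at minimum total distance.

    predicted, detected: lists of (x, y) tuples.
    Returns a sorted list of (track_index, detection_index) pairs; pairs farther
    apart than max_dist are discarded after the optimal matching is found.
    Exhaustive search stands in for the Hungarian algorithm (small inputs only).
    """
    # Iterate over the shorter list so every element of it gets a partner.
    swap = len(predicted) > len(detected)
    rows, cols = (detected, predicted) if swap else (predicted, detected)

    best_cost, best_perm = math.inf, ()
    for perm in permutations(range(len(cols)), len(rows)):
        cost = sum(math.dist(rows[i], cols[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm

    pairs = []
    for i, j in enumerate(best_perm):
        if math.dist(rows[i], cols[j]) <= max_dist:  # gate implausible jumps
            pairs.append((j, i) if swap else (i, j))
    return sorted(pairs)

# Hypothetical positions in pixels: two cells move slightly, one has no plausible match.
predicted = [(10, 10), (30, 12), (50, 40)]
detected = [(31, 13), (11, 9), (80, 80)]
print(match_cells(predicted, detected))
```

Once each cell keeps a stable identity across frames, its predicted morphology class can be compared before and after washing, which is the kind of per-cell comparison needed to detect a shape recovery in individual cells.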

MeSH terms

  • Erythrocytes*
  • Erythrocytes, Abnormal*
  • Hematologic Tests
  • Humans
  • Machine Learning
  • Reproducibility of Results