Bayesian uncertainty estimation for detection of long-tailed and unseen conditions in medical images

J Med Imaging (Bellingham). 2023 Sep;10(5):054501. doi: 10.1117/1.JMI.10.5.054501. Epub 2023 Oct 9.

Abstract

Purpose: Deep supervised learning provides an effective approach for developing robust models for various computer-aided diagnosis tasks. However, it often rests on the assumption that sample frequencies across the classes of the training dataset are similar or balanced. In real-world medical data, positive-class samples often occur too infrequently to satisfy this assumption. Thus, there is an unmet need for deep-learning systems that can automatically identify and adapt to the real-world conditions of imbalanced data.

Approach: We propose a deep Bayesian ensemble learning framework to address the representation learning problem of long-tailed and out-of-distribution (OOD) samples when training on medical images. By estimating the relative uncertainties of the input data, our framework can adapt to imbalanced data for learning generalizable classifiers. We trained and tested our framework on four public medical imaging datasets with various imbalance ratios and imaging modalities across three different learning tasks: semantic medical image segmentation, OOD detection, and in-domain generalization. We compared the performance of our framework with that of state-of-the-art comparator methods.
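The uncertainty estimation underlying this approach can be illustrated with test-time Monte-Carlo dropout: dropout is kept active at inference, several stochastic forward passes are averaged, and the predictive entropy of the averaged probabilities serves as an uncertainty score. This is a minimal sketch only; the toy model, sample count, and noise scale below are illustrative assumptions, not the paper's architecture.

```python
import math
import random

def mc_dropout_predict(forward_pass, x, n_samples=50):
    """Average n_samples stochastic forward passes and return the
    mean class probabilities plus their predictive entropy."""
    probs = [forward_pass(x) for _ in range(n_samples)]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / n_samples for c in range(n_classes)]
    # Predictive entropy of the averaged distribution: high entropy
    # flags inputs the stochastic ensemble disagrees or is unsure about.
    entropy = -sum(p * math.log(p + 1e-12) for p in mean)
    return mean, entropy

def toy_forward(x):
    """Toy two-class stochastic classifier standing in for a network
    with dropout left on at test time (hypothetical, for illustration)."""
    logits = [x + random.gauss(0.0, 0.5), -x + random.gauss(0.0, 0.5)]
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

random.seed(0)
confident_mean, confident_u = mc_dropout_predict(toy_forward, 3.0)
ambiguous_mean, ambiguous_u = mc_dropout_predict(toy_forward, 0.0)
# Ambiguous inputs receive a higher predictive entropy than clear-cut ones.
```

In a framework like the one described, such an uncertainty score can be thresholded to flag long-tailed or OOD inputs rather than forcing a confident in-distribution prediction.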

Results: Our framework significantly outperformed the comparator models across all performance metrics in the semantic segmentation of high-resolution CT and MR images (pairwise t-test: p<0.01) as well as in the detection of OOD samples (p<0.01), showing a marked improvement in handling the associated long-tailed data distributions. The in-domain generalization results also indicated that our framework can enhance the prediction of retinal glaucoma, supporting clinical decision-making.

Conclusions: Training the proposed deep Bayesian ensemble learning framework with dynamic Monte-Carlo dropout and a combination of losses yielded the best generalization to unseen samples from imbalanced medical imaging datasets across the different learning tasks.
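The abstract does not name the losses being combined. Purely as an illustration, a weighted sum of cross-entropy and focal loss is a common recipe for long-tailed data, since the focal term down-weights easy, well-classified examples; the gamma, alpha, and weight values below are hypothetical defaults, not the paper's configuration.

```python
import math

def focal_loss(p_true, gamma=2.0, alpha=0.25):
    """Focal loss on the probability assigned to the true class.
    (1 - p_true)**gamma shrinks the penalty on easy examples so
    rare, hard examples dominate the gradient (illustrative choice)."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true + 1e-12)

def combined_loss(p_true, w_ce=0.5, w_focal=0.5):
    """Hypothetical weighted combination of cross-entropy and focal loss."""
    ce = -math.log(p_true + 1e-12)
    return w_ce * ce + w_focal * focal_loss(p_true)
```

Under this scheme, a confidently correct prediction (p_true near 1) incurs almost no focal penalty, while a misclassified rare-class sample (p_true near 0) is penalized heavily, which is the behavior one wants when class frequencies are skewed.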

Keywords: deep Bayesian modeling; long-tailed distribution learning; out-of-distribution learning; uncertainty estimation.