An uncertainty-aware self-training framework with consistency regularization for the multilabel classification of common computed tomography signs in lung nodules

Quant Imaging Med Surg. 2023 Sep 1;13(9):5536-5554. doi: 10.21037/qims-23-40. Epub 2023 Jul 3.

Abstract

Background: Computed tomography (CT) signs of lung nodules are important indicators of nodule malignancy, and accurate automatic classification of these signs can help doctors improve their diagnostic efficiency. However, few studies have targeted the multilabel classification (MLC) of nodule signs, and the difficulty of obtaining labeled data further restricts this avenue of research. To address these problems, we propose a multilabel automatic classification system for nodule signs, which consists of a 3-dimensional (3D) convolutional neural network (CNN) and an efficient new semi-supervised learning (SSL) framework.

Methods: Two datasets were used in our experiments: Lung Nodule Analysis 16 (LUNA16), a public dataset for lung nodule classification, and a private dataset of nodule signs. The private dataset contains 641 nodules, 454 of which were annotated with 6 important signs by radiologists. Our classification system consists of 2 main parts: a 3D CNN model and an SSL method called uncertainty-aware self-training framework with consistency regularization (USC). In this system, supervised training is performed on the labeled data while, simultaneously, an uncertainty-and-confidence-based strategy selects pseudo-labeled samples for unsupervised training, so that the model is optimized jointly on both.
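The uncertainty-and-confidence-based selection described above can be illustrated with a minimal sketch. The abstract does not specify how uncertainty is estimated, so everything here is an assumption made for illustration: uncertainty is taken as the standard deviation of a sign's predicted probability across several stochastic forward passes, confidence as the mean probability, and the function name `select_pseudo_labels` and its thresholds (`conf_pos`, `conf_neg`, `max_std`) are hypothetical rather than the authors' settings.

```python
from statistics import mean, pstdev


def select_pseudo_labels(passes, conf_pos=0.9, conf_neg=0.1, max_std=0.05):
    """Pick per-sign pseudo-labels for one unlabeled nodule.

    passes: list of T stochastic predictions, each a list of per-sign
            probabilities (assumed to come from e.g. dropout at inference).
    Returns (labels, mask): labels[j] is the 0/1 pseudo-label for sign j,
    mask[j] is True only if that pseudo-label is both confident and certain.
    All thresholds are illustrative assumptions.
    """
    labels, mask = [], []
    for j in range(len(passes[0])):
        probs = [p[j] for p in passes]
        mu = mean(probs)        # confidence: mean predicted probability
        sigma = pstdev(probs)   # uncertainty: spread across the passes
        confident = mu >= conf_pos or mu <= conf_neg
        certain = sigma <= max_std
        labels.append(1 if mu >= 0.5 else 0)
        mask.append(confident and certain)
    return labels, mask


# Three stochastic passes over one nodule with four hypothetical signs.
passes = [
    [0.95, 0.40, 0.03, 0.99],
    [0.97, 0.70, 0.05, 0.75],
    [0.96, 0.30, 0.04, 0.99],
]
labels, mask = select_pseudo_labels(passes)
print(labels)  # [1, 0, 0, 1]
print(mask)    # [True, False, True, False]
```

In this toy case the fourth sign has a high mean probability but is rejected because its predictions disagree across passes, which is the situation an uncertainty criterion is meant to catch. Only the masked-in signs would contribute to the unsupervised loss.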

Results: For the MLC of nodule signs, our proposed 3D CNN achieved satisfactory results, with a mean average precision (mAP) of 0.870 and a mean area under the curve (AUC) of 0.782. In the semi-supervised experiments, compared with supervised learning, our proposed SSL method increased the mAP by 7.6 percentage points (from 0.730 to 0.806) and the mean AUC by 8.1 percentage points (from 0.631 to 0.712); it thus made efficient use of the unlabeled data and achieved a larger performance improvement than recent advanced methods.
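For readers unfamiliar with the two reported metrics, the following is a small sketch of how mAP and mean AUC are commonly computed for a multilabel task: each metric is evaluated per sign and then averaged over signs. It assumes this macro-averaging convention, which the abstract does not spell out, and the toy labels and scores are invented for illustration, not taken from the study.

```python
def average_precision(y_true, scores):
    """Mean of precision@k taken at the rank of each positive sample."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def roc_auc(y_true, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_average(metric, Y_true, Y_score):
    """Apply a per-label metric to every column and average the results."""
    n_labels = len(Y_true[0])
    per_label = [
        metric([row[j] for row in Y_true], [row[j] for row in Y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy example: 4 nodules, 2 signs (values are made up).
Y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
Y_score = [[0.9, 0.2], [0.8, 0.7], [0.6, 0.4], [0.1, 0.3]]

m_ap = macro_average(average_precision, Y_true, Y_score)
m_auc = macro_average(roc_auc, Y_true, Y_score)
print(round(m_ap, 4), round(m_auc, 4))  # 0.9167 0.875
```

Because both metrics are computed on the raw scores rather than thresholded predictions, they can be compared across models without choosing a decision threshold per sign.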

Conclusions: We achieved effective MLC of lung nodule signs with our proposed 3D CNN. Our proposed SSL method also offers an efficient solution to the shortage of labeled data that may arise in MLC tasks on 3D medical images.

Keywords: 3D convolutional neural networks; Computed tomography signs; computed tomography (CT); multilabel classification (MLC); semi-supervised learning (SSL).