An Anthropomorphic Diagnosis System of Pulmonary Nodules using Weak Annotation-Based Deep Learning

medRxiv [Preprint]. 2024 May 5:2024.05.03.24306828. doi: 10.1101/2024.05.03.24306828.

Abstract

Purpose: To develop an anthropomorphic diagnosis system for pulmonary nodules (PNs) based on deep learning (DL) that is trained on weakly annotated data and performs comparably to diagnosis systems trained on full annotations.

Methods: The proposed system uses DL models to classify PNs as benign or malignant from weak annotations, which eliminates the need for time-consuming, labor-intensive manual annotation of PNs. Moreover, the PN classification networks, augmented with handcrafted shape features obtained through the ball-scale transform technique, can differentiate PNs with diverse labels, including pure ground-glass opacities, part-solid nodules, and solid nodules.

Results: The experiments were conducted on two lung CT datasets: (1) the public LIDC-IDRI dataset with 1,018 subjects and (2) an in-house dataset with 2,740 subjects. Under 5-fold cross-validation on both datasets, the system achieved the following results: (1) an area under the curve (AUC) of 0.938 for PN localization and an AUC of 0.912 for PN differential diagnosis on the LIDC-IDRI dataset of 814 testing cases, and (2) an AUC of 0.943 for PN localization and an AUC of 0.815 for PN differential diagnosis on the in-house dataset of 822 testing cases. These results demonstrate performance comparable to that of full annotation-based diagnosis systems.
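As an illustration of the metric reported above, the following is a minimal sketch of how an AUC can be computed and summarized across cross-validation folds. It is not the authors' evaluation code. The function names `roc_auc` and `cross_validated_auc` are hypothetical, and averaging per-fold AUCs is an assumption, since the abstract does not say whether fold results were averaged or pooled.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case (label 1)
    receives a higher score than a random negative case (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cross_validated_auc(
    folds: Sequence[Tuple[Sequence[int], Sequence[float]]]
) -> Tuple[float, List[float]]:
    """Return (mean AUC, per-fold AUCs) for held-out (labels, scores) pairs.
    Assumption: folds are summarized by a simple unweighted mean."""
    per_fold = [roc_auc(y, s) for y, s in folds]
    return sum(per_fold) / len(per_fold), per_fold


if __name__ == "__main__":
    # Toy held-out predictions for one fold: 1 = malignant, 0 = benign.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With 5-fold cross-validation, `folds` would hold five such pairs, one per held-out fold.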

Conclusions: Our system can efficiently localize and differentially diagnose PNs even in resource-limited environments. It remains robust across grade and morphology sub-groups despite variation in nodule size, shape, and texture, indicating its potential for future clinical translation.

Summary: An anthropomorphic diagnosis system for pulmonary nodules (PNs) based on deep learning and weak annotation was found to perform comparably to diagnosis systems trained on fully annotated datasets, while significantly reducing the time and cost of annotation.

Key points:

  • A fully automatic system for the diagnosis of PNs in CT scans, using a suitable deep learning model and weak annotations, was developed. It achieves performance comparable to that of full-annotation based deep learning models (AUC = 0.938 for PN localization, AUC = 0.912 for PN differential diagnosis) while reducing expert annotation time by roughly 30% to 80%.
  • Integrating hand-crafted features acquired from human experts (natural intelligence) into the deep learning networks, and fusing the classification results of multi-scale networks, can efficiently improve PN classification performance across different nodule diameters and sub-groups.
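The fusion of multi-scale classification results mentioned in the key points can be pictured with a simple late-fusion sketch. The abstract does not state the fusion rule, so the weighted average below is an assumption chosen for illustration, and `fuse_probabilities` is a hypothetical helper rather than part of the described system.

```python
from typing import Optional, Sequence


def fuse_probabilities(
    probs: Sequence[float], weights: Optional[Sequence[float]] = None
) -> float:
    """Combine per-scale malignancy probabilities into one score.
    Assumption: a weighted average; the actual fusion rule is not given."""
    if not probs:
        raise ValueError("need at least one probability")
    if weights is None:
        weights = [1.0] * len(probs)  # equal weight for every scale
    if len(weights) != len(probs):
        raise ValueError("weights and probs must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probs)) / total


if __name__ == "__main__":
    # Hypothetical outputs of three networks run on patches of different
    # sizes around the same nodule.
    print(fuse_probabilities([0.2, 0.4, 0.9]))
```

Averaging is only one option; majority voting or a learned combiner would fit the same description.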

Publication types

  • Preprint