Diagnosing and grading gastric atrophy and intestinal metaplasia using semi-supervised deep learning on pathological images: development and validation study

Gastric Cancer. 2024 Mar;27(2):343-354. doi: 10.1007/s10120-023-01451-9. Epub 2023 Dec 14.

Abstract

Objective: Patients with gastric atrophy and intestinal metaplasia (IM) are at risk for gastric cancer and therefore need accurate risk assessment. We aimed to establish and validate a diagnostic approach for gastric biopsy specimens that combines deep learning with the operative link for gastritis assessment (OLGA) and the operative link for gastric intestinal metaplasia assessment (OLGIM) to classify individual gastric cancer risk.

Methods: In this study, we prospectively enrolled 545 patients suspected of having atrophic gastritis during endoscopy at 13 tertiary hospitals between December 22, 2017, and September 25, 2020, yielding a total of 2725 whole-slide images (WSIs). Patients were randomly divided into a training set (n = 349), an internal validation set (n = 87), and an external validation set (n = 109). Sixty patients from the external validation set were randomly selected and divided into two groups for an observer study, one assessed with the assistance of algorithm results and the other without. We proposed a semi-supervised deep learning algorithm to diagnose and grade IM and atrophy, and we compared it with the assessments of 10 pathologists. The model's performance was evaluated based on the area under the curve (AUC), sensitivity, specificity, and weighted kappa value.
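The four evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the study's code: the function names are hypothetical, the kappa uses linear weights (the abstract does not state which weighting scheme was used), and AUC is computed by the pairwise-comparison definition rather than with any particular library.

```python
from collections import Counter


def sensitivity_specificity(y_true, y_pred):
    """Binary labels: 1 = lesion present, 0 = lesion absent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Probability that a random positive case scores higher than a
    random negative case; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def weighted_kappa(rater_a, rater_b, n_grades):
    """Linear-weighted Cohen's kappa for ordinal grades 0..n_grades-1.
    Assumption: linear weights; quadratic weights are also common."""
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))
    marg_a = Counter(rater_a)
    marg_b = Counter(rater_b)
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            weight = abs(i - j) / (n_grades - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * marg_a[i] * marg_b[j] / (n * n)
    return 1.0 - observed_disagreement / expected_disagreement


# Made-up toy data for illustration only; not taken from the study.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_calls = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1, 0.2, 0.4, 0.6]

sens, spec = sensitivity_specificity(toy_truth, toy_calls)
toy_auc = auc(toy_truth, toy_scores)
toy_kappa = weighted_kappa([0, 1, 2, 3], [0, 1, 2, 2], n_grades=4)
```

Sensitivity, specificity, and AUC treat the diagnosis as a binary decision, whereas weighted kappa penalizes disagreements between ordinal grades in proportion to how far apart the grades are, which is why it suits graded assessments.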

Results: The algorithm, named GasMIL, was established and demonstrated encouraging performance in diagnosing IM (AUC 0.884, 95% CI 0.862-0.902) and atrophy (AUC 0.877, 95% CI 0.855-0.897) in the external validation set. In the observer study, GasMIL achieved 80% sensitivity, 85% specificity, a weighted kappa value of 0.61, and an AUC of 0.953 in diagnosing atrophy, surpassing the performance of all 10 pathologists. Ranked against the 10 pathologists, GasMIL's AUC placed second in OLGA (0.729, 95% CI 0.625-0.833) and fifth in OLGIM (0.792, 95% CI 0.688-0.896). With the assistance of GasMIL, pathologists demonstrated improved AUC (p = 0.013), sensitivity (p = 0.014), and weighted kappa (p = 0.016) in diagnosing IM, and improved specificity (p = 0.007) in diagnosing atrophy, compared to pathologists working alone.

Conclusion: Compared with pathologists, GasMIL showed the best overall performance in diagnosing IM and atrophy, and its assistance significantly improved the pathologists' diagnostic performance.

Keywords: Atrophic gastritis; Diagnose; Semi-supervised deep learning; The operative link for gastric intestinal metaplasia assessment; The operative link for gastritis assessment.

Publication types

  • Randomized Controlled Trial

MeSH terms

  • Atrophy
  • Biopsy / methods
  • Deep Learning*
  • Gastritis, Atrophic* / diagnosis
  • Gastritis, Atrophic* / pathology
  • Gastroscopy / methods
  • Humans
  • Metaplasia / diagnostic imaging
  • Risk Factors
  • Stomach Neoplasms* / diagnosis
  • Stomach Neoplasms* / pathology