A deep-learning based system using multi-modal data for diagnosing gastric neoplasms in real-time (with video)

Gastric Cancer. 2023 Mar;26(2):275-285. doi: 10.1007/s10120-022-01358-x. Epub 2022 Dec 15.

Abstract

Background: White light (WL) and weak-magnifying (WM) endoscopy are both important methods for diagnosing gastric neoplasms. This study constructed a deep-learning system, ENDOANGEL-MM (multi-modal), for real-time diagnosis of gastric neoplasms using WL and WM data.

Methods: WL and WM images of the same lesion were combined into image-pairs. A total of 4201 images, 7436 image-pairs, and 162 videos were used for model construction and validation. Five models (Models 1-5) were constructed: two single-modal models (WL, WM) and three multi-modal models (data fusion at the task level, feature level, and input level). The models were tested at three levels: images, videos, and prospective patients. The best model was selected for constructing ENDOANGEL-MM. We compared the performance of the models with that of endoscopists and conducted a diagnostic study to explore how much ENDOANGEL-MM assists endoscopists.
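
The three fusion levels named above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy linear `encoder` and `classifier`, the 8x8 image size, the random weights, and the averaging rule used for task-level fusion are all assumptions chosen only to show where the two modalities are merged in each design.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C, D = 8, 8, 3, 16  # hypothetical image size and feature width


def encoder(x, w):
    """Toy feature extractor: flatten the image, one linear layer, ReLU."""
    return np.maximum(x.reshape(-1) @ w, 0.0)


def classifier(f, w):
    """Toy classifier head: linear layer plus sigmoid, read as P(neoplasm)."""
    return float(1.0 / (1.0 + np.exp(-(f @ w))))


# Randomly initialised weights stand in for trained networks (assumption).
w_enc_wl = rng.normal(size=(H * W * C, D)) * 0.01      # WL-only encoder
w_enc_wm = rng.normal(size=(H * W * C, D)) * 0.01      # WM-only encoder
w_enc_in = rng.normal(size=(H * W * 2 * C, D)) * 0.01  # encoder for stacked input
w_cls_wl = rng.normal(size=D)       # head of the WL single-modal model
w_cls_wm = rng.normal(size=D)       # head of the WM single-modal model
w_cls_in = rng.normal(size=D)       # head of the input-level model
w_cls_feat = rng.normal(size=2 * D)  # head over concatenated features


def input_level_fusion(wl, wm):
    """Stack the two images along the channel axis, then run one network."""
    stacked = np.concatenate([wl, wm], axis=-1)  # shape (H, W, 2C)
    return classifier(encoder(stacked, w_enc_in), w_cls_in)


def feature_level_fusion(wl, wm):
    """Encode each modality separately, concatenate features, classify once."""
    fused = np.concatenate([encoder(wl, w_enc_wl), encoder(wm, w_enc_wm)])
    return classifier(fused, w_cls_feat)


def task_level_fusion(wl, wm):
    """Run two complete single-modal models and combine their predictions."""
    p_wl = classifier(encoder(wl, w_enc_wl), w_cls_wl)
    p_wm = classifier(encoder(wm, w_enc_wm), w_cls_wm)
    return 0.5 * (p_wl + p_wm)  # simple averaging, chosen for illustration


# One hypothetical WL/WM image-pair of the same lesion.
wl_image = rng.random((H, W, C))
wm_image = rng.random((H, W, C))

for fuse in (input_level_fusion, feature_level_fusion, task_level_fusion):
    print(fuse.__name__, round(fuse(wl_image, wm_image), 3))
```

In this sketch the only difference between the three designs is the point at which WL and WM information meets: before any processing (input level), after separate feature extraction (feature level), or after each modality has produced its own prediction (task level).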

Results: Model 4 (ENDOANGEL-MM) showed the best performance among the five models. Model 2 was the better of the two single-modal models. The accuracy of ENDOANGEL-MM was higher than that of Model 2 in still images, real-time videos, and prospective patients (86.54 vs 78.85%, P = 0.134; 90.00 vs 85.00%, P = 0.179; 93.55 vs 70.97%, P < 0.001). Model 2 and ENDOANGEL-MM significantly outperformed endoscopists on WM data (85.00 vs 71.67%, P = 0.002) and multi-modal data (90.00 vs 76.17%, P = 0.002), respectively. With the assistance of ENDOANGEL-MM, the accuracy of non-experts improved significantly (85.75 vs 70.75%, P = 0.020) and did not differ significantly from that of experts (85.75 vs 89.00%, P = 0.159).

Conclusions: The multi-modal model constructed by feature-level fusion showed the best performance. ENDOANGEL-MM identified gastric neoplasms with good accuracy and has potential for use in real clinical practice.

Keywords: Deep learning; Gastric neoplasms; Multi-modal fusion; Weak magnification endoscopy.

Publication types

  • Video-Audio Media

MeSH terms

  • Deep Learning*
  • Endoscopy, Gastrointestinal
  • Humans
  • Prospective Studies
  • Stomach Neoplasms* / pathology