Deep learning or radiomics based on CT for predicting the response of gastric cancer to neoadjuvant chemotherapy: a meta-analysis and systematic review

Front Oncol. 2024 Mar 27:14:1363812. doi: 10.3389/fonc.2024.1363812. eCollection 2024.

Abstract

Background: Artificial intelligence (AI) models, clinical models (CM), and the integrated model (IM) are utilized to evaluate the response to neoadjuvant chemotherapy (NACT) in patients diagnosed with gastric cancer.

Objective: The objective is to identify the diagnostic test of the AI model and to compare the accuracy of AI, CM, and IM through a comprehensive summary of head-to-head comparative studies.

Methods: PubMed, Web of Science, Cochrane Library, and Embase were systematically searched until September 5, 2023, to compile English language studies without regional restrictions. The quality of the included studies was evaluated using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) criteria. Forest plots were utilized to illustrate the findings of diagnostic accuracy, while Hierarchical Summary Receiver Operating Characteristic curves were generated to estimate sensitivity (SEN) and specificity (SPE). Meta-regression was applied to analyze heterogeneity across the studies. To assess the presence of publication bias, Deeks' funnel plot and an asymmetry test were employed.

Results: A total of 9 studies, comprising 3313 patients, were included for the AI model, with 7 head-to-head comparative studies involving 2699 patients. Across the 9 studies, the pooled SEN for the AI model was 0.75 (95% confidence interval (CI): 0.66, 0.82), and SPE was 0.77 (95% CI: 0.69, 0.84). Meta-regression was conducted, revealing that the cut-off value, approach to predicting response, and gold standard might be sources of heterogeneity. In the head-to-head comparative studies, the pooled SEN for AI was 0.77 (95% CI: 0.69, 0.84) with SPE at 0.79 (95% CI: 0.70, 0.85). For CM, the pooled SEN was 0.67 (95% CI: 0.57, 0.77) with SPE at 0.59 (95% CI: 0.54, 0.64), while for IM, the pooled SEN was 0.83 (95% CI: 0.79, 0.86) with SPE at 0.69 (95% CI: 0.56, 0.79). Notably, there was no statistical difference, except that IM exhibited higher SEN than AI, while maintaining a similar level of SPE in pairwise comparisons. In the Receiver Operating Characteristic analysis subgroup, the CT-based Deep Learning (DL) subgroup, and the National Comprehensive Cancer Network (NCCN) guideline subgroup, the AI model exhibited higher SEN but lower SPE compared to the IM. Conversely, in the training cohort subgroup and the internal validation cohort subgroup, the AI model demonstrated lower SEN but higher SPE than the IM. The subgroup analysis underscored that factors such as the number of cohorts, cohort type, cut-off value, approach to predicting response, and choice of gold standard could impact the reliability and robustness of the results.

Conclusion: AI has demonstrated its viability as a tool for predicting the response of GC patients to NACT Furthermore, CT-based DL model in AI was sensitive to extract tumor features and predict the response. The results of subgroup analysis also supported the above conclusions. Large-scale rigorously designed diagnostic accuracy studies and head-to-head comparative studies are anticipated.

Systematic review registration: PROSPERO, CRD42022377030.

Keywords: artificial intelligence; deep learning; gastric cancer; meta-analysis; neoadjuvant chemotherapy; radiomics.

Publication types

  • Systematic Review

Grants and funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.