Identification of prognostic signatures in remnant gastric cancer through an interpretable risk model based on machine learning: a multicenter cohort study

BMC Cancer. 2024 Apr 30;24(1):547. doi: 10.1186/s12885-024-12303-9.

Abstract

Objective: This study aimed to develop an individualized survival prediction model for remnant gastric cancer (RGC) based on multiple machine learning (ML) algorithms.

Methods: Clinicopathologic data from 286 patients with RGC who underwent surgery (radical or palliative resection) were collected from a multi-institution database and analyzed retrospectively. Patients were randomly allocated to a training cohort (80%) and a test cohort (20%). Nine commonly used ML methods were employed to construct survival prediction models. Algorithm performance was assessed using accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC), confusion matrices, five-fold cross-validation, decision curve analysis (DCA), and calibration curves. The best model was selected through verification and validation and was interpreted using the SHapley Additive exPlanations (SHAP) approach.
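As an illustration of the evaluation steps named in the Methods, the following is a minimal sketch, not the authors' code. It uses only the Python standard library to perform a random 80/20 split and to derive accuracy, precision, recall, and F1-score from confusion-matrix counts. The function names, the seed, the rounding of the cut point, and the toy labels are all assumptions; the abstract does not report exact cohort sizes or software.

```python
import random


def split_indices(n, train_frac=0.8, seed=0):
    """Randomly split indices 0..n-1 into train and test lists.

    Assumption: the cut point is rounded to the nearest integer;
    the abstract only states an 80%/20% random allocation.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and the derived metrics.

    Labels are 0/1; 1 is treated as the positive class (hypothetical
    coding, e.g. 1 = event within five years).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


# Toy usage: 286 indices stand in for the 286 patients in the cohort.
train_idx, test_idx = split_indices(286)
toy_metrics = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

In practice these quantities would be computed on the held-out test cohort and averaged across the five cross-validation folds; a library such as scikit-learn offers equivalent utilities.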

Results: Compared with traditional methods, the ML-based RGC survival prediction models performed well. All models except the decision tree achieved a mean ROC AUC above 0.7. DCA suggested that the developed models have the potential to support clinical decision-making and thereby improve patient outcomes. The calibration curves showed good predictive performance for all models except the decision tree. CatBoost-based modeling with SHAP analysis indicated that five-year survival probability was significantly influenced by the lymph node ratio (LNR), T stage, tumor size, resection margins, perineural invasion, and distant metastasis.
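To make the DCA result more concrete, the sketch below computes the net benefit that a decision curve plots at a single threshold probability, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. It is an illustrative example under stated assumptions, not the authors' implementation; the function names and toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients with predicted risk >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): outcome labels and predicted risks.
toy_outcomes = [1, 0, 1, 0]
toy_risks = [0.9, 0.2, 0.6, 0.7]
nb_model = net_benefit(toy_outcomes, toy_risks, 0.5)
nb_all = net_benefit_treat_all(toy_outcomes, 0.5)
```

A model adds clinical value at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero. A full decision curve repeats this calculation across a range of thresholds.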

Conclusions: This study established ML-based models for predicting five-year survival probability in patients with RGC; the models showed high accuracy and clinical applicability.

Keywords: Interpretable; Machine learning; Multicenter; Prediction model; Prognosis; Remnant gastric cancer.

Publication types

  • Multicenter Study

MeSH terms

  • Aged
  • Algorithms
  • Female
  • Gastrectomy
  • Gastric Stump / pathology
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Prognosis
  • ROC Curve
  • Retrospective Studies
  • Risk Assessment / methods
  • Stomach Neoplasms* / mortality
  • Stomach Neoplasms* / pathology
  • Stomach Neoplasms* / surgery