Prediction Model of Ocular Metastases in Gastric Adenocarcinoma: Machine Learning-Based Development and Interpretation Study

Technol Cancer Res Treat. 2024 Jan-Dec:23:15330338231219352. doi: 10.1177/15330338231219352.

Abstract

Background: Although gastric adenocarcinoma (GA) related ocular metastasis (OM) is rare, its occurrence indicates a more severe disease. We aimed to utilize machine learning (ML) to analyze the risk factors of GA-related OM and predict its risks. Methods: This is a retrospective cohort study. The clinical data of 3532 GA patients were collected and randomly classified into training and validation sets in a ratio of 7:3. Those with or without OM were classified into OM and non-OM (NOM) groups. Univariate and multivariate logistic regression analyses and least absolute shrinkage and selection operator were conducted. We integrated the variables identified through feature importance ranking and further refined the selection process using forward sequential feature selection based on random forest (RF) algorithm before incorporating them into the ML model. We applied six ML algorithms to construct the predictive GA model. The area under the receiver operating characteristic (ROC) curve indicated the model's predictive ability. Also, we established a network risk calculator based on the best performance model. We used Shapley additive interpretation (SHAP) to identify risk factors and to confirm the interpretability of the black box model. We have de-identified all patient details. Results: The ML model, consisting of 13 variables, achieved an optimal predictive performance using the gradient boosting machine (GBM) model, with an impressive area under the curve (AUC) of 0.997 in the test set. Utilizing the SHAP method, we identified crucial factors for OM in GA patients, including LDL, CA724, CEA, AFP, CA125, Hb, CA153, and Ca2+. Additionally, we validated the model's reliability through an analysis of two patient cases and developed a functional online web prediction calculator based on the GBM model. Conclusion: We used the ML method to establish a risk prediction model for GA-related OM and showed that GBM performed best among the six ML models. The model may identify patients with GA-related OM to provide early and timely treatment.

Keywords: Shapley additive explanations; gradient boosting machine; machine learning; ocular metastases in gastric adenocarcinoma; risk factor.

MeSH terms

  • Adenocarcinoma*
  • Algorithms
  • Eye Neoplasms*
  • Humans
  • Machine Learning
  • Reproducibility of Results
  • Retrospective Studies
  • Stomach Neoplasms*