Deep learning model for predicting the survival of patients with primary gastrointestinal lymphoma based on the SEER database and a multicentre external validation cohort

J Cancer Res Clin Oncol. 2023 Oct;149(13):12177-12189. doi: 10.1007/s00432-023-05123-0. Epub 2023 Jul 10.

Abstract

Purpose: Due to the rarity of primary gastrointestinal lymphoma (PGIL), the prognostic factors and optimal management of PGIL have not been clearly defined. We aimed to establish prognostic models using a deep learning algorithm for survival prediction.

Methods: We included 11,168 PGIL patients from the Surveillance, Epidemiology, and End Results (SEER) database to form the training and test cohorts, and 82 PGIL patients from three medical centres to form the external validation cohort. We constructed a Cox proportional hazards (CoxPH) model, a random survival forest (RSF) model, and a neural multitask logistic regression (DeepSurv) model to predict the overall survival (OS) of PGIL patients.
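As an illustration of the objective behind a CoxPH model, the following is a minimal sketch of the negative log partial likelihood with Breslow-style handling of tied event times. The function name, argument names, and toy data are hypothetical and are not taken from the study; the abstract does not describe the authors' implementation.

```python
import math


def cox_neg_log_partial_likelihood(times, events, risk_scores):
    """Negative Cox log partial likelihood (Breslow-style ties).

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- linear predictor (log relative hazard) per patient
    """
    loss = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        denom = sum(
            math.exp(risk_scores[j])
            for j, t_j in enumerate(times)
            if t_j >= t_i
        )
        loss -= risk_scores[i] - math.log(denom)
    return loss


# Toy data (hypothetical): three patients, the last one censored.
toy_times = [1.0, 2.0, 3.0]
toy_events = [1, 1, 0]
well_ordered = cox_neg_log_partial_likelihood(toy_times, toy_events, [2.0, 1.0, 0.0])
badly_ordered = cox_neg_log_partial_likelihood(toy_times, toy_events, [0.0, 1.0, 2.0])
```

Scores that assign higher risk to patients who die earlier give a lower loss, which is the quantity a Cox-type model minimises during fitting.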

Results: The 1-, 3-, 5-, and 10-year OS rates of PGIL patients in the SEER database were 77.1%, 69.4%, 63.7%, and 50.3%, respectively. The RSF model based on all variables showed that the three most important variables for predicting OS were age, histological type, and chemotherapy. According to the Lasso regression analysis, the independent risk factors for PGIL patient prognosis included sex, age, race, primary site, Ann Arbor stage, histological type, symptom, radiotherapy, and chemotherapy. Using these factors, we built the CoxPH and DeepSurv models. The DeepSurv model's C-index values were 0.760 in the training cohort, 0.742 in the test cohort, and 0.707 in the external validation cohort, indicating that the DeepSurv model outperformed the RSF model (0.728) and the CoxPH model (0.724). The DeepSurv model accurately predicted 1-, 3-, 5-, and 10-year OS. Both calibration curves and decision curve analysis demonstrated the superior performance of the DeepSurv model. We developed the DeepSurv model into an online web calculator for survival prediction, which can be accessed at http://124.222.228.112:8501/.
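The C-index values reported above measure how well a model ranks patients by risk. The sketch below computes Harrell's concordance index for right-censored data. It assumes Harrell's definition, since the abstract does not state which variant was used; the function name and toy data are hypothetical and unrelated to the study cohorts.

```python
def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times  -- follow-up time for each patient
    events -- 1 if death was observed, 0 if censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier death correctly ranked as riskier
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; the C-index is undefined")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1])
reversed_ranking = harrell_c_index(toy_times, toy_events, [0.1, 0.4, 0.7, 0.9])
uninformative = harrell_c_index(toy_times, toy_events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1.0 corresponds to perfect ranking and 0.5 to chance-level ranking, so the reported values of roughly 0.71 to 0.76 sit between these two reference points.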

Conclusions: This externally validated DeepSurv model outperforms models from previous studies in predicting short-term and long-term survival and can support better individualized decisions for PGIL patients.

Keywords: Deep learning; DeepSurv; Machine learning; Primary gastrointestinal lymphoma; Survival analysis.

Publication types

  • Multicenter Study
  • Validation Study

MeSH terms

  • Aged
  • Deep Learning*
  • Female
  • Gastrointestinal Neoplasms* / mortality
  • Humans
  • Logistic Models
  • Lymphoma* / mortality
  • Male
  • Middle Aged
  • Prognosis
  • Proportional Hazards Models
  • Random Forest
  • SEER Program
  • Survival Analysis*