Prediction of 5-year overall survival of tongue cancer based on machine learning

BMC Oral Health. 2023 Aug 13;23(1):567. doi: 10.1186/s12903-023-03255-w.

Abstract

Objective: We aimed to develop a 5-year overall survival prediction model for patients with oral tongue squamous cell carcinoma based on machine learning methods.

Subjects and methods: Data were obtained from the electronic medical records of 224 patients with oral tongue squamous cell carcinoma (OTSCC) at the PLA General Hospital. Five-year overall survival prediction models were constructed using Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, Extreme Gradient Boosting, and Light Gradient Boosting Machine. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC). The output of the optimal model was explained using the Python package SHAP (SHapley Additive exPlanations).
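As an illustration of the evaluation metric named above, the following minimal sketch computes the AUC in its rank-based form: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc`, the labels, and the scores are hypothetical and are not taken from the study; the coding of 1 as "alive at 5 years" is likewise an assumption.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of scores.

    Equivalent to the Mann-Whitney U statistic divided by
    (number of positives * number of negatives); ties count as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = alive at 5 years (assumed coding), 0 = deceased.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2, 0.8, 0.1]

auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; library implementations typically use a sort-based method instead.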

Results: After grid search and secondary modeling, the Light Gradient Boosting Machine was the best prediction model (AUC = 0.860). According to the SHAP analysis, N-stage, age, systemic inflammation response index, positive lymph nodes, plasma fibrinogen, lymphocyte-to-monocyte ratio, neutrophil percentage, and T-stage contributed to the prediction of 5-year overall survival in OTSCC. The 5-year survival rate was 42%.
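To make the idea behind the feature attributions concrete, the sketch below computes exact Shapley values for a single sample by enumerating every feature coalition. It is not the SHAP package's algorithm (the package uses faster, model-specific methods), and the toy linear `risk_score` model, its weights, the three feature positions, and the baseline are all hypothetical, chosen only so the result can be checked by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):                       # coalition size without i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                # marginal contribution of feature i to coalition s
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy model over three features (illustrative weights only).
def risk_score(z):
    return 0.5 * z[0] + 0.25 * z[1] - 0.125 * z[2]


x = [2.0, 4.0, 8.0]          # one hypothetical patient
baseline = [0.0, 0.0, 0.0]   # hypothetical reference values

phi = shapley_values(risk_score, x, baseline)
print(phi)
```

For a linear model each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction for the sample and the prediction for the baseline.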

Conclusion: The Light Gradient Boosting Machine model predicted 5-year overall survival in OTSCC patients, and this predictive tool has potential prognostic value for these patients.

Keywords: Electronic medical records; Machine learning; Oral tongue squamous cell carcinoma; Overall survival; Prediction model.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Carcinoma, Squamous Cell*
  • Fibrinogen
  • Hemostatics*
  • Humans
  • Machine Learning
  • Tongue Neoplasms*

Substances

  • Fibrinogen
  • Hemostatics