Predictive model for the 5-year survival status of osteosarcoma patients based on the SEER database and XGBoost algorithm

Sci Rep. 2021 Mar 10;11(1):5542. doi: 10.1038/s41598-021-85223-4.

Abstract

Osteosarcoma is the most common bone malignancy, with the highest incidence in children and adolescents. Survival prediction is important for improving prognosis and planning therapy, yet no highly accurate prediction model for osteosarcoma is currently available. We therefore aimed to construct an artificial intelligence (AI) model for predicting the 5-year survival of osteosarcoma patients using extreme gradient boosting (XGBoost), a large-scale machine-learning algorithm. We identified osteosarcoma cases in the Surveillance, Epidemiology, and End Results (SEER) Research Database and excluded substandard samples. The study population comprised 835 patients, divided into a training set (n = 668) and a validation set (n = 167). Characteristics selected via survival analyses were used to construct the model. Receiver operating characteristic (ROC) curve and decision curve analyses were performed to evaluate the predictions. Model performance was excellent in both the training set (area under the ROC curve [AUC] = 0.977) and the validation set (AUC = 0.911). Decision curve analyses indicated that the model could be used to support clinical decisions. XGBoost is an effective algorithm for predicting the 5-year survival of osteosarcoma patients. Our prediction model performed excellently and is therefore useful in clinical settings.
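
The two evaluation tools named in the abstract, the area under the ROC curve and decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code and uses no SEER data: the labels and scores below are invented toy values, the function names (`roc_auc`, `net_benefit`, `split_sizes`) are hypothetical, and the 0.8 training fraction is an assumption inferred from the reported 668/167 split of 835 patients. AUC is computed as the fraction of positive/negative pairs ranked correctly, and net benefit uses the standard decision-curve formula TP/n - (FP/n) * pt / (1 - pt) at a threshold probability pt.

```python
# Illustrative sketch only: toy values, not SEER data or the published model.

def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit at threshold probability pt: TP/n - (FP/n) * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def split_sizes(total, train_fraction):
    """Sizes of the training and validation sets for a given training fraction."""
    train = round(total * train_fraction)
    return train, total - train


# Toy example: 1 marks the outcome of interest, scores are predicted probabilities.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

print(roc_auc(toy_labels, toy_scores))           # 15 of 16 pairs ranked correctly
print(net_benefit(toy_labels, toy_scores, 0.5))  # 4 true positives, 1 false positive
print(split_sizes(835, 0.8))                     # assumed 80/20 split
```

A decision curve plots `net_benefit` across a range of thresholds and compares it with the "treat all" and "treat none" strategies; the sketch evaluates a single threshold only.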

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bone Neoplasms / mortality*
  • Databases, Factual*
  • Disease-Free Survival
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Models, Biological*
  • Osteosarcoma / mortality*
  • Predictive Value of Tests
  • SEER Program
  • Survival Rate