A Predictive Model for the 10-year Overall Survival Status of Patients With Distant Metastases From Differentiated Thyroid Cancer Using XGBoost Algorithm-A Population-Based Analysis

Front Genet. 2022 Jul 8:13:896805. doi: 10.3389/fgene.2022.896805. eCollection 2022.

Abstract

Purpose: To identify clinical and non-clinical characteristics that affect the prognosis of patients with differentiated thyroid cancer with distant metastasis (DTCDM) and to establish an accurate prognostic model for overall survival (OS).

Patients and methods: Study subjects and related information were obtained from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database. Kaplan-Meier analysis, the log-rank test, and univariate and multivariate Cox analyses were used to screen for factors influencing OS in patients with DTCDM. Nine variables were used to build a machine learning (ML) model. Receiver operating characteristic (ROC) curves were used to evaluate the model's discriminative ability, calibration plots were used to assess prediction accuracy, and decision curve analysis (DCA) was used to estimate clinical benefit.

Results: After the inclusion and exclusion criteria were applied, a total of 3,060 patients with DTCDM from 2004 to 2017 were included in the survival analysis. A machine learning prediction model was developed with nine variables: age at diagnosis, gender, race, tumor size, histology, regional lymph node metastasis, primary site surgery, radiotherapy, and chemotherapy. After excluding patients who survived <120 months, variables were sub-coded and machine learning was used to model OS prognosis in patients with DTCDM. Age 6-50 years had the highest score in the model; other high-scoring variables included small tumor size, male sex, and age 51-76 years. The area under the ROC curve (AUC) and the calibration curves indicated that the XGBoost model performed well. DCA showed that the model can support clinical decision-making regarding 10-year overall survival.

Conclusion: An artificial intelligence model was constructed using the XGBoost algorithm to predict 10-year overall survival in patients with DTCDM. After validation and evaluation, the model showed good discriminative ability and high clinical value. This model could serve as a clinical tool to help inform treatment decisions for patients with DTCDM.
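The two evaluation quantities named in the abstract, the AUC and the net benefit underlying decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and scores, and the choice that label 1 marks the outcome of interest are all assumptions made for illustration. The net benefit formula used is the standard one, NB = TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case is scored
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit of acting on every case whose predicted probability
    is at or above the threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy for a decision curve: act on every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from SEER or from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
    print("AUC:", roc_auc(y_true, y_score))
    print("Net benefit (model, pt=0.5):", net_benefit(y_true, y_score, 0.5))
    print("Net benefit (treat all, pt=0.5):", net_benefit_treat_all(y_true, 0.5))
```

A decision curve plots the model's net benefit against the treat-all and treat-none (net benefit 0) strategies over a range of threshold probabilities; the model is considered useful where its curve lies above both.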

Keywords: SEER database; XGBoost algorithm; differentiated thyroid cancer; distant metastases; machine learning; predictive model.