Use of survival support vector machine combined with random survival forest to predict the survival of nasopharyngeal carcinoma patients

Transl Cancer Res. 2023 Dec 31;12(12):3581-3590. doi: 10.21037/tcr-23-316. Epub 2023 Dec 21.

Abstract

Background: The Cox regression model is not sufficiently accurate for predicting the survival prognosis of nasopharyngeal carcinoma (NPC) patients, and its low predictive accuracy makes it unsuitable for calculating and ranking the importance of influencing factors. Using data on NPC patients from the Surveillance, Epidemiology, and End Results (SEER) database, we proposed using two machine learning methods, random survival forest (RSF) and survival support vector machine (survival-SVM), to develop a survival prediction system specifically for NPC patients. This approach aimed to compensate for the shortcomings of the Cox regression model. We also developed and validated a nomogram based on the Cox regression model and compared it with the machine learning methods.

Methods: A total of 1,683 NPC patients from January 2010 to December 2015 were extracted from the SEER database. All modeling was done in R. We established a nomogram for the survival prognosis of NPC patients using the Cox regression model, ranked the influencing factors by importance using the variable importance (VIMP) method of the RSF model, and developed a survival prognosis system for NPC patients based on survival-SVM. The concordance index (C-index) was used for model evaluation and performance comparison.
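As a rough illustration of the C-index used for model evaluation, the following is a minimal Python sketch of Harrell's concordance index for right-censored data. It is not the authors' code (the study was carried out in R); the function name `concordance_index` and the toy follow-up times, event indicators, and risk scores are hypothetical values chosen only for demonstration.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter follow-up time than subject j. The pair is concordant when
    subject i also has the higher predicted risk; ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
times = [5, 10, 15, 20, 25]
events = [1, 0, 1, 1, 0]
risk_scores = [0.9, 0.4, 0.5, 0.7, 0.1]  # higher score = higher predicted risk

c = concordance_index(times, events, risk_scores)
print(f"C-index: {c:.3f}")
```

In this toy example, 6 of the 7 comparable pairs are ordered correctly. A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the Cox, RSF, and survival-SVM models are compared.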

Results: Although Cox regression models can be developed to predict the prognosis of NPC patients, their accuracy was lower than that of the machine learning methods. When the data were fitted with the Cox model, the C-index was only 0.740 for the training set and 0.721 for the test set. In contrast, the C-index of the survival-SVM model was 0.785, and the C-index of the RSF model was 0.729. The importance ranking of each variable could be obtained using the VIMP method.
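The idea behind a permutation-based importance ranking such as VIMP can be sketched as follows: shuffle one variable at a time and measure how much the C-index drops. This is a simplified Python illustration under stated assumptions, not the RSF implementation used in the study. The fixed linear `risk_score` function stands in for a fitted model, and its weights, the `noise` variable, and all data values are hypothetical.

```python
import random


def c_index(times, events, scores):
    """Harrell's C-index; ties in predicted risk count as 0.5."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored subjects cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


def risk_score(row):
    # Hypothetical stand-in for a fitted model; it deliberately ignores "noise".
    return 0.04 * row["age"] + 0.5 * row["stage"]


def permutation_importance(rows, times, events, feature, repeats=20, seed=0):
    """Mean drop in C-index after randomly permuting one feature column."""
    rng = random.Random(seed)
    baseline = c_index(times, events, [risk_score(r) for r in rows])
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        permuted_c = c_index(times, events, [risk_score(r) for r in permuted])
        drops.append(baseline - permuted_c)
    return sum(drops) / repeats


# Hypothetical toy cohort: follow-up in months, 1 = death observed, 0 = censored.
rows = [
    {"age": 72, "stage": 4, "noise": 0.3},
    {"age": 65, "stage": 3, "noise": 0.9},
    {"age": 58, "stage": 3, "noise": 0.1},
    {"age": 50, "stage": 2, "noise": 0.7},
    {"age": 44, "stage": 1, "noise": 0.5},
    {"age": 39, "stage": 1, "noise": 0.2},
]
times = [6, 14, 22, 35, 48, 60]
events = [1, 1, 0, 1, 1, 0]

vimp = {f: permutation_importance(rows, times, events, f) for f in ("age", "stage", "noise")}
for name, value in sorted(vimp.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

A variable the model does not use, like `noise` here, yields an importance of zero because permuting it leaves every prediction unchanged; variables that drive the predictions show a positive drop. Random survival forest implementations typically compute this on out-of-bag data rather than on the full sample as in this sketch.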

Conclusions: The Cox model predicts survival less accurately than the machine learning-based RSF and survival-SVM methods. Machine learning methods can be considered for clinical application in predicting the survival prognosis of NPC patients.

Keywords: Survival analysis; The Surveillance, Epidemiology, and End Results (SEER); nasopharyngeal carcinoma (NPC); random survival forest (RSF); survival-support vector machine (survival-SVM).