The Development of a Prediction Model Based on Random Survival Forest for the Postoperative Prognosis of Pancreatic Cancer: A SEER-Based Study

Cancers (Basel). 2022 Sep 25;14(19):4667. doi: 10.3390/cancers14194667.

Abstract

Accurate prediction of the prognosis of patients with pancreatic cancer (PC) is an urgent task. We aimed to develop survival models for postoperative PC patients based on a novel algorithm, random survival forest (RSF), as well as traditional Cox regression and a neural network (DeepSurv), using the Surveillance, Epidemiology, and End Results Program (SEER) database. A total of 3988 patients were included in this study. Eight clinicopathological features were selected using least absolute shrinkage and selection operator (LASSO) regression analysis and were used to develop the RSF model. The model was evaluated along three dimensions: discrimination, calibration, and clinical benefit. The RSF model predicted the cancer-specific survival (CSS) of postoperative PC patients with a c-index of 0.723, which was higher than that of the models built with Cox regression (0.670) and DeepSurv (0.700). The Brier scores of the RSF model at 1, 3, and 5 years (0.188, 0.177, and 0.131) demonstrated favorable calibration, and decision curve analysis illustrated the model's clinical utility. Moreover, the roles of the key variables were visualized using Shapley Additive Explanations (SHAP) plots. In conclusion, a high-performance prediction model for the postoperative prognosis of PC was developed based on RSF, and it showed clear strengths in risk stratification and individual prognosis prediction.

Keywords: machine learning; pancreatic cancer; random survival forest; surgery; the Surveillance, Epidemiology, and End Results Program (SEER); visualization.
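
For readers unfamiliar with the discrimination metric quoted in the abstract, the following is a minimal sketch of Harrell's concordance index (c-index) in plain Python. It is not the authors' implementation, which the abstract does not describe. The function name `harrell_c_index` and the toy inputs are hypothetical and chosen only for illustration; no SEER data are used. The sketch assumes that a higher risk score means a worse predicted prognosis and that pairs with tied survival times are skipped.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Illustrative Harrell's c-index (hypothetical helper, not from the paper).

    times       : observed follow-up time for each patient
    events      : 1 if the event (e.g. cancer-specific death) was observed, 0 if censored
    risk_scores : model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # follow-up time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; c-index is undefined")
    return concordant / comparable


# Toy example with made-up values (assumption: not real patient data).
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.3, 0.2, 0.7]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"c-index on toy data: {toy_c:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.723 (RSF), 0.700 (DeepSurv), and 0.670 (Cox regression) should be read.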