Risk factors affecting patients survival with colorectal cancer in Morocco : Survival Analysis using an Interpretable Machine Learning Approach

Res Sq [Preprint]. 2023 Jan 10:rs.3.rs-2435106. doi: 10.21203/rs.3.rs-2435106/v1.

Abstract

The aim of our study was to assess overall survival rates for colorectal cancer patients in Morocco and to identify strong prognostic factors using a novel approach combining the random survival forest (RSF) and the Cox proportional hazards (Cox PH) model. Covariates were selected using permutation-based variable importance, and partial dependence plots were used to explore in depth the relationship between the estimated partial effect of a given predictor and survival rates. Predictive performance was measured by two metrics, the Concordance Index (C-index) and the Brier Score (BS). Overall survival rates at 1, 2 and 3 years were, respectively, 87% (SE = 0.02; CI-95% = 0.84-0.91), 77% (SE = 0.02; CI-95% = 0.73-0.82) and 60% (SE = 0.03; CI-95% = 0.54-0.66). In the Cox model, after adjustment for all covariates, sex and tumor differentiation had no significant effect on prognosis, whereas tumor site did. The variable importance obtained from the RSF confirmed that surgery, stage, insurance, residency, and age were the most important prognostic factors. The C-index, which measures discriminative capacity, was 0.771 for the Cox PH model and 0.798 for the RSF, while the Brier Score, which measures predictive accuracy, was 0.257 and 0.207, respectively. The RSF therefore had both better discriminative capacity (higher C-index) and better predictive accuracy (lower Brier Score). Our results show that patients who are older than 70, live in rural areas, have no health insurance, are diagnosed at a distant stage and have not had surgery constitute a subgroup with poor prognosis.
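To make the C-index reported above concrete, the following is a minimal Python sketch of Harrell's concordance index on right-censored data. It is not the authors' code: the function name `concordance_index` and the toy follow-up times, event indicators and risk scores are hypothetical, and pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs in which the
    patient with the shorter observed survival has the higher risk score."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event;
        # tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index: {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.771 (Cox PH) and 0.798 (RSF) should be read.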

Keywords: colorectal cancer; Cox model; overall survival; partial dependence plots; prognostic factors; random survival forest; variable importance.

Publication types

  • Preprint