[Comparison of prediction ability of two extended Cox models in nonlinear survival data analysis]

Nan Fang Yi Ke Da Xue Xue Bao. 2023 Jan 20;43(1):76-84. doi: 10.12122/j.issn.1673-4254.2023.01.10.
[Article in Chinese]

Abstract

Objective: To compare the predictive ability of two extended Cox models in nonlinear survival data analysis.

Methods: Using Monte Carlo simulation and an empirical study, with the conventional Cox proportional hazards model (Cox) and random survival forests (RSF) as reference models, we compared the restricted cubic spline Cox model (Cox_RCS) and the DeepSurv neural network Cox model (Cox_DNN) for their predictive ability in nonlinear survival data analysis. The concordance index (C-index) was used to evaluate the discrimination of the predictions (a larger C-index indicates better predictive ability). The integrated Brier score (IBS) was used to evaluate the calibration of the predictions (a smaller IBS indicates better predictive ability).
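As an illustration of the discrimination measure named above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' implementation: the function name `concordance_index`, its argument names, and the toy data are assumptions made for this example, and pairs with tied follow-up times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is only comparable if the earlier time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event has the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored at time 3.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [4, 3, 2, 1])
reversed_order = concordance_index(times, events, [1, 2, 3, 4])
uninformative = concordance_index(times, events, [1, 1, 1, 1])
```

A value of 1 means that every comparable pair is ranked correctly, 0.5 corresponds to an uninformative ranking, and values below 0.5 mean the ranking is systematically inverted.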

Results: For data that satisfied the proportional hazards assumption, Cox_RCS had the best predictive ability regardless of sample size or censoring rate. For data that violated the proportional hazards assumption, Cox_DNN performed best with a large sample size (≥500) and a low censoring rate (<40%); in all other scenarios, Cox_RCS outperformed the other models. On the real-world example data, Cox_RCS showed the best performance.

Conclusion: In the analysis of low-dimensional survival data with nonlinear relationships, Cox_RCS and Cox_DNN each have advantages and disadvantages in predictive ability. The analysis method can therefore be chosen according to the characteristics of the data at hand; under certain conditions, conventional survival analysis methods are not inferior to machine learning or deep learning methods.

目的: 系统性地比较两类扩展Cox模型的预测能力,观察它们应用于非线性生存数据中的预测能力优劣。

方法: 通过蒙特卡罗模拟和实证研究从预测能力方面研究比较限制性立方样条Cox模型(Cox_RCS),深度生存神经网络Cox模型(Cox_DNN)这两种方法的优劣;并以传统Cox模型(Cox)和随机生存森林(RSF)作为参照。其中预测的区分度评价指标采用一致性指数(C-index),该指标越大,模型预测能力越好;预测的校准度评价指标采用积分布莱尔评分(IBS),该指标越小,模型预测能力越好。

结果: 在数据满足比例风险的情况下,无论样本量和删失率大小,Cox_RCS的预测能力都是最好的。在数据不满足比例风险的情况下,Cox_DNN的预测能力在大样本(本文中≥500)、低删失(本文中 < 40%)时是最优的,其余情况Cox_RCS的预测能力优于其他模型。在实例数据中,Cox_RCS的表现是最优。

结论: 在含有非线性关系的低维生存数据中,Cox_RCS和Cox_DNN在预测能力上各有优劣。因此可根据实际数据条件选择合适的分析方法,传统的生存分析方法在特定条件下并不差于机器学习以及深度学习方法。

Keywords: Cox model; deep neural network; nonlinear correlation; restricted cubic spline; survival analysis.

Publication types

  • English Abstract

MeSH terms

  • Calibration
  • Computer Simulation
  • Data Analysis*
  • Proportional Hazards Models
  • Survival Analysis

Grants and funding

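Natural Science Foundation of Guangdong Province (2022A1515012152); Open Fund of the Guangdong Provincial Key Laboratory of Construction and Detection in Tissue Engineering (zzgjzd2021003)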