CD-Surv: a contrastive-based model for dynamic survival analysis

Health Inf Sci Syst. 2022 Apr 12;10(1):5. doi: 10.1007/s13755-022-00173-z. eCollection 2022 Dec.

Abstract

Survival analysis, which investigates the relationships between covariates and event time, has had a profound effect on health service management. Longitudinal data with sequential patterns, such as electronic health records (EHRs), contain a large volume of patient treatment trajectories and therefore offer great potential for survival analysis. However, most existing studies treat survival analysis as a static problem: they use only a fraction of the longitudinal data, ignore the correlations between multiple visits, and often fail to capture the latent representations of patient treatment trajectories. This inevitably degrades the performance of survival analysis. To address this challenge, we propose CD-Surv, an end-to-end contrastive-based model that better captures patient treatment trajectories and dynamically predicts the survival probability of a target patient. Specifically, two data augmentation strategies, mask generation and shuffle generation, are adopted to augment the real treatment trajectories documented in the EHR. Contrastive learning between the augmented and real trajectories is then used to improve the hidden representations of the real trajectories. We evaluated CD-Surv on two real-world datasets, and the experimental results indicated that it outperformed state-of-the-art baselines on various evaluation metrics.

Keywords: Contrastive learning; Electronic health records; Longitudinal data; Survival analysis.
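
The two augmentation strategies named in the abstract, mask generation and shuffle generation, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`mask_trajectory`, `shuffle_trajectory`, `info_nce`), the visit-level granularity of masking and shuffling, the `mask_ratio` and `temperature` parameters, and the use of an InfoNCE-style loss as the contrastive objective. The abstract does not state whether augmented trajectories act as positives or negatives, so the loss leaves that choice to the caller.

```python
import math
import random


def mask_trajectory(visits, mask_ratio=0.2, mask_token=None, rng=None):
    """Hypothetical mask generation: replace a fraction of visits with mask_token."""
    rng = rng or random.Random()
    if not visits:
        return []
    # Mask at least one visit, and never more than the trajectory contains.
    n_mask = min(len(visits), max(1, int(len(visits) * mask_ratio)))
    masked_idx = set(rng.sample(range(len(visits)), n_mask))
    return [mask_token if i in masked_idx else v for i, v in enumerate(visits)]


def shuffle_trajectory(visits, rng=None):
    """Hypothetical shuffle generation: return a copy with the visit order permuted."""
    rng = rng or random.Random()
    shuffled = list(visits)  # copy, so the real trajectory is left untouched
    rng.shuffle(shuffled)
    return shuffled


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's objective).

    The loss is low when the anchor is closer to the positive than to the negatives.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy trajectory: each visit is a list of made-up medical codes.
trajectory = [["A10"], ["B20", "C30"], ["D40"], ["E50"]]
masked_view = mask_trajectory(trajectory, mask_ratio=0.5, rng=random.Random(0))
shuffled_view = shuffle_trajectory(trajectory, rng=random.Random(0))
```

In a full model, an encoder would map the real and augmented trajectories to hidden vectors before a contrastive loss is applied; the encoder is omitted here because the abstract gives no details about it.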