Application of Machine Learning to the Prediction of Cancer-Associated Venous Thromboembolism

Simon Mantha; Subrata Chatterjee; Rohan Singh; John Cadley; Chester Poon; Avijit Chatterjee; Daniel Kelly; Michelle Sterpi; Gerald Soff; Jeffrey Zwicker; José Soria; Magdalena Ruiz; Andres Muñoz; Maria Arcila

doi:10.21203/rs.3.rs-2870367/v1

Res Sq [Preprint]. 2023 May 8:rs.3.rs-2870367. doi: 10.21203/rs.3.rs-2870367/v1.

Affiliations

¹ Memorial Sloan Kettering Cancer Center.
² MSK.
³ Mount Sinai Hospital.
⁴ University of Miami Health System/Sylvester Comprehensive Cancer Center.
⁵ Biomedical Research Institute Sant Pau (IIB-Sant Pau).
⁶ Universidad Complutense.
⁷ Hospital General Universitario Gregorio Marañón.

Abstract

Venous thromboembolism (VTE) is a common and impactful complication of cancer. Several clinical prediction rules have been devised to estimate the risk of a thrombotic event in this patient population; however, they have limitations. We aimed to develop a predictive model of cancer-associated VTE using machine learning to better integrate all available data, improve prediction accuracy, and allow applicability regardless of the timing of systemic therapy administration. A retrospective cohort of adult patients who had next-generation sequencing performed on their solid tumor from 2014 to 2019 was used to fit and validate the models. A deep learning survival model limited to demographic, cancer-specific, laboratory and pharmacological predictors was selected based on results from training data for 23,800 individuals and was evaluated on an internal validation set of 5,951 individuals, yielding a time-dependent concordance index of 0.72 (95% CI = 0.70-0.74) for the first 6 months of observation. Adapted models also performed well overall compared to the Khorana Score (KS) in two external cohorts of individuals starting systemic therapy. In an external validation set of 1,250 patients, the C-index was 0.71 (95% CI = 0.65-0.77) for the deep learning model vs 0.66 (95% CI = 0.59-0.72) for the KS; in a smaller external cohort of 358 patients, the C-index was 0.59 (95% CI = 0.50-0.69) for the deep learning model vs 0.56 (95% CI = 0.48-0.64) for the KS. The proportions of patients accurately reclassified by the deep learning model were 25% and 26%, respectively. In this large cohort of patients with a broad range of solid malignancies and at different phases of systemic therapy, the use of deep learning resulted in improved accuracy for VTE incidence predictions. Additional studies are needed to further assess the validity of this model.
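
For readers unfamiliar with the concordance index (C-index) reported above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is an illustration only: the abstract reports a time-dependent concordance index, which is a different estimator, and this is not the authors' code. The function name `harrell_c_index` and the toy data are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 if the event (e.g. VTE) was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is only comparable when the earlier time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, two observed events, two censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 0, 1, 0]
toy_scores = [0.9, 0.3, 0.6, 0.4]
print(harrell_c_index(toy_times, toy_events, toy_scores))  # prints 1.0
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 0.72, 0.71 and 0.59 figures above are read.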

Publication types

Preprint