Analysis and prediction of long-term survival using a clinically applicable risk score based on the Electronic Health Record

Int J Med Inform. 2024 Jul:187:105470. doi: 10.1016/j.ijmedinf.2024.105470. Epub 2024 Apr 30.

Abstract

Background: Estimating the long-term survival of the population assigned to a hospital can be essential to anticipate, manage, and provide appropriate healthcare resources, and to guide preventive actions for individuals at high risk of mortality. In this study, we identify which electronic health record variables are most relevant for predicting the long-term survival of a population, and apply the results to identify groups at high risk of mortality.

Materials and methods: A prospective cohort study was conducted on a population of 113,403 individuals from the General Hospital of Castellón (Spain) who were alive on July 1st, 2018. Using patients' electronic health record variables and the number of days survived from the study start date, a Kaplan-Meier analysis was performed and a multivariate Cox regression model was fitted, and a risk score based on the Cox coefficients was applied to predict survival over 3 years.
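To illustrate how a risk score can be built from Cox coefficients, the following minimal sketch computes the linear predictor (the sum of each coefficient times its covariate value) and maps it to one of four risk groups. The covariate names, coefficient values, and cut-offs are hypothetical placeholders chosen for illustration; they are not the variables or values used in the study.

```python
import math

# Hypothetical Cox coefficients (log hazard ratios); NOT the study's values.
COEFFICIENTS = {
    "age_decades": 0.60,
    "male": 0.30,
    "prior_admissions": 0.25,
}

# Hypothetical upper bounds on the score for the first three risk groups.
CUTOFFS = [(3.0, "low"), (4.5, "medium"), (6.0, "high")]


def risk_score(patient):
    """Linear predictor: sum of coefficient * covariate value."""
    return sum(beta * patient.get(name, 0.0) for name, beta in COEFFICIENTS.items())


def risk_group(score):
    """Return the first group whose upper bound exceeds the score."""
    for upper, label in CUTOFFS:
        if score < upper:
            return label
    return "very high"


def relative_hazard(score_a, score_b):
    """Hazard ratio between two patients under proportional hazards."""
    return math.exp(score_a - score_b)


# Illustrative patient record (made-up values).
patient = {"age_decades": 7.0, "male": 1, "prior_admissions": 2}
score = risk_score(patient)
print(round(score, 2), risk_group(score))
```

Because the Cox model assumes proportional hazards, the difference between two patients' scores translates directly into a hazard ratio via `relative_hazard`, which is what makes a coefficient-based score convenient for ranking individuals.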

Results: All significant covariates in the Cox model (c-index 91.5%) were associated with increased mortality risk. Using the proposed risk score, Kaplan-Meier curves show that the survival probability at 3 years is 99.23% (95% confidence interval (CI) 99.18-99.29) for the low-risk group, 91.21% (95% CI 90.67-91.76) for the medium-risk group, 76.52% (95% CI 75.59-77.46) for the high-risk group, and 48.61% (95% CI 46.85-50.36) for the very high-risk group.
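As a rough sketch of how a per-group survival probability at a fixed horizon can be read off a Kaplan-Meier curve, the snippet below implements the product-limit estimator in plain Python and evaluates it at 1,095 days (about 3 years). The function names and the toy data are assumptions made for illustration; confidence intervals are not computed here, and in practice a dedicated survival analysis library would normally be used.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time.

    durations: follow-up time per subject (e.g. days).
    events: 1 if death was observed, 0 if censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every subject leaves the risk set at their time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def survival_at(curve, horizon):
    """Step-function lookup: survival after the last event time <= horizon."""
    s = 1.0
    for t, value in curve:
        if t > horizon:
            break
        s = value
    return s


# Toy data for one risk group (days of follow-up, death indicator).
days = [200, 400, 400, 700, 900, 1095, 1095, 1095]
died = [1, 0, 1, 1, 0, 0, 0, 0]
curve = kaplan_meier(days, died)
print(survival_at(curve, 1095))
```

Running the same two functions separately on the subjects of each risk group would give one curve per group, which is the kind of comparison summarized in the Results.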

Discussion: The resulting Cox model is highly predictive, and some electronic health record variables that have been little studied to date, such as Clinical Risk Groups, were found to have a strong impact on survival. In terms of clinical application, the proposed risk score is particularly useful for identifying high-risk subpopulations within a large population.
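The predictive ability of a Cox model is commonly summarized by the concordance index (c-index), the metric reported in the Results. The sketch below shows one common formulation, Harrell's C, for right-censored data: among all comparable pairs, it counts how often the subject who died earlier had the higher risk score. The function name and toy data are illustrative assumptions, and this quadratic-time version is meant for small examples rather than a cohort of this size.

```python
def concordance_index(durations, events, scores):
    """Harrell's C for right-censored data; a higher score means higher risk."""
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed death
            # strictly before subject j's follow-up time.
            if events[i] and durations[i] < durations[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up days, death indicator, and made-up risk scores.
days = [200, 400, 700, 900, 1095]
died = [1, 1, 0, 1, 0]
scores = [6.1, 4.0, 4.8, 3.2, 2.5]
print(concordance_index(days, died, scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time.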

Keywords: Cox model; Electronic health record; Mortality risk; Survival analysis.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Electronic Health Records* / statistics & numerical data
  • Female
  • Humans
  • Kaplan-Meier Estimate*
  • Male
  • Middle Aged
  • Proportional Hazards Models*
  • Prospective Studies
  • Risk Assessment / methods
  • Risk Factors
  • Spain / epidemiology