Predicting incident heart failure from population-based nationwide electronic health records: protocol for a model development and validation study

BMJ Open. 2024 Jan 22;14(1):e073455. doi: 10.1136/bmjopen-2023-073455.

Abstract

Introduction: Heart failure (HF) is increasingly common and associated with excess morbidity, mortality, and healthcare costs. Treatment can alter the disease trajectory and reduce clinical events. However, many cases of HF remain undetected until presentation with more advanced symptoms, often requiring hospitalisation. Predicting incident HF is challenging, and existing statistical models are limited by their performance and by their scalability in routine clinical practice. An HF prediction model implementable in nationwide electronic health records (EHRs) could support targeted diagnostics to enable earlier identification of HF.

Methods and analysis: We will investigate a range of development techniques (including logistic regression and supervised machine learning methods) on routinely collected primary care EHRs to predict the risk of new-onset HF over 1-year, 5-year and 10-year prediction horizons. The Clinical Practice Research Datalink (CPRD)-GOLD dataset will be used for derivation (training and testing) and the CPRD-AURUM dataset for external validation. Both comprise large patient cohorts that are representative of the population of England in terms of age, sex and ethnicity. Primary care records are linked at patient level to secondary care and mortality data. The performance of the prediction model will be assessed by discrimination, calibration and clinical utility. We will only use variables routinely accessible in primary care.
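
The three assessment criteria named above can be illustrated with a minimal sketch. The abstract does not say which specific metrics will be used, so the choices here are assumptions: the area under the ROC curve for discrimination, the observed-to-expected event ratio for calibration, and net benefit at a single risk threshold for clinical utility. The function names and the toy data are hypothetical and are not taken from the protocol.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random case outranks a random non-case."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5  # tied predictions count as half
    return concordant / (len(cases) * len(controls))


def observed_expected_ratio(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Calibration-in-the-large: observed events over predicted events (ideal is 1)."""
    return sum(y_true) / sum(y_prob)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Clinical utility: net benefit of acting on patients at or above the threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the chosen threshold.
    return true_pos / n - (false_pos / n) * (threshold / (1 - threshold))


if __name__ == "__main__":
    # Hypothetical outcomes (1 = incident HF) and predicted risks, for illustration only.
    outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
    risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.35, 0.9]
    print("AUROC:", auroc(outcomes, risks))
    print("O/E ratio:", observed_expected_ratio(outcomes, risks))
    print("Net benefit at 0.5:", net_benefit(outcomes, risks, 0.5))
```

The pairwise AUROC loop is quadratic in cohort size and is written for readability; on cohorts the size of CPRD a rank-based implementation from an established statistics library would be the practical choice.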

Ethics and dissemination: Permissions for the CPRD-GOLD and CPRD-AURUM datasets were obtained from CPRD (ref no: 21_000324). The CPRD ethical approval committee approved the study. The results will be submitted for publication as a research paper in a peer-reviewed journal and presented at peer-reviewed conferences.

Trial registration details: The study was registered on ClinicalTrials.gov (NCT05756127). A systematic review for the project was registered on PROSPERO (registration number: CRD42022380892).

Keywords: Electronic health records; Heart failure; Prediction; Prevention; Primary Care; Screening.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Calibration
  • Electronic Health Records*
  • England
  • Ethnicity
  • Heart Failure* / diagnosis
  • Heart Failure* / epidemiology
  • Humans
  • Systematic Reviews as Topic

Associated data

  • ClinicalTrials.gov/NCT05756127