Use of electronic health record data mining for heart failure subtyping

BMC Res Notes. 2023 Sep 11;16(1):208. doi: 10.1186/s13104-023-06469-x.

Abstract

Objective: To assess whether text mining of electronic health record (EHR) data can be used to improve register-based heart failure (HF) subtyping. EHR data of 43,405 individuals from two Finnish hospital biobanks were mined for unstructured text mentions of ejection fraction (EF), and the extracted values were validated against clinical assessment in two sets of 100 randomly selected individuals. Structured laboratory data were then incorporated to categorize individuals by HF subtype (HF with mildly reduced EF, HFmrEF; HF with preserved EF, HFpEF; HF with reduced EF, HFrEF; and no HF).
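
As a rough illustration of the kind of extraction described above, the following sketch pulls an EF percentage from free text with a regular expression and maps it to an EF range. It is not the study's algorithm. The pattern, the function names, the English-language example phrases, the midpoint rule for ranges such as "45-50%", and the cutoffs (40% or below as reduced, above 40% and below 50% as mildly reduced, 50% or above as preserved, following commonly used guideline ranges) are all assumptions made for illustration; the abstract does not specify them, and the laboratory-data step is left out.

```python
import re
from typing import Optional

# Hypothetical pattern: "EF", "LVEF" or "ejection fraction", a short gap,
# then a percentage or a percentage range such as "45-50%".
EF_PATTERN = re.compile(
    r"\b(?:(?:LV)?EF|ejection fraction)\b"
    r"[^0-9%]{0,20}?"
    r"(\d{1,2})(?:\s*-\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)


def extract_ef(text: str) -> Optional[float]:
    """Return the first EF value found in the text, or None if there is none."""
    match = EF_PATTERN.search(text)
    if match is None:
        return None
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    # Assumption: a reported range is summarized by its midpoint.
    return (low + high) / 2


def ef_range(ef: float) -> str:
    """Map an EF percentage to an EF range using the assumed cutoffs."""
    if ef <= 40:
        return "reduced"
    if ef < 50:
        return "mildly reduced"
    return "preserved"


if __name__ == "__main__":
    for note in ("Echo: LVEF 35 %.", "ejection fraction was 45-50%", "No echo performed"):
        ef = extract_ef(note)
        print(note, "->", ef, ef_range(ef) if ef is not None else "no EF found")
```

An EF range alone does not establish an HF subtype: in the study, structured laboratory data were combined with the extracted EF before individuals were assigned to HFrEF, HFmrEF, HFpEF or no HF.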

Results: In 86% of the cases, the algorithm-identified EF fell within the correct HF subtype range. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the algorithm ranged from 94% to 100% for HFrEF and from 85% to 100% for HFmrEF, and were 96%, 67%, 53% and 98%, respectively, for HFpEF. Survival analyses using the traditional diagnosis of HF agreed with the algorithm-based ones. Compared to healthy individuals, mortality increased from the HFmrEF group (hazard ratio [HR], 1.91; 95% confidence interval [CI], 1.24-2.95) to the HFpEF group (HR, 2.28; 95% CI, 1.80-2.88) to the HFrEF group (HR, 2.63; 95% CI, 1.97-3.50) over a follow-up of 1.5 years. We conclude that quantitative EF data can be efficiently extracted from EHRs and used with laboratory data to subtype HF with reasonable accuracy, especially for HFrEF.
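
For readers less familiar with the four validation metrics reported above, the sketch below shows how they are computed from a two-by-two comparison of algorithm output against clinical assessment. The function name and the counts in the usage example are hypothetical and are not taken from the study.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard validation metrics from confusion-matrix counts.

    tp: algorithm positive, clinical assessment positive
    fp: algorithm positive, clinical assessment negative
    fn: algorithm negative, clinical assessment positive
    tn: algorithm negative, clinical assessment negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the algorithm finds
        "specificity": tn / (tn + fp),  # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # share of algorithm positives that are true cases
        "npv": tn / (tn + fn),          # share of algorithm negatives that are true non-cases
    }


if __name__ == "__main__":
    # Made-up counts for a validation set of 100 individuals (illustration only).
    for name, value in diagnostic_metrics(tp=8, fp=2, fn=2, tn=88).items():
        print(f"{name}: {value:.2f}")
```

Because PPV and NPV depend on how common a subtype is in the validation sample, a subtype can have a low PPV alongside a high NPV, which is the pattern reported for HFpEF.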

Keywords: Data mining; Ejection fraction; Electronic health records; HFmrEF; HFpEF; HFrEF; Heart failure; Text mining.

MeSH terms

  • Algorithms
  • Data Mining
  • Electronic Health Records
  • Heart Failure* / diagnosis
  • Humans
  • Stroke Volume