Machine learning-based identification of risk-factor signatures for undiagnosed atrial fibrillation in primary prevention and post-stroke in clinical practice

Eur Heart J Qual Care Clin Outcomes. 2022 Dec 13;9(1):16-23. doi: 10.1093/ehjqcco/qcac013.

Abstract

Aims: Atrial fibrillation (AF) carries a substantial risk of ischemic stroke and other complications, and estimates suggest that over a third of cases remain undiagnosed. AF detection is particularly pressing in stroke survivors. To tailor AF screening efforts, we explored German health claims data for routinely available predictors of incident AF in primary care and post-stroke using machine learning methods.

Methods and results: Using machine learning techniques, we combined AF predictors in patients over 45 years of age from claims data in the InGef database (n = 1 476 391) for (i) incident AF and (ii) AF post-stroke. Between 2013 and 2016, new-onset AF was diagnosed in 98 958 patients (6.7%). Published risk factors for AF, including male sex, hypertension, heart failure, valvular heart disease, and chronic kidney disease, were confirmed. Component-wise gradient boosting identified additional predictors for AF from ICD codes available in ambulatory care. The area under the curve (AUC) of the final, condensed model consisting of 13 predictors was 0.829 (95% confidence interval (CI) 0.826-0.833) in the internal validation and 0.755 (95% CI 0.603-0.890) in a prospective validation cohort (n = 661). The AUC for post-stroke AF was 0.67 (95% CI 0.651-0.689) in the internal validation data set and 0.766 (95% CI 0.731-0.800) in the prospective clinical cohort.
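
Illustrative note: the abstract reports each AUC with a 95% CI but does not say how the intervals were computed. The sketch below shows one common approach, a percentile bootstrap around a rank-based AUC, and is an assumption for illustration only, not the authors' method. The function names (auc, bootstrap_auc_ci) and the toy labels and scores are hypothetical and are not study data.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random non-case.

    This is the Mann-Whitney formulation; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap.

    Assumption: patients are resampled with replacement; resamples that
    contain only one class are skipped because AUC is undefined for them.
    """
    point = auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


# Toy example (hypothetical data): 1 = AF diagnosed, 0 = no AF.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_point, toy_lower, toy_upper = bootstrap_auc_ci(toy_labels, toy_scores)
```

In the toy data, 15 of the 16 case/non-case pairs are ranked correctly, so the point estimate is 0.9375. With only eight patients the bootstrap interval is wide, which is consistent with the wider interval reported for the small prospective cohort (n = 661) compared with the internal validation.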

Conclusion: ICD-coded clinical variables selected by machine learning can improve the identification of patients at risk of newly diagnosed AF. This readily available, automatically coded information can be used to target AF screening efforts at high-risk populations in primary care and among stroke survivors.

Keywords: Atrial fibrillation; artificial intelligence; machine learning; risk prediction; stroke.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Atrial Fibrillation* / complications
  • Atrial Fibrillation* / diagnosis
  • Atrial Fibrillation* / epidemiology
  • Humans
  • Machine Learning
  • Male
  • Primary Prevention
  • Risk Assessment
  • Risk Factors
  • Stroke* / diagnosis
  • Stroke* / epidemiology
  • Stroke* / etiology