Many Models, Little Adoption – What Accounts for Low Uptake of Machine Learning Models for Atrial Fibrillation Prediction and Detection?

Yuki Kawamura; Alireza Vafaei Sadr; Vida Abedi; Ramin Zand

doi:10.3390/jcm13051313

J Clin Med. 2024 Feb 26;13(5):1313. doi: 10.3390/jcm13051313.

Authors

Yuki Kawamura¹, Alireza Vafaei Sadr², Vida Abedi², Ramin Zand³

Affiliations

¹ School of Clinical Medicine, University of Cambridge, Cambridge CB3 0SP, UK.
² Department of Public Health Sciences, College of Medicine, The Pennsylvania State University, Hershey, PA 17033, USA.
³ Department of Neurology, College of Medicine, The Pennsylvania State University, Hershey, PA 17033, USA.

Abstract

(1) Background: Atrial fibrillation (AF) is a major risk factor for stroke and is often underdiagnosed, despite being present in 13–26% of ischemic stroke patients. Recently, many machine learning (ML)-based models have been proposed for AF prediction and detection for primary and secondary stroke prevention. However, clinical translation of these innovations to close the AF care gap has been scant. Herein, we systematically examined studies that employed ML models to predict incident AF in populations without prior AF, or to detect paroxysmal AF in stroke cohorts, in order to identify key reasons for the lack of translation into the clinical workflow. We conclude with a set of recommendations to improve the clinical translatability of ML-based models for AF.

(2) Methods: MEDLINE, Embase, Web of Science, ClinicalTrials.gov, and ICTRP were searched from database inception to September 2022 for peer-reviewed articles in English that used ML methods to predict incident AF or detect AF after stroke and reported adequate performance metrics. The search yielded 2815 articles, of which 16 studies using ML models to predict incident AF and three studies using ML models to detect AF post-stroke were included.

(3) Conclusions: This study highlights that (i) many models used only a limited subset of the variables available from patients' health records; (ii) only 37% of models were externally validated, and stratified analysis was often lacking; (iii) none of the models and 53% of the datasets were explicitly made available, limiting reproducibility and transparency; and (iv) data pre-processing did not include bias mitigation and was not described in sufficient detail, leading to potential selection bias. Low generalizability, a high false alarm rate, and a lack of interpretability were identified as additional factors to be addressed before ML models can be widely deployed in the clinical care setting. Given these limitations, our recommendations to improve the uptake of ML models for better AF outcomes include improving generalizability, reducing potential systemic biases, and investing in external validation studies, whilst developing a transparent modeling pipeline to ensure reproducibility.

Keywords: artificial intelligence; atrial fibrillation; decision trees; detection; machine learning; neural networks; prevention; stroke.

Publication types

Review

Grants and funding

This research received no external funding.