Machine learning in a real-world PFO study: analysis of data from multi-centers in China

BMC Med Inform Decis Mak. 2022 Nov 24;22(1):305. doi: 10.1186/s12911-022-02048-5.

Abstract

Purpose: The association between patent foramen ovale (PFO) and cryptogenic stroke has been studied for years. Although device closure decreases the overall risk of recurrent stroke, treatment effects have varied across studies. In this study, we aimed to detect sub-clusters among post-closure PFO patients and to identify potential predictors of adverse outcomes.

Methods: We analyzed patients with embolic stroke of undetermined source and PFO from 7 centers in China. Machine learning and Cox regression analysis were used.

Results: A total of 196 patients were included, and unsupervised hierarchical clustering on principal components identified two main clusters. The average age was 42.7 (12.37) years and 64.80% (127/196) were female. During a median follow-up of 739 days, 12 (6.9%) adverse events occurred, including 6 (3.45%) recurrent strokes, 5 (2.87%) transient ischemic attacks (TIAs) and one death (0.6%). Compared with cluster 1 (n = 77, 39.29%), patients in cluster 2 (n = 119, 60.71%) were more likely to be male and had higher systolic and diastolic blood pressure, higher body mass index, lower high-density lipoprotein cholesterol and a higher prevalence of atrial septal aneurysm. Using random forest survival (RFS) analysis, eight top-ranking features were selected and used to construct a prediction model. The RFS model outperformed the traditional Cox regression model (C-index: 0.87 vs. 0.54).
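The C-index used above to compare the two models measures how often a model ranks patients correctly by risk: among pairs of patients whose outcomes can be ordered, it is the fraction in which the patient with the earlier event received the higher predicted risk. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The following is a minimal sketch of Harrell's C-index in plain Python, not the authors' implementation. The function name, the argument names, and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if the adverse event was observed, 0 if censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Reorder so that patient i is the one with the shorter follow-up.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter follow-up ended in an
        # observed event. Tied times are skipped here (a simplification).
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0   # earlier event got the higher risk: correct
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not taken from the study).
toy_times = [100, 200, 300, 400]      # days of follow-up
toy_events = [1, 0, 1, 0]             # 1 = event observed, 0 = censored
toy_risk = [0.9, 0.4, 0.6, 0.1]       # hypothetical model risk scores
print(concordance_index(toy_times, toy_events, toy_risk))  # prints 1.0
```

In this toy example every comparable pair is ranked correctly, so the index is 1.0; reversing the risk scores would give 0.0, and identical scores for all patients would give 0.5.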

Conclusions: Two main clusters were identified among post-closure PFO patients. Traditional cardiovascular profiles remain the top-ranking predictors of future recurrence of stroke or TIA. However, whether maximizing the management of these factors would provide extra benefit warrants further investigation.

Keywords: Device closure; Machine learning; Patent foramen ovale; Recurrent stroke; Transient ischemic attack.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • China / epidemiology
  • Cluster Analysis
  • Female
  • Humans
  • Ischemic Attack, Transient*
  • Machine Learning
  • Male
  • Stroke* / epidemiology
  • Stroke* / therapy