Machine learning insight into the role of imaging and clinical variables for the prediction of obstructive coronary artery disease and revascularization: An exploratory analysis of the CONSERVE study

PLoS One. 2020 Jun 25;15(6):e0233791. doi: 10.1371/journal.pone.0233791. eCollection 2020.

Abstract

Background: Machine learning (ML) can extract patterns from data and construct data-driven models. We use ML models to gain insight into the relative importance of variables for predicting obstructive coronary artery disease (CAD) in the Coronary Computed Tomographic Angiography for Selective Cardiac Catheterization (CONSERVE) study, and to compare their prediction of obstructive CAD with that of the CAD consortium clinical score (CAD2). We further perform ML analysis to gain insight into the role of imaging and clinical variables for revascularization.

Methods: For prediction of obstructive CAD, the entire invasive coronary angiography (ICA) arm of the study, comprising 719 patients, was used. For revascularization, 1,028 patients were randomized to ICA or coronary computed tomographic angiography (CCTA). Data were randomly split into 80% training and 20% test sets for model building and validation. Models used extreme gradient boosting (XGBoost).
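
As an illustration of the random 80%/20% split described in the Methods, the following is a minimal sketch using only the Python standard library. The function name `split_80_20`, the fixed seed, and the placeholder patient IDs are assumptions made for this example; the abstract does not state how the study performed the split or whether it was stratified, and the model-fitting step is not shown.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of `records` and return (train, test) in an 80/20 ratio.

    Hypothetical helper for illustration; not the study's own code.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# 719 is the size of the ICA arm reported above; the IDs are placeholders.
patient_ids = list(range(719))
train_ids, test_ids = split_80_20(patient_ids)
print(len(train_ids), len(test_ids))  # 575 144
```

With 719 records this yields 575 training and 144 test records. A model would be fit on the first group only and evaluated on the second.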

Results: Mean age was 60.6 ± 11.5 years and 64.3% were female. For the prediction of obstructive CAD, the area under the receiver operating characteristic curve (AUC) was significantly higher for ML at 0.779 (95% CI: 0.672-0.886) than for CAD2 at 0.696 (95% CI: 0.594-0.798) (P = 0.01). BMI, age, and angina severity were the most important variables. For revascularization, the model obtained an overall AUC of 0.958 (95% CI: 0.933-0.983). Performance did not differ whether the imaging parameters used were from ICA (AUC 0.947, 95% CI: 0.903-0.990) or CCTA (AUC 0.941, 95% CI: 0.895-0.988) (P = 0.90). The ML model obtained sensitivity and specificity of 89.2% and 92.9%, respectively. Number of vessels with ≥70% stenosis, maximum segment stenosis severity (SSS), and body mass index (BMI) were the most important variables. Exclusion of imaging variables resulted in performance deterioration, with an AUC of 0.705 (95% CI: 0.614-0.795) (P < 0.0001).
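
To make the reported metrics concrete, the sketch below computes AUC, sensitivity, and specificity in plain Python. The labels and scores are made-up toy values, not study data, and the 0.5 decision threshold and function names are assumptions for the example; the abstract does not state the threshold or the software used for these calculations.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example only: 1 = event (e.g. revascularization), 0 = no event.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(auc(labels, scores))                      # 0.9375
print(sensitivity_specificity(labels, scores))  # (0.75, 0.75)
```

AUC does not depend on a threshold, whereas sensitivity and specificity do, which is why a single model can be summarized by one AUC but many sensitivity/specificity pairs.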

Conclusions: For obstructive CAD, the ML model outperformed CAD2. BMI was an important variable, although it is currently not included in most scores. In this ML model, imaging variables were most associated with revascularization. Imaging modality did not influence model performance. Removal of imaging variables reduced model performance.

Publication types

  • Multicenter Study
  • Randomized Controlled Trial
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Coronary Angiography*
  • Coronary Artery Disease / diagnostic imaging*
  • Coronary Artery Disease / epidemiology
  • Coronary Artery Disease / pathology
  • Coronary Artery Disease / surgery
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Models, Statistical
  • Myocardial Revascularization / statistics & numerical data*

Grants and funding

This study was funded by the Dalio Institute of Cardiovascular Imaging (New York, NY, USA). James K. Min received funding from the Dalio Foundation, National Institutes of Health, and GE Healthcare. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.