Electronic Medical Record-Based Machine Learning Approach to Predict the Risk of 30-Day Adverse Cardiac Events After Invasive Coronary Treatment: Machine Learning Model Development and Validation

Osung Kwon; Wonjun Na; Heejun Kang; Tae Joon Jun; Jihoon Kweon; Gyung-Min Park; YongHyun Cho; Cinyoung Hur; Jungwoo Chae; Do-Yoon Kang; Pil Hyung Lee; Jung-Min Ahn; Duk-Woo Park; Soo-Jin Kang; Seung-Whan Lee; Cheol Whan Lee; Seong-Wook Park; Seung-Jung Park; Dong Hyun Yang; Young-Hak Kim

doi:10.2196/26801

JMIR Med Inform. 2022 May 11;10(5):e26801. doi: 10.2196/26801.

Authors

Osung Kwon^#¹, Wonjun Na^#², Heejun Kang³, Tae Joon Jun³, Jihoon Kweon³, Gyung-Min Park⁴, YongHyun Cho⁵, Cinyoung Hur⁵, Jungwoo Chae⁵, Do-Yoon Kang³, Pil Hyung Lee³, Jung-Min Ahn³, Duk-Woo Park³, Soo-Jin Kang³, Seung-Whan Lee³, Cheol Whan Lee³, Seong-Wook Park³, Seung-Jung Park³, Dong Hyun Yang^#⁶, Young-Hak Kim^#³

Affiliations

¹ Division of Cardiology, Department of Internal Medicine, Eunpyeong St Mary's Hospital, Catholic University of Korea, Seoul, Republic of Korea.
² Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.
³ Division of Cardiology, Department of Internal Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.
⁴ Division of Cardiology, Department of Internal Medicine, Ulsan University Hospital, University of Ulsan College of Medicine, Ulsan, Republic of Korea.
⁵ Artificial Intelligence Lab, Linewalks, Inc, Seoul, Republic of Korea.
⁶ Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.

^# Contributed equally.

PMID: 35544292
PMCID: PMC9133980
DOI: 10.2196/26801

Abstract

Background: Although there is a growing interest in prediction models based on electronic medical records (EMRs) to identify patients at risk of adverse cardiac events following invasive coronary treatment, robust models fully utilizing EMR data are limited.

Objective: We aimed to develop and validate machine learning (ML) models that use diverse fields of the EMR to predict the risk of 30-day adverse cardiac events after percutaneous intervention or bypass surgery.

Methods: EMR data comprising 5,184,565 records from 16,793 patients at a quaternary hospital between 2006 and 2016 were categorized into static basic (eg, demographics), dynamic time-series (eg, laboratory values), and cardiac-specific data (eg, coronary angiography). The data were randomly split into training, tuning, and testing sets in a ratio of 3:1:1. Each model was evaluated with 5-fold cross-validation and with an external EMR-based cohort at a tertiary hospital. Logistic regression (LR), random forest (RF), gradient boosting machine (GBM), and feedforward neural network (FNN) algorithms were applied. The primary outcome was 30-day mortality following invasive treatment.
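The 3:1:1 random split described in the Methods can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say whether the split was made per patient or per record, so the sketch assumes a patient-level split, and the function name `split_3_1_1`, the seed, and the use of integer IDs are hypothetical.

```python
import random


def split_3_1_1(patient_ids, seed=0):
    """Shuffle IDs and split them into training, tuning, and testing sets (3:1:1).

    Assumption: the split is done per patient, so all records of one patient
    land in the same set. The abstract does not confirm this detail.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle for reproducibility
    n = len(ids)
    n_train = (3 * n) // 5  # three fifths for training
    n_tune = n // 5         # one fifth for tuning
    train = ids[:n_train]
    tune = ids[n_train:n_train + n_tune]
    test = ids[n_train + n_tune:]  # the remainder goes to testing
    return train, tune, test


# Hypothetical integer IDs standing in for the 16,793 patients.
train, tune, test = split_3_1_1(range(16793))
print(len(train), len(tune), len(test))
```

Splitting by patient rather than by record is a common way to keep one patient's records from appearing in both the training and testing sets; whether the authors did so is not stated here.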

Results: GBM showed the best performance, with an area under the receiver operating characteristic curve (AUROC) of 0.99; RF had a similar AUROC of 0.98. The AUROCs of FNN and LR were 0.96 and 0.93, respectively. GBM had the highest area under the precision-recall curve (AUPRC) of 0.80, and the AUPRCs of RF, LR, and FNN were 0.73, 0.68, and 0.63, respectively. All models showed low Brier scores of <0.1 as well as well-fitted calibration plots, indicating a good fit of the ML-based models. On external validation, the GBM model performed best, with an AUROC of 0.90, while FNN had an AUROC of 0.85. The AUROCs of LR and RF were slightly lower at 0.80 and 0.79, respectively. The AUPRCs of GBM, LR, and FNN were similar at 0.47, 0.43, and 0.41, respectively, while that of RF was lower at 0.33. Among the data categories in the GBM model, dynamic time-series data demonstrated a high AUROC of >0.95, contributing substantially to the excellent results.
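For readers unfamiliar with the three metrics reported in the Results, the following minimal sketch computes AUROC, AUPRC (as average precision), and the Brier score on a small made-up example. The labels and predicted probabilities are invented for illustration and are not study data; the function names are hypothetical, and the authors' actual evaluation code is not described in this abstract.

```python
def auroc(y_true, y_score):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / hits


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up labels (1 = event) and predicted probabilities, for illustration only.
y = [0, 0, 1, 0, 1, 0, 0, 1]
p = [0.10, 0.20, 0.80, 0.30, 0.40, 0.05, 0.60, 0.90]

print(round(auroc(y, p), 4))              # 0.9333
print(round(average_precision(y, p), 4))  # 0.9167
print(round(brier_score(y, p), 4))        # 0.1141
```

AUROC measures ranking quality and has a chance level of 0.5, whereas the chance level of AUPRC equals the event prevalence, so the two are not directly comparable in magnitude. The Brier score reflects how close the predicted probabilities are to the observed outcomes, with lower values being better.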

Conclusions: Exploiting the diverse fields of the EMR data set, the ML-based 30-day adverse cardiac event prediction models demonstrated outstanding results, and the applied framework could be generalized for various health care prediction models.

Keywords: adverse cardiac event; big data; coronary artery disease; electronic medical record; machine learning; mortality; prediction.

©Osung Kwon, Wonjun Na, Heejun Kang, Tae Joon Jun, Jihoon Kweon, Gyung-Min Park, YongHyun Cho, Cinyoung Hur, Jungwoo Chae, Do-Yoon Kang, Pil Hyung Lee, Jung-Min Ahn, Duk-Woo Park, Soo-Jin Kang, Seung-Whan Lee, Cheol Whan Lee, Seong-Wook Park, Seung-Jung Park, Dong Hyun Yang, Young-Hak Kim. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 11.05.2022.