Development and Validation of a Machine Learning Predictive Model for Cardiac Surgery-Associated Acute Kidney Injury

J Clin Med. 2023 Feb 1;12(3):1166. doi: 10.3390/jcm12031166.

Abstract

Objective: We aimed to develop and validate a machine learning (ML) model for predicting cardiac surgery-associated acute kidney injury (CSA-AKI), based on a multicenter randomized controlled trial (RCT) and the Medical Information Mart for Intensive Care-IV (MIMIC-IV) dataset.

Methods: This was a subanalysis of a completed RCT approved by the Ethics Committee of Fuwai Hospital in Beijing, China (NCT03782350). Data from Fuwai Hospital were randomly split, with 80% assigned to the training dataset and 20% to the testing dataset. Data from three other centers served as the external validation dataset. The MIMIC-IV dataset was also used to validate the performance of the predictive model. The area under the receiver operating characteristic curve (ROC-AUC), the area under the precision-recall curve (PR-AUC), and the Brier score (for calibration) were used to evaluate traditional logistic regression (LR) and eleven ML algorithms. Additionally, the Shapley Additive Explanations (SHAP) interpreter was used to explain the potential risk factors for CSA-AKI.
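As an illustration of the evaluation setup described in the Methods, the following is a minimal sketch in plain Python of an 80/20 random split and of the three reported metrics. It is not the study's code: the function names, the random seed, and the six-patient toy labels and probabilities are all assumptions made for illustration. In practice, library routines such as scikit-learn's `roc_auc_score`, `average_precision_score`, and `brier_score_loss` would normally be used instead.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def roc_auc(y_true, y_prob):
    """ROC-AUC as the chance that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """PR-AUC summarized as average precision: the mean of the precision
    values at each positive case, ranked by predicted probability.
    This simple version assumes there are no tied scores."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome
    (0 is perfect; lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# 80/20 split over a development cohort of 2416 patients (size from the abstract).
train_idx, test_idx = train_test_split_indices(2416, test_fraction=0.2)

# Toy outcomes (1 = AKI) and predicted probabilities; invented for illustration.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

auc = roc_auc(y_true, y_prob)
ap = average_precision(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"ROC-AUC={auc:.3f}  PR-AUC={ap:.3f}  Brier={brier:.3f}")
```

ROC-AUC and PR-AUC measure discrimination (how well cases are ranked), whereas the Brier score also penalizes poorly calibrated probabilities, which is why the two kinds of metric are reported side by side.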

Results: A total of 6495 eligible patients undergoing cardiopulmonary bypass (CPB) were included in this study: 2416 from Fuwai Hospital (Beijing) for model development, and 562 from three other cardiac centers in China and 3517 from the MIMIC-IV dataset for external validation. The CatBoostClassifier algorithm outperformed the other models, with excellent discrimination and calibration performance on the development and MIMIC-IV datasets. The CatBoostClassifier achieved ROC-AUCs of 0.85, 0.67, and 0.77 and Brier scores of 0.14, 0.19, and 0.16 in the testing, external, and MIMIC-IV datasets, respectively. Moreover, the most important risk factor, N-terminal pro-brain natriuretic peptide (NT-proBNP), was confirmed by the LASSO method in the feature selection process. Notably, the SHAP explainer identified that preoperative blood urea nitrogen level, prothrombin time, serum creatinine level, total bilirubin level, and age were positively associated with CSA-AKI, whereas preoperative platelet level, systolic and diastolic blood pressure, albumin level, and body weight were negatively associated with CSA-AKI.
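To make the direction of the SHAP associations concrete, the sketch below computes exact Shapley values for a deliberately tiny, hypothetical risk score by averaging each feature's marginal contribution over all feature orderings. Everything in it is an assumption for illustration: the `toy_score` function, its coefficients, the three features chosen, and the baseline and patient values are invented and are not taken from the study's model or data. The SHAP library approximates this same quantity efficiently for real models.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values: the average marginal contribution of each
    feature over every possible ordering of the features."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)          # start with all features at baseline
        prev = predict(current)
        for name in order:
            current[name] = x[name]       # switch this feature to the patient's value
            new = predict(current)
            totals[name] += new - prev    # marginal contribution in this ordering
            prev = new
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(f):
    """Hypothetical linear risk score; coefficients are invented.
    Age and blood urea nitrogen (bun) raise the score, platelets lower it."""
    return 0.03 * f["age"] + 0.05 * f["bun"] - 0.004 * f["platelets"]


# Invented reference patient and invented patient of interest.
baseline = {"age": 60, "bun": 6.0, "platelets": 200}
patient = {"age": 70, "bun": 9.0, "platelets": 150}

phi = shapley_values(toy_score, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

In this toy setup, a feature with a negative association (platelets) still yields a positive contribution when the patient's value is below the baseline, which is how a "negatively associated" factor can push an individual prediction upward. The contributions also sum to the difference between the patient's score and the baseline score.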

Conclusions: The CatBoostClassifier algorithm outperformed the other ML models in discrimination and calibration for predicting CSA-AKI after cardiac surgery with CPB, based on a multicenter RCT and the MIMIC-IV dataset. Moreover, the preoperative NT-proBNP level was confirmed to be strongly related to CSA-AKI.

Keywords: acute kidney injury; cardiac surgery; external validation; logistic regression; machine learning.

Grants and funding

This work was supported by the National Natural Science Foundation of China (no. 81970290) and the Clinical Research Foundation of Fuwai Hospital (no. 2016-ZX09). This study was also supported by the Research Projects on Prevention and Control of Major Chronic Noninfectious Diseases, National Key Research and Development Program (no. 2016YFC1302000).