Machine learning-assisted decision-support models to better predict patients with calculous pyonephrosis

Transl Androl Urol. 2021 Feb;10(2):710-723. doi: 10.21037/tau-20-1208.

Abstract

Background: To develop a machine learning (ML)-assisted model that integrates multiple clinical characteristics to accurately identify patients with calculous pyonephrosis before treatment decisions are made.

Methods: We retrospectively collected data from patients with obstructed hydronephrosis who underwent retrograde ureteral stent insertion, percutaneous nephrostomy (PCN), or percutaneous nephrolithotomy (PCNL). The study cohort was divided into training and testing datasets in a 70:30 ratio. We developed 5 ML-assisted models from 22 clinical features using logistic regression (LR), LR optimized by least absolute shrinkage and selection operator (Lasso) regularization (Lasso-LR), support vector machine (SVM), extreme gradient boosting (XGBoost), and random forest (RF). The area under the receiver operating characteristic curve (AUC) was used to identify the model with the highest discrimination. Decision curve analysis (DCA) was used to assess the clinical net benefit of using the predictive models.
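The evaluation steps described in the Methods can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`split_indices`, `auc`, `net_benefit`), the random seed, and the toy labels and scores are hypothetical, and the source does not say how the 70:30 split was drawn, so a simple random shuffle is assumed. AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and net benefit uses the standard decision curve formula TP/N - (FP/N) * pt/(1 - pt) at threshold probability pt.

```python
import random

def split_indices(n, train_frac=0.7, seed=0):
    """Randomly split range(n) into train/test index lists (assumed split method)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(train_frac * n)
    return sorted(idx[:cut]), sorted(idx[cut:])

def auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(y_true, scores, threshold):
    """Decision-curve net benefit when treating every case with score >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# A 70:30 split of a 322-patient cohort gives 225 training and 97 testing cases.
train_idx, test_idx = split_indices(322)
print(len(train_idx), len(test_idx))

# Toy labels and predicted probabilities (hypothetical, for illustration only).
y = [1, 1, 0, 0, 1, 0]
p = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
print(round(auc(y, p), 3))
print(round(net_benefit(y, p, 0.5), 3))
```

In practice, a decision curve plots `net_benefit` over a range of thresholds for each model and compares it with the treat-all and treat-none strategies.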

Results: A total of 322 patients were included, with 225 patients in the training dataset and 97 patients in the testing dataset. In the training dataset, the XGBoost model showed good discrimination, with an AUC, accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of 0.981, 0.991, 0.962, 1.000, 1.000, and 0.989, respectively, followed by SVM [AUC = 0.985, 95% confidence interval (CI): 0.970-1.000], Lasso-LR (AUC = 0.977, 95% CI: 0.958-0.996), LR (AUC = 0.936, 95% CI: 0.905-0.968), and RF (AUC = 0.920, 95% CI: 0.870-0.970). In the testing dataset, SVM yielded the highest AUC (0.977, 95% CI: 0.952-1.000), followed by Lasso-LR (AUC = 0.959, 95% CI: 0.921-0.997), XGBoost (AUC = 0.958, 95% CI: 0.902-1.000), LR (AUC = 0.932, 95% CI: 0.878-0.987), and RF (AUC = 0.868, 95% CI: 0.779-0.958).
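The accuracy, sensitivity, specificity, PPV, and NPV reported for XGBoost all derive from a single confusion matrix. The sketch below shows the arithmetic. The counts are hypothetical and are not reported in the source: they were chosen only because they sum to 225 (the size of the training dataset) and reproduce the rounded values above, assuming those values were computed on the training dataset. Other counts fit equally well, for example 51 true positives and 172 true negatives. The function name `confusion_metrics` is likewise illustrative.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts (NOT from the source): 52 positives, 173 negatives, N = 225.
metrics = confusion_metrics(tp=50, fp=0, tn=173, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.991
# sensitivity: 0.962
# specificity: 1.000
# ppv: 1.000
# npv: 0.989
```

With no false positives, specificity and PPV are both exactly 1, which is why those two values coincide in the reported results.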

Conclusions: Our ML-based models showed good discrimination in identifying patients with obstructed hydronephrosis who are at high risk of harboring pyonephrosis, and their use may benefit urologists in treatment planning, patient selection, and decision-making.

Keywords: Calculous pyonephrosis; hydronephrosis; machine learning (ML).