ML-CKDP: Machine learning-based chronic kidney disease prediction with smart web application

J Pathol Inform. 2024 Feb 22:15:100371. doi: 10.1016/j.jpi.2024.100371. eCollection 2024 Dec.

Abstract

Chronic kidney diseases (CKDs) are a significant public health issue that can lead to severe complications such as hypertension, anemia, and renal failure. Timely diagnosis is crucial for effective management, and machine learning offers promising advances in predictive diagnostics. In this paper, we develop a machine learning-based chronic kidney disease prediction (ML-CKDP) model with two objectives: to improve dataset preprocessing for CKD classification and to provide a web-based application for CKD prediction. The proposed model uses a data preprocessing protocol that converts categorical variables to numerical values, imputes missing data, and normalizes features via Min-Max scaling. To refine the datasets, feature selection is performed with several techniques: Correlation, Chi-Square, Variance Threshold, Recursive Feature Elimination, Sequential Forward Selection, Lasso Regression, and Ridge Regression. The model employs seven classifiers to predict CKDs: Random Forest (RF), AdaBoost (AdaB), Gradient Boosting (GB), XGBoost (XgB), Naive Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT). The effectiveness of the models is assessed by measuring their accuracy, analyzing confusion matrix statistics, and calculating the Area Under the Curve (AUC) for the classification of positive cases. RF and AdaB achieve 100% accuracy across the validation schemes tested, including 70:30 and 80:20 train-test splits and K-fold cross-validation with K = 10 and 15. RF and AdaB also consistently reach AUC scores of 100% across multiple datasets and splitting ratios. NB stands out for its efficiency, recording the lowest training and testing times across all datasets and split ratios. Additionally, we present a real-time web-based application that operationalizes the model and makes it more accessible to healthcare practitioners and stakeholders.
Web app link: https://rajib-research-kedney-diseases-prediction.onrender.com/.
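
The three preprocessing steps named in the abstract (categorical-to-numerical conversion, missing-data imputation, and Min-Max scaling) can be illustrated with a minimal standard-library sketch. The abstract does not state which imputation strategy was used, so mean imputation for numeric columns and mode imputation for the yes/no column are assumptions here. The records, column names, and function names are hypothetical and are not taken from the paper's dataset or code.

```python
from statistics import mean, mode

# Hypothetical toy records; None marks a missing value.
RECORDS = [
    {"age": 48.0, "hemoglobin": 11.2, "hypertension": "yes"},
    {"age": 62.0, "hemoglobin": None, "hypertension": "yes"},
    {"age": None, "hemoglobin": 15.4, "hypertension": "no"},
    {"age": 30.0, "hemoglobin": 13.0, "hypertension": None},
]

YES_NO = {"yes": 1.0, "no": 0.0}


def encode(column):
    """Map yes/no strings to 1.0/0.0, leaving missing values as None."""
    return [None if v is None else YES_NO[v] for v in column]


def impute(column, strategy):
    """Fill None entries with the mean or the mode of the observed values.

    Assumption: the paper's actual imputation strategy is not given
    in the abstract; mean/mode is used here only for illustration.
    """
    observed = [v for v in column if v is not None]
    fill = mean(observed) if strategy == "mean" else mode(observed)
    return [fill if v is None else v for v in column]


def min_max(column):
    """Rescale a column to [0, 1] as (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def preprocess(records):
    """Return one [age, hemoglobin, hypertension] row per record."""
    age = min_max(impute([r["age"] for r in records], "mean"))
    hemoglobin = min_max(impute([r["hemoglobin"] for r in records], "mean"))
    hypertension = min_max(
        impute(encode([r["hypertension"] for r in records]), "mode")
    )
    return [list(row) for row in zip(age, hemoglobin, hypertension)]


if __name__ == "__main__":
    for row in preprocess(RECORDS):
        print([round(v, 3) for v in row])
```

After this step every feature lies in [0, 1], so features measured on different scales (for example age in years and hemoglobin in g/dL) contribute comparably to scale-sensitive classifiers such as SVM.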

Keywords: Chronic kidney diseases; Classification; Feature selection; Machine learning.