An Empirical Model to Predict the Diabetic Positive Using Stacked Ensemble Approach

Sivashankari R; Sudha M; Mohammad Kamrul Hasan; Rashid A Saeed; Suliman A Alsuhibany; Sayed Abdel-Khalek

doi:10.3389/fpubh.2021.792124

Front Public Health. 2022 Jan 21;9:792124. doi: 10.3389/fpubh.2021.792124. eCollection 2021.

Authors

Sivashankari R¹, Sudha M¹, Mohammad Kamrul Hasan², Rashid A Saeed³, Suliman A Alsuhibany⁴, Sayed Abdel-Khalek⁵ ⁶

Affiliations

¹ School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore, India.
² Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia.
³ Department of Computer Engineering, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia.
⁴ Department of Computer Science, College of Computer, Qassim University, Buraydah, Saudi Arabia.
⁵ Mathematics and Statistics Department, College of Science, Taif University, Taif, Saudi Arabia.
⁶ Mathematics Department, Sohag University, Sohag, Egypt.

Abstract

Automated disease detection is now widespread in healthcare systems. Diabetes is a significant health problem that has spread all over the world. It is a genetic disease that affects people throughout their lifespan. Every year the number of people with diabetes rises by millions, and children are affected too. Until now, identifying the disease has involved manual checking, and automation is a current trend in the medical field. Existing methods use a single algorithm to predict diabetes. For complex problems, a single model is not enough because it may not suit the input data or the parameters used in the approach. To solve complex problems, multiple algorithms are combined, following either a homogeneous or a heterogeneous model. A homogeneous model uses the same algorithm multiple times, whereas a heterogeneous model uses different algorithms. This paper adopts a heterogeneous ensemble model, the stacked ensemble model, to predict whether a person is diabetes-positive or diabetes-negative. The stacked ensemble model proves advantageous for this prediction task. Compared with existing models such as logistic regression (72%), Naïve Bayes (74.4%), and LDA (81%), the proposed stacked ensemble model achieved 93.1% accuracy in predicting diabetes.

Keywords: KNN classifier; PIMA dataset; SVM and Gaussian Naïve Bayes; decision tree; gradient boosting; healthcare systems; random forest.
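
The heterogeneous stacking idea described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses the base learners named in the keywords (KNN, SVM, Gaussian Naïve Bayes, decision tree, random forest, gradient boosting). The logistic regression meta-learner, the hyperparameters, the feature scaling, and the synthetic stand-in for the PIMA dataset are all assumptions made for illustration; they are not taken from the paper, and the sketch is not expected to reproduce the reported accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data with the same shape as the PIMA dataset
# (768 rows, 8 features, binary label). It is NOT the real data.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Heterogeneous level-0 learners: each one is a different algorithm.
# KNN and SVM are distance-based, so they are wrapped with a scaler.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("gnb", GaussianNB()),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# Level-1 meta-learner (assumed): logistic regression trained on the
# cross-validated predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

By contrast, the random forest used here as one base learner is itself a homogeneous ensemble, since it combines many instances of the same algorithm (decision trees).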

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Bayes Theorem
Child
Diabetes Mellitus*
Humans