Cardiovascular and Diabetes Diseases Classification Using Ensemble Stacking Classifiers with SVM as a Meta Classifier

Diagnostics (Basel). 2022 Oct 26;12(11):2595. doi: 10.3390/diagnostics12112595.

Abstract

Cardiovascular disease includes coronary artery diseases (CAD), which include angina and myocardial infarction (commonly known as a heart attack), and coronary heart diseases (CHD), which are marked by the buildup of a waxy material called plaque inside the coronary arteries. Heart attacks remain the leading cause of death worldwide and, if not treated properly, can lead to major health problems such as diabetes. If ignored, diabetes can in turn result in a variety of health problems, including heart disease, stroke, blindness, and kidney failure. Machine learning methods can be used to identify and diagnose diabetes and other illnesses, and both diabetes and cardiovascular disease can be diagnosed using several types of classifier. Naive Bayes, K-nearest neighbor (KNN), linear regression, decision trees (DT), and support vector machines (SVM) have been among the classifiers employed, although all of these models had poor accuracy. Because prior work is limited and its accuracy is poor, new research is required to diagnose diabetes and cardiovascular disease. This study developed an ensemble approach called "Stacking Classifier" to improve on the performance of the individual classifiers it combines and to decrease the likelihood of misclassifying a single instance. Naive Bayes, KNN, Linear Discriminant Analysis (LDA), and Decision Tree (DT) are among the classifiers used in this study, and Random Forest and SVM are used as meta-classifiers. For diagnosing diabetes, the suggested stacking classifier obtains an accuracy of 0.9735, compared with 0.7646, 0.7460, 0.7857, and 0.7735 for the current models Naive Bayes, KNN, DT, and LDA, respectively. For cardiovascular disease, the suggested stacking classifier obtains an accuracy of 0.8871, compared with 0.8377, 0.8256, 0.8426, 0.8523, and 0.8472 for the current models KNN, NB, DT, LDA, and SVM, respectively.
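
The stacking arrangement described in the abstract (several base classifiers whose predictions feed a meta-classifier) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn's StackingClassifier, a synthetic dataset from make_classification standing in for the clinical data, an SVM as the only meta-classifier, and illustrative hyperparameters (n_neighbors=5, max_depth=5, an RBF kernel, 5-fold cross-validation). None of these settings are stated in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary-label data stands in for the real
# diabetes / cardiovascular datasets, which are not part of the abstract.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base (level-0) classifiers named in the abstract; hyperparameters are
# illustrative guesses, not values from the study.
base_learners = [
    ("nb", GaussianNB()),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("lda", LinearDiscriminantAnalysis()),
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
]

# Meta (level-1) classifier: an SVM trained on the cross-validated
# predictions of the base classifiers.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=SVC(kernel="rbf"),
    cv=5,
)
stack.fit(X_train, y_train)

# score() returns accuracy as a fraction between 0 and 1.
accuracy = stack.score(X_test, y_test)
print(f"stacking accuracy: {accuracy:.4f}")
```

Because the data here are synthetic, the printed accuracy says nothing about the figures reported in the abstract. The sketch only shows how the base classifiers and the meta-classifier are wired together.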

Keywords: KNN; Naive Bayes; cardiovascular disease; coronary heart diseases; decision tree; diabetes disease; meta-classifier; stacking classifier.

Grants and funding

The research work was partially sponsored and supported by the Faculty of Computing and Informatics, Multimedia University Malaysia.