EAGA-MLP: An Enhanced and Adaptive Hybrid Classification Model for Diabetes Diagnosis

Sensors (Basel). 2020 Jul 20;20(14):4036. doi: 10.3390/s20144036.

Abstract

Disease diagnosis is a critical task that must be performed with high precision. Medical data mining has recently gained popularity for complex healthcare problems based on disease datasets. Unstructured healthcare data often contains irrelevant information that can degrade the predictive ability of classifiers. Therefore, an effective attribute optimization technique is needed to eliminate less relevant data and optimize the dataset for enhanced accuracy. Type 2 diabetes, the condition recorded in the Pima Indian Diabetes dataset, affects millions of people around the world. Optimization techniques can be applied to generate a reliable dataset consisting of symptoms that are useful for a more accurate diagnosis of diabetes. This study presents the implementation of a new hybrid attribute optimization algorithm, the Enhanced and Adaptive Genetic Algorithm (EAGA), to obtain an optimized symptoms dataset. Based on the symptom readings in the optimized dataset, a possible occurrence of diabetes is forecasted. The EAGA model is then combined with a Multilayer Perceptron (MLP) to determine the presence or absence of type 2 diabetes in patients based on the symptoms detected. The proposed classification approach is named Enhanced and Adaptive-Genetic Algorithm-Multilayer Perceptron (EAGA-MLP). The performance of the proposed model was validated against key performance metrics. The results show a maximum accuracy of 97.76% and an execution time of 1.12 s. Furthermore, the proposed model achieves an F-score of 86.8% and a precision of 80.2%. The method was compared with many existing studies, and the classification accuracy of the proposed EAGA-MLP model was observed to exceed that of the previous classification models. To assess its impact and effectiveness more broadly, the model was also tested on seven other disease datasets, where the mean accuracy, precision, recall and F-score obtained were 94.7%, 91%, 89.8% and 90.4%, respectively. Thus, the proposed model can support healthcare experts in accurately determining the risk factors of type 2 diabetes and in classifying its presence in patients.
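To make the attribute optimization step more concrete, the following is a minimal sketch of a plain genetic algorithm that searches over attribute masks. It is not the EAGA procedure itself: the abstract does not specify the enhanced or adaptive operators, so the tournament selection, one-point crossover, bit-flip mutation, and elitism used here are assumptions, as are the names select_attributes and toy_fitness. In a wrapper setup like the one described, the fitness of a mask would presumably be the validation accuracy of an MLP trained on the kept attributes; the toy fitness below is a hypothetical stand-in so that the example runs on its own.

```python
import random
from typing import Callable, Tuple

Mask = Tuple[bool, ...]  # True = keep the attribute at that position


def select_attributes(
    n_attributes: int,
    fitness: Callable[[Mask], float],
    population_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Plain generational GA over attribute masks (an assumption, not EAGA)."""
    if n_attributes < 2 or population_size < 2:
        raise ValueError("need at least 2 attributes and 2 individuals")
    rng = random.Random(seed)

    def random_mask() -> Mask:
        return tuple(rng.random() < 0.5 for _ in range(n_attributes))

    def tournament(scored) -> Mask:
        # Pick two distinct individuals and return the fitter one's mask.
        a, b = rng.sample(scored, 2)
        return max(a, b)[1]

    def crossover(p1: Mask, p2: Mask) -> Mask:
        cut = rng.randrange(1, n_attributes)  # one-point crossover
        return p1[:cut] + p2[cut:]

    def mutate(mask: Mask) -> Mask:
        # Flip each bit independently with probability mutation_rate.
        return tuple((not bit) if rng.random() < mutation_rate else bit
                     for bit in mask)

    population = [random_mask() for _ in range(population_size)]
    best_score, best_mask = float("-inf"), population[0]
    for generation in range(generations + 1):
        scored = [(fitness(m), m) for m in population]
        top_score, top_mask = max(scored)
        if top_score > best_score:
            best_score, best_mask = top_score, top_mask
        if generation == generations:
            break  # final population has been scored; stop breeding
        children = [best_mask]  # elitism: keep the best mask seen so far
        while len(children) < population_size:
            child = crossover(tournament(scored), tournament(scored))
            children.append(mutate(child))
        population = children
    return best_mask, best_score


# Hypothetical stand-in for "MLP validation accuracy on the kept attributes":
# attributes 0, 2 and 5 are treated as informative, and every kept attribute
# costs a small penalty, so smaller informative subsets score higher.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask: Mask) -> float:
    kept = [i for i, keep in enumerate(mask) if keep]
    hits = sum(1 for i in kept if i in INFORMATIVE)
    return hits - 0.1 * len(kept)


best_mask, best_score = select_attributes(8, toy_fitness)
print("kept attributes:", [i for i, keep in enumerate(best_mask) if keep])
print("fitness:", round(best_score, 3))
```

Passing the fitness as a callable keeps the search loop independent of the classifier, so the toy function can be replaced by one that trains and scores an MLP on the masked dataset without changing the GA code.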

Keywords: F-Score; attribute optimization; classification; classification accuracy; diabetes; fitness function; genetic algorithm; mutation.

MeSH terms

  • Algorithms*
  • Data Mining
  • Diabetes Mellitus, Type 2* / diagnosis
  • Humans
  • Neural Networks, Computer*