A novel optimized initial cluster center and enhanced objective function: Medical diagnosis through classification

Health Informatics J. 2020 Mar;26(1):539-562. doi: 10.1177/1460458219839629. Epub 2019 Apr 11.

Abstract

Medical diagnosis through classification is often critical because medical datasets are multilabel in nature; that is, a patient may have more than one health condition, such as high blood pressure, obesity, and diabetes. The aim of this article is to improve the accuracy and performance of multilabel classification using multilabel feature selection and an improved overlapping clustering method. The proposed system uses an Optimized Initial Cluster Centers and Enhanced Objective Function technique, which reduces the number of iterations in the clustering process, thereby improving clustering performance, and improves clustering accuracy; together these improve the accuracy and performance of multilabel classification. Ratios of clustering distance to class distance and execution time are used as the evaluation metrics for accuracy, and total execution time is used as the evaluation metric for performance. Different combinations of the number of labels, attributes, instances, and clusters yield different values of accuracy and performance. The results on all 10 datasets show that the proposed technique is superior to the current technique: on average, it improves classification accuracy by 5%-7% and decreases processing time by 0.5-1 s. The proposed system, which consists of multilabel feature selection and an enhanced overlapping clustering technique, targets the accuracy and performance of multilabel classification for medical diagnosis. This study provides an acceptable range of accuracy with improved processing time, which assists doctors in the medical diagnosis of patients (high blood pressure, obesity, and diabetes).
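
To make the clustering idea concrete, the following is a minimal sketch of standard fuzzy C-means (named in the keywords) with a deterministic seeding step. It is not the authors' Optimized Initial Cluster Centers or Enhanced Objective Function, whose details the abstract does not give. The farthest-point seeding, the function names, the fuzzifier m = 2, the tolerance, and the toy two-dimensional data are all assumptions chosen for illustration.

```python
import math
import random


def farthest_point_centers(points, k):
    """Deterministic, spread-out seeding (an illustrative assumption,
    not the initialization proposed in the paper)."""
    dim = len(points[0])
    mean = tuple(sum(p[j] for p in points) / len(points) for j in range(dim))
    # Start from the data point closest to the overall mean.
    centers = [min(points, key=lambda p: math.dist(p, mean))]
    # Repeatedly add the point farthest from all centers chosen so far.
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def fuzzy_c_means(points, centers, m=2.0, tol=1e-6, max_iter=300):
    """Standard fuzzy C-means. Returns (centers, memberships, iterations)."""
    dim = len(points[0])
    centers = [tuple(c) for c in centers]
    for iteration in range(1, max_iter + 1):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # The point sits exactly on a center: assign it there fully.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([v / total for v in row])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            wsum = sum(weights)
            new_centers.append(tuple(
                sum(w * p[j] for w, p in zip(weights, points)) / wsum
                for j in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centers have stopped moving
            break
    return centers, memberships, iteration


def objective(points, centers, memberships, m=2.0):
    """Classic fuzzy C-means objective: sum of u_ik^m * squared distance."""
    return sum(
        (u ** m) * math.dist(p, c) ** 2
        for p, row in zip(points, memberships)
        for u, c in zip(row, centers)
    )


# Toy data (hypothetical): two Gaussian blobs around (0, 0) and (5, 5).
rng = random.Random(0)
points = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
points += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]

seeds = farthest_point_centers(points, k=2)
centers, memberships, n_iter = fuzzy_c_means(points, seeds)
print("iterations:", n_iter)
print("objective:", objective(points, centers, memberships))
```

Each row of `memberships` holds one point's degree of belonging to every cluster, so thresholding a row (for example, keeping every cluster whose membership exceeds some cutoff) is one simple way to obtain overlapping assignments. The returned iteration count gives a direct way to compare seeding strategies on a given dataset.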

Keywords: feature extraction; fuzzy C-means; knowledge discovery; multilabel classification; overlapping clustering.

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Humans