A supportive attribute-assisted discretization model for medical classification

Biomed Mater Eng. 2014;24(1):289-95. doi: 10.3233/BME-130810.

Abstract

Discretization of a continuous-valued symptom (attribute) in a medical data set is a crucial preprocessing step for the medical classification task. This paper proposes a supportive attribute-assisted discretization (SAAD) model for medical diagnostic problems. The approach identifies the best supportive symptom, that is, the symptom that correlates most closely with the continuous-valued symptom being discretized, and uses the information that this supportive symptom provides to guide the discretization process. The underlying hypothesis is that a good discretization scheme should rely heavily on the interaction between a continuous-valued attribute and both its supportive attribute and the class attribute. SAAD treats each continuous-valued symptom individually, which allows it to minimize information loss and data uncertainty and, in turn, to achieve higher classification accuracy. Empirical experiments on ten real-life datasets from the UCI repository compared the classification accuracy achieved by several well-established classifiers when combined with SAAD and with other state-of-the-art discretization approaches. The experimental results demonstrate the effectiveness and usefulness of the proposed approach in enhancing diagnostic accuracy.

Keywords: Discretization; bioinformatics; data preprocessing; medical classification; supportive attribute interdependence.
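
The idea described in the abstract can be illustrated with a minimal sketch. This is not the SAAD algorithm itself, whose details the abstract does not give. It makes two loudly stated assumptions: the supportive attribute is taken to be the candidate with the highest absolute Pearson correlation to the continuous attribute, and a single cut point is chosen to minimize the weighted entropy of the joint (supportive value, class) label. All names (`best_supportive`, `best_cut`, `glucose`, `thirst`, `cough`) and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_supportive(target, candidates):
    """ASSUMPTION: 'best supportive attribute' = highest |Pearson r| with target."""
    return max(candidates, key=lambda name: abs(pearson(target, candidates[name])))

def best_cut(values, support, classes):
    """ASSUMPTION: pick one cut point minimizing the weighted entropy of the
    joint (support, class) label on both sides of the cut."""
    joint = list(zip(support, classes))
    pairs = sorted(zip(values, joint), key=lambda p: p[0])
    n = len(pairs)
    best_score, best_point = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_score = score
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point

# Hypothetical toy data: one continuous symptom, two coded candidate symptoms.
glucose = [85, 90, 95, 100, 140, 150, 160, 170]
candidates = {
    "thirst": [0, 0, 0, 0, 1, 1, 1, 1],
    "cough": [0, 1, 0, 1, 0, 1, 0, 1],
}
diagnosis = ["neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos"]

support_name = best_supportive(glucose, candidates)
cut = best_cut(glucose, candidates[support_name], diagnosis)
print(support_name, cut)  # thirst 120.0
```

On this toy data, a cut driven by the class label alone would fall at 97.5, whereas the joint criterion places it at 120.0. This shows how a supportive attribute can shift the discretization boundary, which is the kind of interaction the abstract hypothesizes to be useful.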

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology / methods*
  • Data Mining
  • Databases, Factual
  • Diagnosis, Computer-Assisted
  • Disease / classification*
  • Humans
  • Models, Theoretical
  • Reproducibility of Results
  • Software*