Comparative analysis of classification algorithms on the breast cancer recurrence using machine learning

Med Biol Eng Comput. 2022 Sep;60(9):2589-2600. doi: 10.1007/s11517-022-02623-y. Epub 2022 Jul 4.

Abstract

This paper presents a comparative evaluation of classification algorithms using the Waikato Environment for Knowledge Analysis (WEKA) software. The main goal is to conduct a comprehensive comparison and determine which predictive modelling technique is best suited to classifying breast cancer recurrence. The dataset for this study consists of 286 instances (201 instances belong to the recurrence class and 85 instances belong to the non-recurrence class) and 10 attributes. The comparison covers Naïve Bayes, J48, K*, Random Forest, Multilayer Perceptron (MLP) and Support Vector Machine (SVM) models under different parameter settings. The performance of the developed models is evaluated using the following metrics: accuracy, precision, sensitivity, specificity, mean absolute error, ROC curves and AUC values. The contribution of each attribute to the classification models is assessed by measuring information gain. Results show that the J48 and SVM models give the highest accuracies, 75.5% and 79.6%, respectively. The SVM algorithm also shows the highest sensitivity (99%), while the highest precision (79%) is obtained by the MLP algorithm. In addition, the SVM algorithm has the lowest mean absolute error. Furthermore, the information gain analysis reveals that the degree of tumour malignancy contributes more than the other attributes to the recurrence of breast cancer.
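
As an illustration of the confusion-matrix metrics and the information gain measure named above, the following is a minimal Python sketch. It is not the authors' WEKA workflow: the function names (`binary_metrics`, `entropy`, `information_gain`) are hypothetical, and the counts and attribute values are toy data chosen for readability, not figures from the study.

```python
import math
from collections import Counter


def binary_metrics(tp, fn, fp, tn):
    """Compute common metrics from the four cells of a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        # Precision: share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Sensitivity (recall): share of actual positives that are detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that are correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Class entropy minus the weighted entropy after splitting on a nominal attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy confusion matrix (hypothetical counts, not results from the paper).
metrics = binary_metrics(tp=8, fn=2, fp=4, tn=6)

# Toy attribute that separates the two classes perfectly, and one that does not.
labels = ["yes", "yes", "no", "no"]
gain_informative = information_gain(["a", "a", "b", "b"], labels)
gain_uninformative = information_gain(["a", "b", "a", "b"], labels)
```

Under this definition, an attribute with higher information gain reduces uncertainty about the class label more than the others, which is the sense in which attributes are ranked in the abstract.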

Keywords: Breast cancer; J48; Machine learning; Medical imaging; Multilayer Perceptron.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Breast Neoplasms* / diagnosis
  • Female
  • Humans
  • Machine Learning
  • Neoplasm Recurrence, Local
  • Support Vector Machine