Data mining techniques in breast cancer diagnosis at the cellular-molecular level

J Cancer Res Clin Oncol. 2023 Nov;149(14):12605-12620. doi: 10.1007/s00432-023-05090-6. Epub 2023 Jul 14.

Abstract

Introduction: Machine learning and data mining techniques have shown consistent promise for improving the diagnosis of breast cancer. A new diagnostic method that detects the characteristics of breast cancer at an early stage can support better treatment. This study aims to provide a data-mining-based method for early detection of breast cancer that reduces human error through accurate and rapid screening.

Methodology: The proposed method has three steps. The first step pre-processes the data and improves image quality. The second step uses image segmentation to separate cancer cells from healthy breast tissue and to remove outliers. The third step configures a classification model by combining deep neural networks. This ensemble classification model uses several effective features extracted from the images and decides by majority vote. It can be used as a screening system to diagnose the grade of invasive ductal carcinoma of the breast.
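
The majority-vote step of the ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the networks, the label encoding, or the tie-breaking rule. The function name `majority_vote`, the integer grade labels, the three placeholder prediction lists, and the rule of resolving ties toward the smallest label are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by majority vote.

    predictions: list of lists, one inner list per model, each holding
    one predicted label per sample (hypothetical integer grades here).
    Returns one voted label per sample.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict every sample")

    voted = []
    # zip(*predictions) yields, for each sample, the labels from all models.
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # Assumed tie-break: pick the smallest label among the most voted.
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Placeholder outputs of three hypothetical networks on four images.
model_a = [1, 2, 3, 1]
model_b = [1, 2, 2, 3]
model_c = [2, 2, 3, 3]

ensemble_labels = majority_vote([model_a, model_b, model_c])
print(ensemble_labels)  # [1, 2, 3, 3]
```

Using an odd number of models makes ties impossible for two-class problems; with three or more grades a tie can still occur, so a real system would need an explicit rule such as the one assumed above.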

Results: The method was evaluated on two histopathological microscopic datasets of patients with invasive ductal carcinoma of the breast. By extracting high-level features, the proposed method achieved average accuracies of 92.65% and 93.34% on these two datasets, diagnosing and classifying breast cancer quickly and with high performance.

Conclusion: Combining deep neural networks and extracting the features that affect breast cancer makes diagnosis with high accuracy possible, which is a step toward helping specialists and increasing patients' chances of survival.

Keywords: Breast cancer; Data mining; Deep neural networks; Ensemble classification; Image segmentation.