A machine learning approach on multiscale texture analysis for breast microcalcification diagnosis

BMC Bioinformatics. 2020 Mar 11;21(Suppl 2):91. doi: 10.1186/s12859-020-3358-4.

Abstract

Background: Screening programs use mammography as the primary diagnostic tool for detecting breast cancer at an early stage. The diagnosis of some lesions, such as microcalcifications, remains difficult for radiologists. In this paper, we proposed an automatic binary model for discriminating tissue in digital mammograms, as a support tool for radiologists. In particular, we compared the contribution of different feature selection methods in terms of learning performance and selected features.

Results: For each ROI, we extracted textural features from Haar wavelet decompositions, as well as interest points and corners detected using Speeded-Up Robust Features (SURF) and the Minimum Eigenvalue Algorithm (MinEigenAlg). A Random Forest binary classifier was then trained on a subset of features selected by two different kinds of feature selection techniques, namely filter and embedded methods. We tested the proposed model on 260 ROIs extracted from digital mammograms of the BCDR public database. The best prediction performance for the normal/abnormal and benign/malignant problems reached median AUC values of 98.16% and 92.08%, and accuracies of 97.31% and 88.46%, respectively. These results are comparable to the performance reported in related work.
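
As a rough illustration of the multiscale texture step described above, the following is a minimal sketch of a two-level 2D Haar decomposition with simple per-band statistics, written in pure Python. It is not the authors' implementation: the abstract does not specify which textural features were computed, so the mean and standard deviation used here, the averaging normalization (division by 4), and the function names `haar_level`, `band_statistics`, and `multiscale_features` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def haar_level(image):
    """One level of a 2D Haar decomposition on a grid with even side lengths.

    Uses the averaging convention (each 2x2 block is divided by 4), which is
    an assumption; orthonormal Haar would divide by 2 instead.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even at every level")
    bands = {"approx": [], "horizontal": [], "vertical": [], "diagonal": []}
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4)  # local average
            h_row.append((p + q - s - t) / 4)  # top row minus bottom row
            v_row.append((p - q + s - t) / 4)  # left column minus right column
            d_row.append((p - q - s + t) / 4)  # diagonal difference
        bands["approx"].append(a_row)
        bands["horizontal"].append(h_row)
        bands["vertical"].append(v_row)
        bands["diagonal"].append(d_row)
    return bands


def band_statistics(band):
    """Placeholder texture descriptors (hypothetical choice): mean and std."""
    values = [v for row in band for v in row]
    return {"mean": mean(values), "std": pstdev(values)}


def multiscale_features(image, levels=2):
    """Collect per-band statistics over several decomposition levels."""
    features = {}
    current = image
    for level in range(1, levels + 1):
        bands = haar_level(current)
        for name, band in bands.items():
            for stat, value in band_statistics(band).items():
                features[f"L{level}_{name}_{stat}"] = value
        current = bands["approx"]  # recurse on the approximation band
    return features


if __name__ == "__main__":
    # Toy 4x4 "ROI" with vertical stripes; a real ROI would be a mammogram patch.
    roi = [[0, 2, 0, 2] for _ in range(4)]
    for key, value in multiscale_features(roi, levels=2).items():
        print(key, value)
```

In a pipeline like the one described, a feature vector of this kind would be combined with the SURF and MinEigenAlg descriptors before feature selection and Random Forest training; those steps are omitted here because they depend on details the abstract does not give.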

Conclusions: The best-performing result obtained with the embedded method is more parsimonious than that obtained with the filter method. The SURF and MinEigen algorithms provide strong informative content useful for the characterization of microcalcification clusters.

Keywords: Computer-aided diagnosis; Digital mammograms; Feature selection; Haar wavelet transform; Microcalcifications; Minimum eigenvalue algorithm; Random forest; SURF.

MeSH terms

  • Algorithms
  • Area Under Curve
  • Breast Neoplasms / diagnosis
  • Breast* / diagnostic imaging
  • Calcinosis / diagnosis*
  • Databases, Factual
  • Female
  • Humans
  • Machine Learning*
  • Mammography
  • ROC Curve