Impact of image compression on deep learning-based mammogram classification

Sci Rep. 2021 Apr 12;11(1):7924. doi: 10.1038/s41598-021-86726-w.

Abstract

Image compression is used in several clinical organizations to help address the overhead associated with medical imaging. These methods reduce file size by using a compact representation of the original image. This study aimed to analyze the impact of image compression on the performance of deep learning-based models in classifying mammograms as "malignant" (cases that lead to a cancer diagnosis and treatment) or as "normal" and "benign" (non-malignant cases that do not require immediate medical intervention). In this retrospective study, 9111 unique mammograms (5672 normal, 1686 benign, and 1754 malignant cases) were collected from the National Cancer Center in the Republic of Korea. Image compression was applied to mammograms with compression ratios (CRs) ranging from 15 to 11 K. Convolutional neural networks (CNNs) with three convolutional layers and three fully-connected layers were trained using these images to classify a mammogram as malignant or not malignant across a range of CRs using five-fold cross-validation. Models trained on images with maximum CRs of 5 K had an average area under the receiver operating characteristic curve (AUROC) of 0.87 and area under the precision-recall curve (AUPRC) of 0.75 across the five folds and compression ratios. For images compressed with CRs of 10 K and 11 K, model performance decreased (average 0.79 in AUROC and 0.49 in AUPRC). In saliency maps, which visualize the areas each model treats as significant for prediction, models trained on less compressed (CR ≤ 5 K) images had maps encapsulating a radiologist's label, while models trained on images with higher amounts of compression had maps that missed the ground truth completely.
In addition, base ResNet18 models pre-trained on ImageNet and trained using compressed mammograms did not show performance improvements over our CNN model, with AUROC and AUPRC values ranging from 0.77 to 0.87 and 0.52 to 0.71, respectively, when trained and tested on images with maximum CRs of 5 K. This paper finds that while training models on compressed images increased the robustness of the models when tested on compressed data, moderate image compression did not substantially impact the classification performance of DL-based models.
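
The quantities reported above (compression ratio, AUROC, and AUPRC) can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not name the codec or the evaluation library, so the function names below are hypothetical, the compression ratio is assumed to be the uncompressed size divided by the compressed size, and AUPRC is approximated by average precision, which may differ slightly from the estimator used in the study.

```python
from typing import Sequence


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Assumed definition: uncompressed size / compressed size (15.0 means 15:1)."""
    if original_bytes < 0 or compressed_bytes <= 0:
        raise ValueError("sizes must be non-negative and compressed size positive")
    return original_bytes / compressed_bytes


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random malignant case (label 1)
    scores higher than a random non-malignant case (label 0); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a common step-wise estimate of AUPRC.
    Tied scores are ranked in input order, which is a simplification."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, y in ranked if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(compression_ratio(15_000, 1_000))
    print(auroc(toy_labels, toy_scores))
    print(average_precision(toy_labels, toy_scores))
```

Because malignant cases are the minority class here, AUPRC is more sensitive than AUROC to how well the positives are ranked, which is consistent with the larger drop reported for AUPRC at the highest CRs.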

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Data Compression*
  • Deep Learning*
  • Humans
  • Image Processing, Computer-Assisted*
  • Mammography / classification*
  • Middle Aged
  • Models, Theoretical
  • Neural Networks, Computer
  • ROC Curve