Prediction of Breast Cancer Recurrence Using a Deep Convolutional Neural Network Without Region-of-Interest Labeling

Front Oncol. 2021 Oct 21:11:734015. doi: 10.3389/fonc.2021.734015. eCollection 2021.

Abstract

Purpose: The present study aimed to assign a risk score for breast cancer recurrence based on pathological whole slide images (WSIs) using a deep learning model.

Methods: A total of 233 WSIs from 138 breast cancer patients were assigned either a low-risk or a high-risk score based on a 70-gene signature. These images were processed into patches of 512x512 pixels by the PyHIST tool and underwent color normalization using the Macenko method. Afterward, out of focus and pixelated patches were removed using the Laplacian algorithm. Finally, the remaining patches (n=294,562) were split into 3 parts for model training (50%), validation (7%) and testing (43%). We used 6 pretrained models for transfer learning and evaluated their performance using accuracy, precision, recall, F1 score, confusion matrix, and AUC. Additionally, to demonstrate the robustness of the final model and its generalization capacity, the testing set was used for model evaluation. Finally, the GRAD-CAM algorithm was used for model visualization.

Results: Six models, namely VGG16, ResNet50, ResNet101, Inception_ResNet, EfficientB5, and Xception, achieved high performance in the validation set with an overall accuracy of 0.84, 0.85, 0.83, 0.84, 0.87, and 0.91, respectively. We selected Xception for assessment of the testing set, and this model achieved an overall accuracy of 0.87 with a patch-wise approach and 0.90 and 1.00 with a patient-wise approach for high-risk and low-risk groups, respectively.

Conclusions: Our study demonstrated the feasibility and high performance of artificial intelligence models trained without region-of-interest labeling for predicting cancer recurrence based on a 70-gene signature risk score.

Keywords: 70-gene signature; deep learning; label-free; pathology; transfer learning; whole slide image.