A Deep Learning Pipeline for Grade Groups Classification Using Digitized Prostate Biopsy Specimens

Kamal Hammouda; Fahmi Khalifa; Moumen El-Melegy; Mohamed Ghazal; Hanan E Darwish; Mohamed Abou El-Ghar; Ayman El-Baz

doi:10.3390/s21206708

Sensors (Basel). 2021 Oct 9;21(20):6708. doi: 10.3390/s21206708.

Authors

Kamal Hammouda¹, Fahmi Khalifa¹, Moumen El-Melegy², Mohamed Ghazal³, Hanan E Darwish⁴, Mohamed Abou El-Ghar⁵, Ayman El-Baz¹

Affiliations

¹ BioImaging Laboratory, Bioengineering Department, University of Louisville, Louisville, KY 40292, USA.
² Department of Electrical Engineering, Assiut University, Assiut 71515, Egypt.
³ Electrical and Computer Engineering Department, Abu Dhabi University, Abu Dhabi 59911, United Arab Emirates.
⁴ Mathematics Department, Faculty of Science, Mansoura University, Mansoura 35516, Egypt.
⁵ Radiology Department, Urology and Nephrology Center, Mansoura University, Mansoura 35516, Egypt.

Abstract

Prostate cancer is a significant cause of morbidity and mortality in the USA. In this paper, we develop a computer-aided diagnostic (CAD) system for automated grade group (GG) classification using digitized prostate biopsy specimens (PBSs). Our CAD system first classifies the Gleason pattern (GP) and then identifies the Gleason score (GS) and GG. The GP classification pipeline is based on a pyramidal deep learning system that uses three convolutional neural networks (CNNs) to produce both patch-wise and pixel-wise classifications. The analysis starts with sequential preprocessing steps: histogram equalization to adjust intensity values, followed by edge enhancement of the PBSs. The digitized PBSs are then divided into overlapping patches of three sizes, 100 × 100 (CNNS), 150 × 150 (CNNM), and 200 × 200 (CNNL) pixels, with 75% overlap. These three patch sizes represent the three pyramidal levels. The pyramidal technique lets us extract rich information: larger patches capture more global context, while smaller patches provide local detail. Next, the patch-wise technique assigns each overlapping patch one of the GP categories (1 to 5). Majority voting is then used to obtain the pixel-wise classification, giving a single label to each pixel covered by several overlapping patches. The result is three label images, one per patch size, each the same size as the original, in which every pixel has a single label. We apply majority voting again across those three images to obtain a single final image. The proposed framework is trained, validated, and tested on 608 whole slide images (WSIs) of the digitized PBSs. The overall diagnostic accuracy is evaluated using several metrics: precision, recall, F1-score, and accuracy, together with macro-averaged and weighted-averaged values. CNNL has the best patch classification accuracy among the three CNNs, at 0.76.
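The two-stage majority voting described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`patch_origins`, `pixel_votes`, `majority_label`, `fuse_levels`), the rounding of the stride to a whole pixel, the tie-breaking rule (lowest label wins), and the use of label 0 for pixels that no patch covers are all assumptions made for illustration. Patch labels are assumed to come from the three CNNs; here they are replaced by random integers.

```python
import numpy as np

def patch_origins(length, size, overlap=0.75):
    """Top-left coordinates along one axis for patches with the given overlap.
    Assumption: stride = size * (1 - overlap), rounded to a whole pixel."""
    stride = max(1, int(round(size * (1.0 - overlap))))
    return list(range(0, length - size + 1, stride))

def pixel_votes(patch_labels, image_shape, size, n_classes=5, overlap=0.75):
    """Count, for every pixel, how many covering patches voted for each label.
    patch_labels[i, j] is the label (1..n_classes) of the patch at grid cell (i, j)."""
    h, w = image_shape
    votes = np.zeros((n_classes, h, w), dtype=np.int32)
    for i, y in enumerate(patch_origins(h, size, overlap)):
        for j, x in enumerate(patch_origins(w, size, overlap)):
            votes[patch_labels[i, j] - 1, y:y + size, x:x + size] += 1
    return votes

def majority_label(votes):
    """Most-voted label per pixel. Ties go to the lowest label (assumption);
    pixels with no votes get 0 (assumption)."""
    labels = votes.argmax(axis=0) + 1
    labels[votes.sum(axis=0) == 0] = 0
    return labels

def fuse_levels(label_maps, n_classes=5):
    """Second vote: majority across the per-level label maps, ignoring 0s."""
    stacked = np.stack(label_maps)
    votes = np.stack([(stacked == c).sum(axis=0) for c in range(1, n_classes + 1)])
    return majority_label(votes)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (400, 400)                      # hypothetical image size
    level_maps = []
    for size in (100, 150, 200):            # CNNS, CNNM, CNNL patch sizes
        grid = (len(patch_origins(shape[0], size)), len(patch_origins(shape[1], size)))
        fake_cnn_output = rng.integers(1, 6, size=grid)   # stand-in for CNN predictions
        level_maps.append(majority_label(pixel_votes(fake_cnn_output, shape, size)))
    final = fuse_levels(level_maps)
    print(final.shape)
```

With 75% overlap the stride is a quarter of the patch side, so an interior pixel is typically covered by about 16 patches, which is what makes a per-pixel vote meaningful.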
The macro-averaged and weighted-averaged metrics are around 0.70-0.77. For GG, our CAD system achieves about 80% precision and between 60% and 80% recall and F1-score. Accuracy and NPV are both around 94%. To put our CAD system's results in context, we compared our CNNs' patch-wise classification results with those of the standard ResNet50 and VGG-16, and compared the GG results with those of previous work.
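The abstract does not spell out how the GS and GG are derived from the pixel-wise GP labels. The sketch below shows one plausible reading, offered as an assumption rather than the authors' method: the primary and secondary patterns are taken as the most and second most frequent labels in the final image, and the GG follows the standard ISUP grade group table (GS ≤ 6 is GG 1, 3+4 is GG 2, 4+3 is GG 3, GS 8 is GG 4, GS 9-10 is GG 5). The function names are hypothetical.

```python
from collections import Counter

def gleason_score(pixel_labels):
    """Return (primary, secondary) Gleason patterns from per-pixel GP labels.
    Assumption: primary = most frequent label, secondary = second most frequent;
    if only one pattern is present it is used for both. Label 0 (unlabeled) is ignored."""
    counts = Counter(label for label in pixel_labels if label != 0)
    if not counts:
        raise ValueError("no labeled pixels")
    ranked = [pattern for pattern, _ in counts.most_common(2)]
    primary = ranked[0]
    secondary = ranked[1] if len(ranked) > 1 else primary
    return primary, secondary

def grade_group(primary, secondary):
    """Map a Gleason score (primary + secondary) to an ISUP grade group (1..5)."""
    total = primary + secondary
    if total <= 6:
        return 1
    if total == 7:
        # 3+4 is GG 2 and 4+3 is GG 3; other sums of 7 are not covered by the
        # standard table and fall through to GG 3 here (assumption).
        return 2 if primary == 3 else 3
    if total == 8:
        return 4
    return 5  # Gleason score 9 or 10

if __name__ == "__main__":
    labels = [3, 3, 3, 4, 4, 0]          # toy stand-in for a flattened label image
    primary, secondary = gleason_score(labels)
    print(primary, secondary, grade_group(primary, secondary))
```

For a NumPy label image, `gleason_score(final.ravel().tolist())` would feed the fused pixel labels into this mapping.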

Keywords: CAD system; classification; deep learning; grade groups; prostate cancer.

MeSH terms

Biopsy
Deep Learning*
Humans
Male
Neoplasm Grading
Neural Networks, Computer
Prostate* / diagnostic imaging

Grants and funding

Academy of Scientific Research and Technology/National Program for Research & Innovation in Health and Biomedical Sciences Academy of Scientific Research and Technology