Interpretable deep learning approach for oral cancer classification using guided attention inference network

Kevin Chew Figueroa; Bofan Song; Sumsum Sunny; Shaobai Li; Keerthi Gurushanth; Pramila Mendonca; Nirza Mukhia; Sanjana Patrick; Shubha Gurudath; Subhashini Raghavan; Tsusennaro Imchen; Shirley T Leivon; Trupti Kolur; Vivek Shetty; Vidya Bushan; Rohan Ramesh; Vijay Pillai; Petra Wilder-Smith; Alben Sigamani; Amritha Suresh; Moni Abraham Kuriakose; Praveen Birur; Rongguang Liang

doi:10.1117/1.JBO.27.1.015001

J Biomed Opt. 2022 Jan;27(1):015001. doi: 10.1117/1.JBO.27.1.015001.

Authors

Kevin Chew Figueroa¹, Bofan Song¹, Sumsum Sunny², Shaobai Li¹, Keerthi Gurushanth³, Pramila Mendonca⁴, Nirza Mukhia³, Sanjana Patrick⁵, Shubha Gurudath³, Subhashini Raghavan³, Tsusennaro Imchen⁶, Shirley T Leivon⁶, Trupti Kolur⁴, Vivek Shetty⁴, Vidya Bushan⁴, Rohan Ramesh⁶, Vijay Pillai⁴, Petra Wilder-Smith⁷, Alben Sigamani⁴, Amritha Suresh²,⁴, Moni Abraham Kuriakose²,⁴,⁸, Praveen Birur³,⁵, Rongguang Liang¹

Affiliations

¹ The University of Arizona, Wyant College of Optical Sciences, Tucson, Arizona, United States.
² Mazumdar Shaw Medical Centre, Bangalore, Karnataka, India.
³ KLE Society Institute of Dental Sciences, Bangalore, Karnataka, India.
⁴ Mazumdar Shaw Medical Foundation, Bangalore, Karnataka, India.
⁵ Biocon Foundation, Bangalore, Karnataka, India.
⁶ Christian Institute of Health Sciences and Research, Dimapur, Nagaland, India.
⁷ University of California, Irvine, Beckman Laser Institute & Medical Clinic, Irvine, California, United States.
⁸ Cochin Cancer Research Center, Kochi, Kerala, India.

Abstract

Significance: Convolutional neural networks (CNNs) show potential for automated classification of different cancer lesions. However, their lack of interpretability and explainability makes their predictions difficult to understand. Furthermore, because a CNN has no incentive to focus solely on the object to be recognized, its attention may fall on areas surrounding the salient object rather than on the object itself. This limits the reliability of CNNs, especially in biomedical applications.

Aim: Develop a deep learning training approach that makes the network's predictions understandable and directly guides the network to concentrate its attention on, and accurately delineate, the cancerous regions of the image.

Approach: We used Selvaraju et al.'s gradient-weighted class activation mapping to add interpretability and explainability to CNNs. We adopted a two-stage training process with data augmentation techniques and Li et al.'s guided attention inference network (GAIN) to train on images captured using our customized mobile oral screening devices. The GAIN architecture consists of three training streams: a classification stream, an attention mining stream, and a bounding box stream. By adopting the GAIN training architecture, we jointly optimized the classification and segmentation accuracy of our CNN, treating the network's attention maps as reliable priors in order to develop attention maps with more complete and accurate segmentation.
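
The following is a minimal NumPy sketch of the two ideas named in the Approach, not the authors' implementation. It assumes the feature maps and their gradients have already been extracted from a trained CNN, and that the attention map has been resized to the image and scaled to [0, 1]. All function names, the threshold `sigma`, the sharpness `omega`, and the loss weights `alpha` and `omega_e` are illustrative assumptions. The first function computes a gradient-weighted class activation map; the others show how a GAIN-style objective could combine a classification loss, an attention mining loss, and a bounding box supervision term.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Gradient-weighted class activation map for one conv layer.

    activations, gradients: arrays of shape (K, H, W); gradients holds
    d(class score) / d(activations) for the class of interest.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    return np.maximum(np.tensordot(weights, activations, axes=1), 0.0)

def soft_mask(cam, sigma=0.5, omega=10.0):
    """Smooth threshold T(A) = 1 / (1 + exp(-omega * (A - sigma)))."""
    return 1.0 / (1.0 + np.exp(-omega * (cam - sigma)))

def mask_image(image, cam, sigma=0.5, omega=10.0):
    """Erase the attended region: I* = I - T(A) * I.

    image: (H, W, C); cam: (H, W), assumed already resized to the image.
    """
    return image * (1.0 - soft_mask(cam, sigma, omega))[..., None]

def gain_loss(cls_loss, masked_score, cam, box_mask, alpha=1.0, omega_e=10.0):
    """Combine the three streams into one scalar (weights are assumptions).

    cls_loss:     classification loss on the original image.
    masked_score: class score on the masked image; the attention mining
                  stream pushes this toward zero.
    box_mask:     (H, W) binary mask built from the annotated bounding box.
    """
    attention_mining_loss = masked_score
    box_loss = np.mean((cam - box_mask) ** 2)
    return cls_loss + alpha * attention_mining_loss + omega_e * box_loss
```

In an actual training loop these operations would be written in an automatic differentiation framework so that gradients flow through the attention map; the NumPy version only illustrates the arithmetic.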

Results: The network's attention map helps us understand what the network is focusing on during its decision-making process. The results also show that the proposed method can guide the trained neural network to focus its attention on the correct lesion areas in the images when making a decision, rather than on relevant yet incorrect regions.

Conclusions: We demonstrate the effectiveness of our approach for more interpretable and reliable classification of oral potentially malignant lesions and malignant lesions.

Keywords: guided attention inference network; interpretable deep learning; oral cancer.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Attention
Deep Learning*
Humans
Mouth Neoplasms* / diagnostic imaging
Neural Networks, Computer
Reproducibility of Results
