Multi-label recognition of cancer-related lesions with clinical priors on white-light endoscopy

Tao Yu; Ne Lin; Xingwei Zhong; Xiaoyan Zhang; Xinsen Zhang; Yihe Chen; Jiquan Liu; Weiling Hu; Huilong Duan; Jianmin Si

doi:10.1016/j.compbiomed.2022.105255

Comput Biol Med. 2022 Apr:143:105255. doi: 10.1016/j.compbiomed.2022.105255. Epub 2022 Jan 25.

Authors

Tao Yu¹, Ne Lin², Xingwei Zhong², Xiaoyan Zhang¹, Xinsen Zhang¹, Yihe Chen¹, Jiquan Liu³, Weiling Hu⁴, Huilong Duan¹, Jianmin Si⁴

Affiliations

¹ Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China.
² Department of Gastroenterology, Sir Run Run Shaw Hospital, Medical School, Zhejiang University, Hangzhou, China.
³ Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China. Electronic address: liujq@zju.edu.cn.
⁴ Department of Gastroenterology, Sir Run Run Shaw Hospital, Medical School, Zhejiang University, Hangzhou, China; Institute of Gastroenterology, Zhejiang University, Hangzhou, China.

PMID: 35151153
DOI: 10.1016/j.compbiomed.2022.105255

Abstract

Deep learning-based computer-aided diagnosis techniques have shown encouraging performance in identifying and detecting lesions on endoscopy, and have reduced the rates of missed and false detections. However, existing methods have not adequately addressed the interpretability of model results. The problem shows up directly as a significant bias in feature localization: even models with good recognition performance can make severe feature localization errors, particularly for lesions with subtle morphological features, and this hinders clinical deployment. To alleviate this problem, we propose a method that reduces the localization bias in the feature representations of recognition models for cancer-related lesions, which are difficult to label and identify accurately in clinical practice. The optimization is applied during the training phase through a proposed data augmentation method and an auxiliary loss function, both based on clinical priors. The data augmentation method, called partial jigsaw, "breaks" the spatial structure of lesion-independent image blocks and enriches the data feature space, which decouples the model from interfering background features and focuses it on fine-grained lesion features. The annotation-based auxiliary loss function uses class activation maps for sample distribution correction and drives the model's localization representation to converge on the gold-standard annotations. On a dataset constructed from 23 hospitals, recognition precision with our method averaged 92.79%, with an F1-score of 92.61% and an accuracy of 95.56%. We also quantified the quality of the visualization feature maps. Compared with the baseline model, the improved model yielded a significant offset correction in the visualized feature maps: the average visualization-weighted positive coverage improved from 51.85% to 83.76%. The proposed approach does not change the deployment capability or inference speed of the original model and can be incorporated into any state-of-the-art neural network. It also shows potential to provide more accurate localization inference and to assist clinical examination during endoscopy.
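
The partial jigsaw idea described above can be illustrated with a minimal sketch. The abstract does not give the block size, the shuffling rule, or how lesion-independent blocks are selected, so everything below is an assumption: the function name `partial_jigsaw`, the `grid` parameter, and the rule that a block counts as lesion-independent when it contains no annotated lesion pixel are all hypothetical, not the authors' implementation.

```python
import numpy as np

def partial_jigsaw(image, lesion_mask, grid=4, rng=None):
    """Shuffle grid blocks that contain no lesion pixels; keep the rest in place.

    Assumed rule (not from the paper): a block is "lesion-independent"
    when the binary annotation mask is empty inside it.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = lesion_mask.shape
    bh, bw = h // grid, w // grid  # block size; leftover rows/cols stay untouched

    # Top-left corners of blocks that do not overlap the lesion annotation.
    free = [(r * bh, c * bw)
            for r in range(grid) for c in range(grid)
            if not lesion_mask[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].any()]

    out = image.copy()
    perm = rng.permutation(len(free))
    # Copy each free block from a randomly chosen free source block.
    for (dr, dc), src in zip(free, perm):
        sr, sc = free[src]
        out[dr:dr + bh, dc:dc + bw] = image[sr:sr + bh, sc:sc + bw]
    return out
```

The sketch works on `H x W` or `H x W x C` arrays. Blocks touching the lesion keep their position, so only the background layout changes between augmented samples.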
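
The abstract reports a visualization-weighted positive coverage but does not define it. One plausible reading, shown here purely as an assumption, is the share of class activation map (CAM) intensity that falls inside the annotated lesion region. The function name `weighted_positive_coverage` and this formula are hypothetical and may differ from the metric used in the paper.

```python
import numpy as np

def weighted_positive_coverage(cam, lesion_mask, eps=1e-8):
    """Fraction of non-negative CAM intensity lying inside the lesion mask.

    Assumed definition (not from the paper): sum(cam * mask) / sum(cam).
    """
    cam = np.clip(np.asarray(cam, dtype=float), 0.0, None)  # ignore negative activations
    mask = np.asarray(lesion_mask, dtype=bool)
    total = cam.sum()
    if total <= eps:  # an empty map covers nothing
        return 0.0
    return float(cam[mask].sum() / total)
```

Under this reading, a value near 1 means the activation is concentrated on the annotated lesion, while a low value means most of the activation lies on background.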

Keywords: Class activation maps; Computer aided diagnosis; Endoscopy; Multi-label classification; Neural networks.