Multi-Scale Context-Guided Deep Network for Automated Lesion Segmentation With Endoscopy Images of Gastrointestinal Tract

Shuai Wang; Yang Cong; Hancan Zhu; Xianyi Chen; Liangqiong Qu; Huijie Fan; Qiang Zhang; Mingxia Liu

doi:10.1109/JBHI.2020.2997760

IEEE J Biomed Health Inform. 2021 Feb;25(2):514-525. doi: 10.1109/JBHI.2020.2997760. Epub 2021 Feb 5.

PMID: 32750912

Abstract

Accurate lesion segmentation based on endoscopy images is a fundamental task for the automated diagnosis of gastrointestinal tract (GI tract) diseases. Previous studies usually represent endoscopy images with hand-crafted features and treat feature definition and lesion segmentation as two standalone tasks. Because the features and the segmentation models may be poorly matched, these methods often yield sub-optimal performance. Several fully convolutional networks have recently been developed to jointly perform feature learning and model training for GI tract disease diagnosis. However, they generally ignore local spatial details of endoscopy images, as down-sampling operations (e.g., pooling and strided convolution) may cause irreversible loss of spatial information. To address this, we propose a multi-scale context-guided deep network (MCNet) for end-to-end lesion segmentation of endoscopy images of the GI tract, in which both global and local contexts are captured as guidance for model training. Specifically, one global subnetwork is designed to extract the global structure and high-level semantic context of each input image. We then design two cascaded local subnetworks that operate on the output feature maps of the global subnetwork, aiming to capture both local appearance information and relatively high-level semantic information in a multi-scale manner. The feature maps learned by the three subnetworks are then fused for the subsequent lesion segmentation task. We have evaluated the proposed MCNet on 1,310 endoscopy images from the public EndoVis-Ab and CVC-ClinicDB datasets for abnormality segmentation and polyp segmentation, respectively. Experimental results demonstrate that MCNet achieves [Formula: see text] and [Formula: see text] mean intersection over union (mIoU) on the two datasets, respectively, outperforming several state-of-the-art approaches in automated lesion segmentation with endoscopy images of the GI tract.
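
The abstract reports results as mean intersection over union (mIoU) but does not spell out the averaging protocol. As a minimal sketch, assuming binary lesion masks and assuming that mIoU is the average of the lesion-class and background-class IoU for one image, the metric could be computed as below. The function names `iou` and `mean_iou` and the toy masks are illustrative and are not taken from the paper.

```python
def iou(pred, target):
    """IoU between two binary masks given as nested lists of 0/1 values."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def invert(mask):
    """Swap foreground and background in a binary mask."""
    return [[1 - v for v in row] for row in mask]


def mean_iou(pred, target):
    """Average of lesion IoU and background IoU (assumed definition of mIoU)."""
    lesion = iou(pred, target)
    background = iou(invert(pred), invert(target))
    return (lesion + background) / 2


# Toy 2x3 masks, purely illustrative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(iou(pred, target))       # lesion IoU: 2 shared pixels / 4 in the union
print(mean_iou(pred, target))  # average over the two classes
```

Other conventions exist, such as averaging the lesion IoU over all test images, so the numbers produced by this sketch are not necessarily comparable to those reported for MCNet.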

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Endoscopy
Gastrointestinal Tract / diagnostic imaging
Humans
Image Processing, Computer-Assisted*
Neural Networks, Computer*