A multiple-channel and atrous convolution network for ultrasound image segmentation

Lun Zhang; Junhua Zhang; Zonggui Li; Yingchao Song

doi:10.1002/mp.14512

A multiple-channel and atrous convolution network for ultrasound image segmentation

Med Phys. 2020 Dec;47(12):6270-6285. doi: 10.1002/mp.14512. Epub 2020 Oct 18.

Authors

Lun Zhang^{1

2}, Junhua Zhang¹, Zonggui Li¹, Yingchao Song¹

Affiliations

¹ School of Information Science and Engineering, Yunnan University, Kunming, Yunnan, 650091, China.
² Yunnan Vocational Institute of Energy Technology, Qujing, Yunnan, 655001, China.

PMID: 33007105
DOI: 10.1002/mp.14512

Abstract

Purpose: Ultrasound image segmentation is a challenging task due to a low signal-to-noise ratio and poor image quality. Although several approaches based on the convolutional neural network (CNN) have been applied to ultrasound image segmentation, they have weak generalization ability. We propose an end-to-end, multiple-channel and atrous CNN designed to extract a greater amount of semantic information for segmentation of ultrasound images.

Method: A multiple-channel and atrous convolution network is developed, referred to as MA-Net. Similar to U-Net, MA-Net is based on an encoder-decoder architecture and includes five modules: the encoder, atrous convolution, pyramid pooling, decoder, and residual skip pathway modules. In the encoder module, we aim to capture more information with multiple-channel convolution and use large kernel convolution instead of small filters in each convolution operation. In the last layer, atrous convolution and pyramid pooling are used to extract multi-scale features. The architecture of the decoder is similar to that of the encoder module, except that up-sampling is used instead of down-sampling. Furthermore, the residual skip pathway module connects the subnetworks of the encoder and decoder to optimize learning from the deeper layer and improve the accuracy of segmentation. During the learning process, we adopt multi-task learning to enhance segmentation performance. Five types of datasets are used in our experiments. Because the original training data are limited, we apply data augmentation (e.g., horizontal and vertical flipping, random rotations, and random scaling) to our training data. We use the Dice score, precision, recall, Hausdorff distance (HD), average symmetric surface distance (ASD), and root mean square symmetric surface distance (RMSD) as the metrics for segmentation evaluation. Meanwhile, Friedman test was performed as the nonparametric statistical analysis to evaluate the algorithms.

Results: For the datasets of brachia plexus (BP), fetal head, and lymph node segmentations, MA-Net achieved average Dice scores of 0.776, 0.973, and 0.858, respectively; with average precisions of 0.787, 0.968, and 0.854, respectively; average recalls of 0.788, 0.978, and 0.885, respectively; average HDs (mm) of 13.591, 10.924, and 19.245, respectively; average ASDs (mm) of 4.822, 4.152, and 4.312, respectively; and average RMSDs (mm) of 4.979, 4.161, and 4.930, respectively. Compared with U-Net, U-Net++, M-Net, and Dilated U-Net, the average performance of the MA-Net increased by approximately 5.68%, 2.85%, 6.59%, 36.03%, 23.64%, and 31.71% for Dice, precision, recall, HD, ASD, and RMSD, respectively. Moreover, we verified the generalization of MA-Net segmentation to lower grade brain glioma MRI and lung CT images. In addition, the MA-Net achieved the highest mean rank in the Friedman test.

Conclusion: The proposed MA-Net accurately segments ultrasound images with high generalization, and therefore, it offers a useful tool for diagnostic application in ultrasound images.

Keywords: atrous convolution; multiple-channel convolution; pyramid pooling; ultrasound image.

MeSH terms

Algorithms
Image Processing, Computer-Assisted*
Magnetic Resonance Imaging
Neural Networks, Computer*
Tomography, X-Ray Computed

Abstract

MeSH terms

Grants and funding