Automatic Extraction of Water and Shadow from SAR Images Based on a Multi-Resolution Dense Encoder and Decoder Network

Peng Zhang; Lifu Chen; Zhenhong Li; Jin Xing; Xuemin Xing; Zhihui Yuan

doi:10.3390/s19163576

Sensors (Basel). 2019 Aug 16;19(16):3576. doi: 10.3390/s19163576.

Authors

Peng Zhang^{1,2}, Lifu Chen^{3,4,5}, Zhenhong Li^{6,7}, Jin Xing^{6}, Xuemin Xing^{2,8}, Zhihui Yuan^{1,2}

Affiliations

¹ School of Electrical and Information Engineering, Changsha University of Science & Technology, Changsha 410114, China.
² Laboratory of Radar Remote Sensing Applications, Changsha University of Science & Technology, Changsha 410014, China.
³ School of Electrical and Information Engineering, Changsha University of Science & Technology, Changsha 410114, China. Lifu.Chen@newcastle.ac.uk.
⁴ Laboratory of Radar Remote Sensing Applications, Changsha University of Science & Technology, Changsha 410014, China. Lifu.Chen@newcastle.ac.uk.
⁵ School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK. Lifu.Chen@newcastle.ac.uk.
⁶ School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.
⁷ College of Geological Engineering and Geomatics, Chang'an University, Xi'an 710054, China.
⁸ School of Traffic & Transportation Engineering, Changsha University of Science & Technology, Changsha 410114, China.

Abstract

The water and shadow areas in SAR images contain rich information for various applications, but at present they cannot be extracted automatically and precisely. To handle this problem, a new framework called the Multi-Resolution Dense Encoder and Decoder (MRDED) network is proposed, which integrates Convolutional Neural Network (CNN), Residual Network (ResNet), Dense Convolutional Network (DenseNet), Global Convolutional Network (GCN), and Convolutional Long Short-Term Memory (ConvLSTM). MRDED contains three parts: the Gray Level Gradient Co-occurrence Matrix (GLGCM), the Encoder network, and the Decoder network. GLGCM is used to extract low-level features, which are further processed by the Encoder. The Encoder network employs ResNet to extract features at different resolutions. The Decoder network has two components: Multi-level Features Extraction and Fusion (MFEF) and Score maps Fusion (SF). We implement two versions of MFEF, named MFEF1 and MFEF2, which generate separate score maps. They differ in that MFEF2 uses the Chained Residual Pooling (CRP) module, whereas MFEF1 replaces it with the Improved Chained Residual Pooling (ICRP) module, which is formed by adopting ConvLSTM. The two score maps are fused with different weights to produce the fused score map, which is then passed through the Softmax function to generate the final extraction results for water and shadow areas. To evaluate the proposed framework, MRDED is trained and tested with large SAR images. To further assess the classification performance, eight different classification frameworks are compared with the proposed framework. MRDED outperformed them, reaching 80.12% in Pixel Accuracy (PA) and 73.88% in Intersection over Union (IoU) for water, 88% in PA and 77.11% in IoU for shadow, and 95.16% in PA and 90.49% in IoU for background classification.
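
The score-map fusion and the two reported metrics can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 2x2 score maps, the fusion weights (0.6 and 0.4), and the class indices (0 = background, 1 = water, 2 = shadow) are all hypothetical. The sketch also assumes that per-class PA is the fraction of ground-truth pixels of a class that are labelled correctly, and that IoU is TP / (TP + FP + FN); the abstract does not define either metric.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(map_a, map_b, weight_a=0.5, weight_b=0.5):
    """Fuse two score maps pixel by pixel and return a label map.

    map_a, map_b: lists of rows; each pixel is a list of per-class scores.
    weight_a, weight_b: hypothetical fusion weights (not from the abstract).
    """
    labels = []
    for row_a, row_b in zip(map_a, map_b):
        label_row = []
        for px_a, px_b in zip(row_a, row_b):
            # Weighted sum of the two score vectors, then Softmax.
            fused = [weight_a * a + weight_b * b for a, b in zip(px_a, px_b)]
            probs = softmax(fused)
            # The most probable class becomes the pixel label.
            label_row.append(probs.index(max(probs)))
        labels.append(label_row)
    return labels


def per_class_metrics(pred, truth, cls):
    """Return (PA, IoU) for one class under the assumed definitions."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == cls and t == cls:
                tp += 1
            elif p == cls:
                fp += 1
            elif t == cls:
                fn += 1
    pa = tp / (tp + fn) if (tp + fn) else float("nan")
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else float("nan")
    return pa, iou


# Toy 2x2 score maps standing in for the MFEF1 and MFEF2 outputs.
map_a = [[[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]],
         [[0.1, 0.4, 2.2], [0.3, 1.0, 0.9]]]
map_b = [[[1.8, 0.7, 0.2], [0.4, 1.1, 0.2]],
         [[0.2, 0.3, 1.9], [0.2, 0.6, 1.4]]]
truth = [[0, 1],
         [2, 1]]

labels = fuse_and_classify(map_a, map_b, weight_a=0.6, weight_b=0.4)
water_pa, water_iou = per_class_metrics(labels, truth, cls=1)
shadow_pa, shadow_iou = per_class_metrics(labels, truth, cls=2)
```

In this toy case the bottom-right pixel is water in the ground truth but is labelled shadow after fusion, which lowers PA and IoU for water and lowers IoU, but not PA, for shadow.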

Keywords: classification; convolutional long short-term memory (ConvLSTM); convolutional neural network (CNN); deep learning; dense convolutional network (DenseNet); global convolutional network (GCN); shadow extraction; synthetic aperture radar (SAR); water extraction.
