Labeled dataset for training despeckling filters for SAR imagery

Rubén Darío Vásquez-Salazar; Ahmed Alejandro Cardona-Mesa; Luis Gómez; Carlos M Travieso-González; Andrés F Garavito-González; Esteban Vásquez-Cano

doi:10.1016/j.dib.2024.110065

Data Brief. 2024 Jan 15;53:110065. doi: 10.1016/j.dib.2024.110065. eCollection 2024 Apr.

Authors

Rubén Darío Vásquez-Salazar¹, Ahmed Alejandro Cardona-Mesa², Luis Gómez³, Carlos M Travieso-González⁴, Andrés F Garavito-González¹, Esteban Vásquez-Cano¹

Affiliations

¹ Faculty of Engineering, Politécnico Colombiano Jaime Isaza Cadavid, Medellín, 48th Av, 7-151, Colombia.
² Faculty of Engineering, Institución Universitaria Digital de Antioquia, Medellín, 55th Av, 42-90, Colombia.
³ Electronic Engineering and Automatic Department, IUCES, Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain.
⁴ Signals and Communications Department, IDeTIC, Universidad de Las Palmas de Gran Canaria, Spain.

Abstract

Training Artificial Intelligence and Deep Learning models, especially with Supervised Learning techniques, requires a labeled dataset in which each input is paired with its corresponding labeled output. For images, whether for classification, segmentation, or other processing tasks, a pair of images is required in the same sense: one image as the input (the noisy image) and the desired one (the denoised image) as the output. For SAR despeckling applications, the common approach is to take a set of optical images and corrupt them with synthetic noise, since no ground truth is available. The corrupted image is treated as the input and the original optical image as the noiseless reference (ground truth). In this paper, we provide a dataset based on actual SAR images. The ground truth was obtained from Sentinel-1 SAR images of the same region acquired at different times, which were processed and merged into a single image that serves as the output of the dataset. Each SAR image (noisy and ground truth) was split into 1600 images of 512 × 512 pixels, for a total of 3200 images. The dataset was further split into 3000 images for training and 200 for validation, all available in four labeled folders.

Keywords: Image denoising; Labeled dataset; Speckle; Supervised learning; Synthetic Aperture Radar (SAR).
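
The tiling and split arithmetic described in the abstract can be checked with a minimal sketch. Several things here are assumptions rather than details from the paper: the tiles are taken as non-overlapping, the scene is taken as a square 40 × 40 grid of tiles (any grid with 1600 cells would give the same counts), and the 3000/200 split is read as 1500 training pairs and 100 validation pairs. The names `tile_boxes` and `split_pairs` are hypothetical helpers, not part of the published dataset.

```python
import random

TILE = 512  # tile side in pixels, as stated in the abstract


def tile_boxes(height, width, tile=TILE):
    """Return (top, left, bottom, right) boxes for non-overlapping tiles."""
    return [
        (top, left, top + tile, left + tile)
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]


def split_pairs(n_pairs, n_val, seed=0):
    """Shuffle tile indices once so noisy and ground-truth tiles stay paired."""
    indices = list(range(n_pairs))
    random.Random(seed).shuffle(indices)
    # The first n_val shuffled indices go to validation, the rest to training.
    return sorted(indices[n_val:]), sorted(indices[:n_val])


# Assumed scene size: a 40 x 40 grid of tiles (20480 x 20480 pixels).
boxes = tile_boxes(40 * TILE, 40 * TILE)

# Assumed reading of the split: 100 of the tile pairs are held out.
train_idx, val_idx = split_pairs(len(boxes), n_val=100)

print(len(boxes))                             # tiles per image
print(2 * len(train_idx), 2 * len(val_idx))   # images, counting noisy and ground truth
```

Under these assumptions the counts are 1600 tiles per image, with 3000 training and 200 validation images across the noisy and ground-truth versions, which matches the figures in the abstract. Applying the same index lists to both versions of the scene keeps each noisy tile aligned with its ground-truth counterpart.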