Sen-2 LULC: Land use land cover dataset for deep learning approaches

Data Brief. 2023 Oct 24:51:109724. doi: 10.1016/j.dib.2023.109724. eCollection 2023 Dec.

Abstract

Land Use Land Cover (LULC) classification is pivotal to sustainable environmental and natural resource management. It is critical in planning, monitoring, and management programs at local and national levels. Monitoring changes in LULC patterns over time is crucial for understanding evolving landscapes. Traditionally, LULC classification has relied on satellite remote sensing data analyzed with geographic information system (GIS) techniques, machine learning classifiers, and deep learning models. Semantic segmentation, a technique for assigning land cover classes to individual pixels in an image, is commonly employed for LULC mapping. In recent years, the deep learning revolution, particularly Convolutional Neural Networks (CNNs), has reshaped the field of computer vision and LULC classification. Deep architectures have consistently outperformed traditional methods, offering greater accuracy and efficiency. However, the availability of high-quality datasets has been a limiting factor. Bridging the gap between modern computer vision and remote sensing data analysis can revolutionize our understanding of the environment and drive breakthroughs in urban planning and ecosystem change research. The "Sen-2 LULC Dataset" has been created to facilitate this convergence. This dataset comprises 213,761 pre-processed 10 m resolution images representing seven LULC classes: water bodies, dense forests, sparse forests, barren land, built-up areas, agricultural land, and fallow land. Importantly, each image may contain multiple coexisting land use and land cover classes, mirroring the real-world complexity of landscapes. The dataset is derived from Sentinel-2 satellite imagery sourced from the Copernicus Open Access Hub (https://scihub.copernicus.eu/) platform. It includes spectral bands B4, B3, and B2, corresponding to the red, green, and blue (RGB) channels, and offers a spatial resolution of 10 m.
The dataset also provides an equal number of mask images. It is organized into six folders that hold the training, testing, and validation sets for images and masks. Researchers across various domains can leverage this resource to advance LULC classification in the context of the Indian region. Additionally, it can foster collaboration between the remote sensing and computer vision communities, enabling novel insights into environmental dynamics and urban planning challenges.
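As an illustration of how the six-folder layout might be consumed, the following minimal Python sketch pairs each image in one split with its mask. The abstract does not give folder or file names, so the names used here ("train_images", "train_masks") and the assumption that an image and its mask share a file stem are hypothetical and should be adapted to the actual dataset.

```python
from pathlib import Path


def pair_images_with_masks(root, split, image_suffix="_images", mask_suffix="_masks"):
    """Pair each image in one split with the mask that shares its file stem.

    The folder names (e.g. "train_images", "train_masks") are hypothetical;
    adjust the suffixes to match the actual dataset layout.
    """
    image_dir = Path(root) / f"{split}{image_suffix}"
    mask_dir = Path(root) / f"{split}{mask_suffix}"

    # Index masks by stem so image and mask file extensions may differ.
    masks_by_stem = {p.stem: p for p in mask_dir.iterdir() if p.is_file()}

    pairs, unmatched = [], []
    for image_path in sorted(p for p in image_dir.iterdir() if p.is_file()):
        mask_path = masks_by_stem.get(image_path.stem)
        if mask_path is None:
            unmatched.append(image_path.name)
        else:
            pairs.append((image_path, mask_path))

    # Fail loudly rather than silently training on images without labels.
    if unmatched:
        raise FileNotFoundError(
            f"{len(unmatched)} image(s) in split '{split}' have no mask, "
            f"e.g. {unmatched[0]}"
        )
    return pairs
```

Under these assumptions, a call such as pair_images_with_masks("Sen-2_LULC", "train") would return a list of (image, mask) path pairs that can be handed to any segmentation data loader; the root folder name in this call is likewise a placeholder.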

Keywords: Convolutional neural network; Deep learning; Image classification; Land Use Land Cover (LULC); Remote sensing; Satellite imagery; Sentinel-2.