CatLC: Catalonia Multiresolution Land Cover Dataset

Sci Data. 2022 Sep 8;9(1):554. doi: 10.1038/s41597-022-01674-y.

Abstract

The availability of large annotated image datasets represented one of the tipping points in the progress of object recognition in the realm of natural images, but other important visual spaces are still lacking this asset. In the case of remote sensing, only a few richly annotated datasets covering small areas are available. In this paper, we present the Catalonia Multiresolution Land Cover Dataset (CatLC), a remote sensing dataset corresponding to a mid-size geographical area which has been carefully annotated with a large variety of land cover classes. The dataset includes pre-processed images from the Cartographic and Geological Institute of Catalonia (ICGC) ( https://www.icgc.cat/en/Downloads ) and the European Space Agency (ESA) ( https://scihub.copernicus.eu ) catalogs, captured from both aircraft and satellites. Detailed topographic layers inferred from other sensors are also included. CatLC is a multiresolution, multimodal, multitemporal dataset, that can be readily used by the machine learning community to explore new classification techniques for land cover mapping in different scenarios such as area estimation in forest inventories, hydrologic studies involving microclimatic variables or geologic hazards identification and assessment. Moreover, remote sensing data present some specific characteristics that are not shared by natural images and that have been seldom explored. In this vein, CatLC dataset aims to engage with computer vision experts interested in remote sensing and also stimulate new research and development in the field of machine learning.