Cov-caldas: A new COVID-19 chest X-Ray dataset from state of Caldas-Colombia

Sci Data. 2022 Dec 7;9(1):757. doi: 10.1038/s41597-022-01576-z.

Abstract

The emergence of COVID-19 as a global pandemic forced researchers worldwide, across various disciplines, to investigate and propose efficient strategies and technologies to prevent its further spread. One of the main challenges is the fast and efficient detection of COVID-19 using deep learning approaches and medical images such as chest computed tomography (CT) and chest X-ray images. To contribute to this challenge, a new dataset was collected in collaboration with "S.E.S Hospital Universitario de Caldas" (https://hospitaldecaldas.com/) in Colombia and organized following the Medical Imaging Data Structure (MIDS) format. The dataset contains 7,307 chest X-ray images: 3,077 COVID-19 positive and 4,230 COVID-19 negative. Images were subjected to a selection and anonymization process to allow the scientific community to use them freely. Finally, different convolutional neural networks were used to perform technical validation. This dataset contributes to the scientific community by tackling significant limitations regarding data quality and availability for the detection of COVID-19.
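
The class split reported in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published work: the label names and the `class_summary` helper are hypothetical, and the inverse-frequency class weights are an assumed, commonly used heuristic for training on imbalanced classes, not a method the authors describe.

```python
# Image counts as stated in the abstract; the label names are hypothetical.
COUNTS = {"covid_positive": 3077, "covid_negative": 4230}


def class_summary(counts):
    """Return the total, per-class shares, and inverse-frequency weights.

    The weights use total / (n_classes * n_class), an assumed heuristic
    for imbalanced classes; the abstract does not state that it was used.
    """
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    weights = {label: total / (len(counts) * n) for label, n in counts.items()}
    return total, shares, weights


total, shares, weights = class_summary(COUNTS)
print(f"total images: {total}")
for label in COUNTS:
    print(f"{label}: share={shares[label]:.3f}, weight={weights[label]:.3f}")
```

The two counts sum to the stated 7,307 images, and the negative class is the larger of the two, so a classifier trained on the raw data sees more negative than positive examples.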

MeSH terms

  • COVID-19* / diagnostic imaging
  • Colombia
  • Humans
  • X-Rays