DALib: A Curated Repository of Libraries for Data Augmentation in Computer Vision

J Imaging. 2023 Oct 20;9(10):232. doi: 10.3390/jimaging9100232.

Abstract

Data augmentation is a fundamental technique in machine learning that plays a crucial role in expanding the size of training datasets. By applying various transformations or modifications to existing data, data augmentation enhances the generalization and robustness of machine learning models. In recent years, the development of several libraries has simplified the utilization of diverse data augmentation strategies across different tasks. This paper focuses on the exploration of the most widely adopted libraries specifically designed for data augmentation in computer vision tasks. Here, we aim to provide a comprehensive survey of publicly available data augmentation libraries, facilitating practitioners to navigate these resources effectively. Through a curated taxonomy, we present an organized classification of the different approaches employed by these libraries, along with accompanying application examples. By examining the techniques of each library, practitioners can make informed decisions in selecting the most suitable augmentation techniques for their computer vision projects. To ensure the accessibility of this valuable information, a dedicated public website named DALib has been created. This website serves as a centralized repository where the taxonomy, methods, and examples associated with the surveyed data augmentation libraries can be explored. By offering this comprehensive resource, we aim to empower practitioners and contribute to the advancement of computer vision research and applications through effective utilization of data augmentation techniques.

Keywords: computer vision; data augmentation; deep learning; libraries.

Publication types

  • Review

Grants and funding

This work was partially supported by the National Recovery and Resilience Plan (NRRP), Mission 4 Component 2 Investment 1.3—Call for tender No. 341 of 15 March 2022 of the Italian Ministry of University and Research funded by the European Union—NextGenerationEU; Project code PE00000003, Concession Decree No. 1550 of 11 October 2022 adopted by the Italian Ministry of University and Research, CUP D93C22000890001, Project title “ON Foods—Research and innovation network on food and nutrition Sustainability, Safety and Security—Working ON Foods”. This work was partially supported by the MUR under the grant “Dipartimenti di Eccellenza 2023–2027” of the Department of Informatics, Systems and Communication of the University of Milano-Bicocca, Italy.