Sharing Biomedical Data: Strengthening AI Development in Healthcare

Tania Pereira; Joana Morgado; Francisco Silva; Michele M Pelter; Vasco Rosa Dias; Rita Barros; Cláudia Freitas; Eduardo Negrão; Beatriz Flor de Lima; Miguel Correia da Silva; António J Madureira; Isabel Ramos; Venceslau Hespanhol; José Luis Costa; António Cunha; Hélder P Oliveira

doi:10.3390/healthcare9070827

Healthcare (Basel). 2021 Jun 30;9(7):827. doi: 10.3390/healthcare9070827.

Authors

Tania Pereira¹, Joana Morgado¹,², Francisco Silva¹, Michele M Pelter³, Vasco Rosa Dias¹, Rita Barros¹, Cláudia Freitas⁴,⁵, Eduardo Negrão⁴, Beatriz Flor de Lima⁴, Miguel Correia da Silva⁴, António J Madureira⁴,⁵, Isabel Ramos⁴,⁵, Venceslau Hespanhol⁴,⁵, José Luis Costa⁵,⁶,⁷, António Cunha¹,⁸, Hélder P Oliveira¹,²

Affiliations

¹ INESC TEC-Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal.
² FCUP-Faculty of Science, University of Porto, 4169-007 Porto, Portugal.
³ Department of Physiological Nursing, School of Nursing, University of California, San Francisco, CA 94143, USA.
⁴ CHUSJ-Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal.
⁵ FMUP-Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal.
⁶ i3S-Institute for Research and Innovation in Health of the University of Porto, 4200-135 Porto, Portugal.
⁷ IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, 4200-135 Porto, Portugal.
⁸ UTAD-University of Trás-os-Montes and Alto Douro, 5001-801 Vila Real, Portugal.

Abstract

Artificial intelligence (AI)-based solutions have revolutionized our world, using extensive datasets and computational resources to create automatic tools for complex tasks that, until now, have been performed by humans. Massive data is a fundamental aspect of the most powerful AI-based algorithms. However, for AI-based healthcare solutions, there are several socioeconomic, technical/infrastructural, and, most importantly, legal restrictions that limit the large-scale collection of, and access to, biomedical data, especially medical imaging. To overcome this important limitation, several alternative solutions have been suggested, including transfer learning approaches, generation of artificial data, adoption of blockchain technology, and creation of an infrastructure composed of anonymous and abstract data. However, none of these strategies is currently able to completely solve this challenge. The need to build large datasets that can be used to develop healthcare solutions deserves special attention from the scientific community, clinicians, all healthcare players, engineers, ethicists, legislators, and society in general. This paper offers an overview of data limitations in medical predictive models, their impact on the development of healthcare solutions, and the benefits and barriers of sharing data; finally, it suggests future directions to overcome data limitations in the medical field and enable AI to enhance healthcare. This perspective focuses on the technical requirements of learning models: it explains the limitations that arise from poor and small datasets in the medical domain, and the technical options that attempt to, or could, solve the problem of the lack of massive healthcare data.

Keywords: AI-based healthcare solutions; biomedical data; massive databases; medical imaging; shared data.