Challenges in mapping European rare disease databases, relevant for ML-based screening technologies in terms of organizational, FAIR and legal principles: scoping review

Front Public Health. 2023 Sep 15:11:1214766. doi: 10.3389/fpubh.2023.1214766. eCollection 2023.

Abstract

Background: Given the increased availability of data sources such as hospital information systems, electronic health records, and health-related registries, a novel approach is required to develop artificial intelligence-based decision support that can assist clinicians in their diagnostic decision-making and shorten rare disease patients' diagnostic odyssey. The aim is to identify key challenges in the process of mapping European rare disease databases, relevant to ML-based screening technologies in terms of organizational, FAIR and legal principles.

Methods: A scoping review was conducted based on the PRISMA-ScR checklist. The primary article search was conducted in three electronic databases (MEDLINE/PubMed, Scopus, and Web of Science), and a secondary search was performed in Google Scholar and on the organizations' websites. Each step of this review was carried out independently by two researchers. A charting form for relevant study analysis was developed and used to categorize data and identify data items in three domains: organizational, FAIR and legal.

Results: At the end of the screening process, 73 studies were eligible for review based on inclusion and exclusion criteria, with more than 60% (n = 46) of the research published in the last 5 years and originating only from EU/EEA countries. Over the ten-year period (2013-2022), there is a clear cyclical trend in the publications, with a peak in the reporting of challenges every four years. Within this trend, the following dynamic was identified: except for 2016, organizational challenges dominated the articles published up to 2018; legal challenges were the most frequently discussed topic from 2018 to 2022. The following distribution of the data items by domain was observed: (1) organizational (n = 36): data accessibility and sharing (20.2%); long-term sustainability (18.2%); governance, planning and design (17.2%); lack of harmonization and standardization (17.2%); quality of data collection (16.2%); and privacy risks and small sample size (11.1%); (2) FAIR (n = 15): findable (17.9%); accessible sustainability (25.0%); interoperable (39.3%); and reusable (17.9%); and (3) legal (n = 33): data protection by all means (34.4%); data management and ownership (22.9%); research under GDPR and member state law (20.8%); trust and transparency (13.5%); and digitalization of health (8.3%). A specific pattern was repeated in all domains during data charting and data item identification: in addition to the outlined challenges, good practices, guidelines, and recommendations were also discussed. The proportion of publications addressing only good practices, guidelines, and recommendations for overcoming challenges when mapping RD databases in at least one domain was 47.9% (n = 35).

Conclusion: Despite the opportunities provided by innovation (automation, electronic health records, hospital-based information systems, biobanks, rare disease registries and European Reference Networks), the results of the current scoping review demonstrate a diversity of challenges that must still be addressed, with immediate action needed to ensure better governance of rare disease registries, implement FAIR principles, and enhance the EU legal framework.

Keywords: European Reference Networks (ERNs); artificial intelligence; electronic health records; issues; limitations; machine learning; rare disease registry.

Publication types

  • Systematic Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Data Management*
  • Humans
  • Privacy
  • Rare Diseases*
  • Registries