Local Data Quality Assessments on EHR-Based Real-World Data for Rare Diseases

Stud Health Technol Inform. 2023 May 18;302:292-296. doi: 10.3233/SHTI230121.

Abstract

The "Collaboration on Rare Diseases" project (CORD-MI) connects university hospitals across Germany to collect sufficient harmonized electronic health record (EHR) data to support clinical research on rare diseases (RDs). However, integrating heterogeneous data and transforming them into an interoperable standard through Extract-Transform-Load (ETL) processes is a complex task that may affect data quality (DQ). Local DQ assessments and control processes are therefore needed to ensure and improve the quality of RD data. We aimed to investigate the impact of ETL processes on the quality of transformed RD data. Seven DQ indicators covering three independent DQ dimensions were evaluated. The resulting reports show the correctness of the calculated DQ metrics and the detected DQ issues. Our study provides the first comparison of the DQ of RD data before and after ETL processes. We found that ETL processes are challenging and that they influence the quality of RD data. Our methodology proved capable of evaluating the quality of real-world data stored in different formats and structures. It can therefore be used to improve the quality of RD documentation and to support clinical research.
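
The before-and-after comparison described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not name the seven indicators, so the missing-item rate used here, the function name missing_item_rate, the field names, and the toy records are all assumptions made purely for illustration.

```python
from typing import Iterable, Mapping, Optional, Sequence

# A record maps a data item name to its value (None or "" means missing).
Record = Mapping[str, Optional[str]]


def missing_item_rate(records: Sequence[Record], mandatory: Iterable[str]) -> float:
    """Return the percentage of mandatory data items that are absent or empty.

    Hypothetical indicator, assumed here only to show a before/after comparison.
    """
    fields = list(mandatory)
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    missing = sum(
        1
        for record in records
        for field in fields
        if record.get(field) in (None, "")
    )
    return 100.0 * missing / total


# Hypothetical mandatory items; the real data set definition is not given here.
MANDATORY = ("patient_id", "icd10", "orpha")

# Toy records as they might look in a source system (before ETL).
source_records = [
    {"patient_id": "p1", "icd10": "E84.0", "orpha": "586"},
    {"patient_id": "p2", "icd10": "Q87.4", "orpha": ""},
]

# The same toy records after a transformation that dropped one code.
transformed_records = [
    {"patient_id": "p1", "icd10": "E84.0", "orpha": None},
    {"patient_id": "p2", "icd10": "Q87.4", "orpha": None},
]

before = missing_item_rate(source_records, MANDATORY)
after = missing_item_rate(transformed_records, MANDATORY)

print(f"missing item rate before ETL: {before:.2f}%")
print(f"missing item rate after ETL:  {after:.2f}%")
print(f"change attributable to ETL:   {after - before:+.2f} percentage points")
```

Computing the same indicator with the same function on both sides of the ETL step keeps the two values directly comparable, so any difference points to the transformation rather than to the measurement.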

Keywords: Data quality; ETL; HL7 FHIR; healthcare standards; rare disease.

MeSH terms

  • Data Accuracy*
  • Documentation
  • Electronic Health Records*
  • Hospitals, University
  • Humans
  • Rare Diseases