Making EHRs Trustable: A Quality Analysis of EHR-Derived Datasets for COVID-19 Research

Stud Health Technol Inform. 2022 May 25;294:164-168. doi: 10.3233/SHTI220430.

Abstract

One approach to verifying the quality of research data obtained from electronic health records (EHRs) is to audit how complete and correct the data are in comparison with data collected by manual, controlled methods. This study analyzed the data quality of an EHR-derived dataset for COVID-19 research, obtained during the pandemic at Hospital Universitario 12 de Octubre. Data were extracted from EHRs and from a manually collected research database, and then transformed into the ISARIC-WHO COVID-19 case report form (CRF) model. Subsequently, a data analysis was performed, comparing both sources through this convergence model. More concepts and records were obtained from EHRs than from the manual database, and the positive predictive value (PPV, with 95% CI) was above 85% in most sections. In future studies, a more detailed analysis of data quality will be carried out.
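
As an illustration of the kind of comparison described above, the following minimal sketch counts the filled records in each source and estimates a PPV with a 95% confidence interval. It is not the authors' implementation. It assumes that both sources have already been mapped to a common model, that the manually collected database serves as the reference, that PPV is the share of EHR values agreeing with the reference among values present in both sources, and that a Wilson score interval is an acceptable choice of CI. The function names, the key layout, and the sample values are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n == 0:
        return (0.0, 1.0)  # no information: the whole unit interval
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return (centre - half, centre + half)


def compare_sources(ehr, manual):
    """Compare two sources keyed by (patient_id, concept); None means missing.

    Assumption: the manual dataset is treated as the reference standard.
    """
    ehr_filled = {k for k, v in ehr.items() if v is not None}
    manual_filled = {k for k, v in manual.items() if v is not None}
    comparable = ehr_filled & manual_filled  # values present in both sources
    agree = sum(1 for k in comparable if ehr[k] == manual[k])
    ppv = agree / len(comparable) if comparable else float("nan")
    return {
        "ehr_records": len(ehr_filled),        # completeness proxy, EHR side
        "manual_records": len(manual_filled),  # completeness proxy, manual side
        "comparable": len(comparable),
        "agree": agree,
        "ppv": ppv,
        "ppv_ci": wilson_interval(agree, len(comparable)),
    }


# Hypothetical toy data, for illustration only (not from the study).
ehr = {
    ("p1", "fever"): "yes",
    ("p1", "cough"): "no",
    ("p2", "fever"): "yes",
    ("p2", "cough"): "yes",
}
manual = {
    ("p1", "fever"): "yes",
    ("p1", "cough"): "yes",
    ("p2", "fever"): "yes",
    ("p2", "cough"): None,
}

result = compare_sources(ehr, manual)
print(result)
```

In practice such a comparison would presumably be run separately for each section of the CRF, so that a PPV and interval are reported per section rather than for the dataset as a whole.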

Keywords: COVID-19; Completeness; Correctness; Data Quality; Electronic Health Records; ISARIC-WHO; Real World Data; Semantics; Standards.

MeSH terms

  • COVID-19*
  • Data Accuracy
  • Databases, Factual
  • Electronic Health Records
  • Humans
  • Pandemics