Comparing automated vs. manual data collection for COVID-specific medications from electronic health records

Int J Med Inform. 2022 Jan:157:104622. doi: 10.1016/j.ijmedinf.2021.104622. Epub 2021 Oct 21.

Abstract

Introduction: Data are extracted from electronic health record (EHR) systems through manual abstraction, automated extraction, or a combination of both. Each method has strengths and weaknesses, and both are needed for retrospective observational research and for responding to sudden clinical events such as the COVID-19 pandemic. Assessing the strengths, weaknesses, and potential of these methods is important for identifying optimal approaches to extracting clinical data. We set out to assess automated and manual techniques for collecting medication use data in patients with COVID-19 to inform future observational studies that extract data from the EHR.

Materials and methods: For 4,123 COVID-positive patients hospitalized and/or seen in the emergency department at an academic medical center between 03/03/2020 and 05/15/2020, we compared medication use data for 25 medications or drug classes collected through manual abstraction with the same data collected through automated extraction from the EHR. Quantitatively, we assessed concordance using Cohen's kappa to measure interrater reliability; qualitatively, we audited observed discrepancies to determine the causes of inconsistencies.
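As an illustration of the concordance measure named above, the following is a minimal sketch of Cohen's kappa for one medication, treating manual abstraction and automated extraction as two raters. The function name `cohens_kappa` and the ten-patient example data are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of patients on whom both sources agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two marginal
    # proportions, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)

    if expected == 1.0:
        # Both sources used a single identical label; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 1 = medication recorded as given, 0 = not recorded.
manual = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
automated = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(manual, automated)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the two sources agree on 8 of 10 patients (observed agreement 0.80), the chance agreement is 0.52, and kappa is therefore (0.80 - 0.52) / (1 - 0.52), or about 0.583.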

Results: For the 16 inpatient medications, 11 (69%) demonstrated moderate or better agreement; 7 of those demonstrated strong or almost perfect agreement. For the 9 outpatient medications, 3 (33%) demonstrated moderate agreement, but none achieved strong or almost perfect agreement. We audited 12% of all discrepancies (716/5,790) and, in those audited, observed three principal categories of error: human error in manual abstraction (26%), errors in the extract-transform-load (ETL) or mapping of the automated extraction (41%), and abstraction-query mismatch (33%).
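The abstract does not state which scale was used to label agreement levels. The labels "moderate", "strong", and "almost perfect" match the interpretation bands proposed by McHugh (2012), so the sketch below assumes that scale; the cut-offs and the function name `interpret_kappa` are assumptions, not details confirmed by this abstract.

```python
def interpret_kappa(kappa):
    """Map a kappa value to an agreement label.

    Assumed bands (McHugh 2012): 0-0.20 none, 0.21-0.39 minimal,
    0.40-0.59 weak, 0.60-0.79 moderate, 0.80-0.90 strong,
    above 0.90 almost perfect.
    """
    if kappa > 0.90:
        return "almost perfect"
    if kappa >= 0.80:
        return "strong"
    if kappa >= 0.60:
        return "moderate"
    if kappa >= 0.40:
        return "weak"
    if kappa > 0.20:
        return "minimal"
    return "none"


# Hypothetical kappa values for illustration only.
for value in (0.15, 0.35, 0.58, 0.65, 0.85, 0.95):
    print(f"{value:.2f} -> {interpret_kappa(value)}")
```

Under this assumed scale, "moderate or better agreement" corresponds to a kappa of at least 0.60.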

Conclusion: Our findings suggest many inpatient medications can be collected reliably through automated extraction, especially when abstraction instructions are designed with data architecture in mind. We discuss quality issues, concerns, and improvements for institutions to consider when crafting an approach. During crises, institutions must decide how to allocate limited resources. We show that automated extraction of medications is feasible and make recommendations on how to improve future iterations.

Keywords: COVID-19; Chart review; Data quality; Electronic health record; Research data repositories.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • COVID-19*
  • Data Collection
  • Electronic Health Records
  • Humans
  • Pandemics
  • Pharmaceutical Preparations*
  • Reproducibility of Results
  • Retrospective Studies
  • SARS-CoV-2

Substances

  • Pharmaceutical Preparations