Fitness of real-world data for clinical trial data collection: Results and lessons from a HARMONY Outcomes ancillary study

Clin Trials. 2022 Dec;19(6):655-664. doi: 10.1177/17407745221114298. Epub 2022 Jul 24.

Abstract

Background: Despite the extensive use of real-world data for retrospective, observational clinical research, our understanding of how real-world data might increase the efficiency of data collection in patient-level randomized clinical trials remains limited. The structure of real-world data is inherently heterogeneous, with each source electronic health record and claims database different from the next. Their fitness-for-use as data sources for multisite trials in the United States has not been established.

Methods: For a subset of participants in the HARMONY Outcomes Trial, we obtained electronic health record data from recruiting sites or Medicare claims data from the Centers for Medicare & Medicaid Services. For baseline characteristics and follow-up events, we assessed the level of agreement between these real-world data and data documented in the trial database.
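As an illustration of what assessing agreement between two data sources can involve, the following is a minimal Python sketch for a single binary baseline characteristic, treating the trial database as the reference. The abstract does not state which agreement measures were used, so the choice of sensitivity, positive predictive value, percent agreement, and Cohen's kappa is an assumption, as are the function name `agreement` and the toy data, which are hypothetical and not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class Agreement:
    sensitivity: float        # share of trial-reported positives also found in RWD
    ppv: float                # share of RWD positives confirmed by the trial record
    percent_agreement: float  # share of participants with matching flags
    kappa: float              # Cohen's kappa (chance-corrected agreement)


def agreement(trial, rwd):
    """Compare paired binary flags; `trial` is treated as the reference (assumption)."""
    if len(trial) != len(rwd) or len(trial) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(trial, rwd))
    n = len(pairs)
    tp = sum(1 for t, r in pairs if t and r)
    fn = sum(1 for t, r in pairs if t and not r)
    fp = sum(1 for t, r in pairs if not t and r)
    tn = n - tp - fn - fp

    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    ppv = tp / (tp + fp) if (tp + fp) else nan
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of each source.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else nan
    return Agreement(sensitivity, ppv, observed, kappa)


# Hypothetical flags for ten participants (1 = condition recorded at baseline).
trial_flags = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
rwd_flags = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

result = agreement(trial_flags, rwd_flags)
print(result)
```

With these toy inputs, three of the four trial-reported positives are also found in the real-world data and three of the four real-world positives are confirmed by the trial record, so sensitivity and positive predictive value are both 0.75. Continuous variables such as lab values would need a different comparison, for example differences within a tolerance.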

Results: Demographic information derived from real-world data tended to agree with trial-reported demographic information, although real-world data were less accurate in identifying medical history. The ability of real-world data to identify baseline medication usage differed by real-world data source, with claims data demonstrating substantially better performance than electronic health record data. The limited number of lab results in the collected electronic health record data matched closely with values in the trial database. There were too few follow-up events in the ancillary study population to draw meaningful conclusions about the performance of real-world data for identification of events. Based on the conduct of this ancillary study, the challenges and opportunities of using real-world data within clinical trials are discussed.

Conclusion: Based on a subset of participants from the HARMONY Outcomes Trial, our results suggest that electronic health record or claims data, as currently available, are unlikely to be a complete substitute for trial data collection of medical history or baseline lab results, but that Medicare claims were able to identify most medications. The limited size of the study population prevents us from drawing strong conclusions based on these results, and other studies are clearly needed to confirm or refute these findings.

Keywords: Electronic health record; claims data; common data model.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Aged
  • Data Collection / methods
  • Databases, Factual
  • Electronic Health Records*
  • Humans
  • Medicare*
  • Retrospective Studies
  • United States