Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data

JAMIA Open. 2023 Aug 8;6(3):ooad062. doi: 10.1093/jamiaopen/ooad062. eCollection 2023 Oct.

Abstract

Objective: Patient data repositories often assemble medication data from multiple sources, necessitating standardization prior to analysis. We implemented and evaluated a medication standardization procedure for use with a wide range of pharmacy data inputs across all drug categories, which supports research queries at multiple levels of granularity.

Methods: The GEMINI-RxNorm system automates the use of multiple RxNorm tools in tandem with other datasets to identify drug concepts from pharmacy orders. GEMINI-RxNorm was used to process 2 090 155 pharmacy orders from 245 258 hospitalizations between 2010 and 2017 at 7 hospitals in Ontario, Canada. The GEMINI-RxNorm system matches drug-identifying information from pharmacy data (including free-text fields) to RxNorm concept identifiers. A user interface allows researchers to search for drug terms and returns the relevant original pharmacy data through the matched RxNorm concepts. Users can then manually validate the predicted matches and discard false positives. We designed the system to maximize recall (sensitivity) and enable excellent precision (positive predictive value) with efficient manual validation. We compared the performance of this system to manual coding (by a physician and pharmacist) of 13 medication classes.
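The search-then-validate workflow described above can be illustrated with a minimal sketch. It is not the GEMINI-RxNorm implementation: the lookup table, the concept identifiers (placeholders, not real RxNorm identifiers), the sample orders, and every function name are hypothetical. Simple substring matching stands in for the RxNorm tools the system actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PharmacyOrder:
    row_id: int
    raw_text: str  # free-text drug field as recorded by the hospital


# Hypothetical lookup: normalized drug string -> placeholder concept ID.
# A real pipeline would obtain concept identifiers from RxNorm tools.
CONCEPT_LOOKUP = {
    "warfarin": "CONCEPT_WARFARIN",
    "coumadin": "CONCEPT_WARFARIN",  # brand name mapped to the same concept
    "apixaban": "CONCEPT_APIXABAN",
}


def match_concepts(order):
    """Return every placeholder concept whose key appears in the order text."""
    text = order.raw_text.lower()
    return {cid for key, cid in CONCEPT_LOOKUP.items() if key in text}


def search(orders, term):
    """Return the original orders linked to the concept for a search term."""
    target = CONCEPT_LOOKUP.get(term.lower())
    if target is None:
        return []
    return [o for o in orders if target in match_concepts(o)]


def validate(candidates, rejected_row_ids):
    """Drop rows that a human reviewer marked as false positives."""
    return [o for o in candidates if o.row_id not in rejected_row_ids]


orders = [
    PharmacyOrder(1, "Warfarin 5 mg tab"),
    PharmacyOrder(2, "COUMADIN 2MG PO daily"),
    PharmacyOrder(3, "apixaban 5 mg BID"),
    PharmacyOrder(4, "hold warfarin, pt on heparin"),
]

# Broad search first (favoring recall), then manual review (restoring precision).
candidates = search(orders, "warfarin")
confirmed = validate(candidates, rejected_row_ids={4})
print([o.row_id for o in candidates], [o.row_id for o in confirmed])
```

In this toy example the reviewer rejects row 4, so the search returns three candidate rows and validation keeps two. The design point is the one the abstract states: a search that errs toward returning too much leaves only a small candidate set for a person to check.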

Results: Manual coding was performed for 1 948 817 pharmacy orders, of which GEMINI-RxNorm successfully returned 1 941 389 (99.6%). Recall exceeded 0.985 in all 13 drug classes, and both the F1-score and precision remained above 0.90 in all drug classes, facilitating efficient manual review to achieve 100% precision. Compared with manual standardization, GEMINI-RxNorm substantially reduced review effort: the estimated time to review a pharmacy order row fell from 30 s to 5 s, and the number of rows requiring review fell by up to 99.99%.
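For readers less familiar with the reported metrics, the sketch below computes precision, recall, and F1-score from a set of returned rows and a set of manually coded (reference) rows, using the standard definitions. The row identifiers and the function name are hypothetical; only the two order counts in the final lines come from the abstract.

```python
def precision_recall_f1(returned, relevant):
    """Compute precision, recall, and F1 from two collections of row IDs.

    returned: rows the system returned for a drug class
    relevant: rows the manual coders assigned to that drug class
    """
    returned, relevant = set(returned), set(relevant)
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data: 5 rows returned, 5 rows truly relevant, 4 in common.
returned_rows = {1, 2, 3, 4, 5}
relevant_rows = {1, 2, 3, 4, 6}
precision, recall, f1 = precision_recall_f1(returned_rows, relevant_rows)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Share of manually coded orders returned, using the counts reported above.
share_returned = 1_941_389 / 1_948_817
print(f"{share_returned:.1%}")
```

With the toy data, four of five returned rows are correct and four of five relevant rows are found, so all three metrics equal 0.80. The last two lines reproduce the 99.6% figure from the reported counts.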

Discussion and conclusion: GEMINI-RxNorm presents a novel combination of RxNorm tools and other datasets to enable accurate, efficient, flexible, and scalable standardization of pharmacy data. By facilitating efficient manual validation, the GEMINI-RxNorm system can help researchers achieve near-perfect accuracy in medication data standardization.

Keywords: RxNorm; data; medication; pharmacy; standardization.