Probabilistic Linkage of Randomized Controlled Trial Data to Administrative Claims: A Case Study of Patients from Baricitinib Clinical Trials

Rheumatol Ther. 2021 Jun;8(2):793-802. doi: 10.1007/s40744-021-00302-2. Epub 2021 Apr 2.

Abstract

Introduction: The aim of this work is to assess the feasibility of probabilistically linking randomized controlled trial (RCT) data to claims data in a real-world setting to inform future rheumatoid arthritis (RA) research.

Methods: This retrospective cohort study used IQVIA's Patient Centric Medical Claims (Dx) Database, IQVIA's Longitudinal Prescription Claims (LRx) Database, and Lilly's baricitinib RCT data from a sample of patients who consented to the linkage of their de-identified insurance claims to their de-identified RCT data. Patients were first matched on age, gender, and three-digit ZIP code of the provider, and the resulting candidate matches were then scored with a point system based on additional clinical variables.
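The two-stage matching described in the Methods can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not list the clinical variables, their point values, or the score required for a "sufficient" match, so the names in `POINTS`, the `THRESHOLD` value, and the record fields below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical clinical variables and point weights (not given in the abstract).
POINTS = {"ra_diagnosis": 3, "methotrexate_fill": 2, "rheumatologist_visit": 1}
# Hypothetical minimum score for a "sufficient" match.
THRESHOLD = 4


def block_key(record):
    """Stage 1 key: age, gender, and three-digit provider ZIP code."""
    return (record["age"], record["gender"], record["zip3"])


def link(rct_patients, claims_patients):
    """Return, per RCT patient, the claims candidates that pass both stages."""
    # Index claims patients by blocking key so only exact block matches are scored.
    blocks = defaultdict(list)
    for claim in claims_patients:
        blocks[block_key(claim)].append(claim)

    result = {}
    for rct in rct_patients:
        candidates = []
        for claim in blocks.get(block_key(rct), []):
            # Stage 2: add points for every clinical variable on which the
            # two records carry the same (non-missing) value.
            score = sum(
                pts
                for var, pts in POINTS.items()
                if rct.get(var) is not None and rct.get(var) == claim.get(var)
            )
            if score >= THRESHOLD:
                candidates.append((claim["id"], score))
        # Highest-scoring candidates first.
        result[rct["id"]] = sorted(candidates, key=lambda pair: -pair[1])
    return result


# Toy records, invented for illustration only.
rct = [{"id": "R1", "age": 54, "gender": "F", "zip3": "462",
        "ra_diagnosis": True, "methotrexate_fill": True,
        "rheumatologist_visit": True}]
claims = [
    {"id": "C1", "age": 54, "gender": "F", "zip3": "462",
     "ra_diagnosis": True, "methotrexate_fill": True,
     "rheumatologist_visit": False},   # same block, score 5
    {"id": "C2", "age": 54, "gender": "F", "zip3": "462",
     "ra_diagnosis": True, "methotrexate_fill": False,
     "rheumatologist_visit": False},   # same block, score 3 (below threshold)
    {"id": "C3", "age": 61, "gender": "M", "zip3": "462",
     "ra_diagnosis": True, "methotrexate_fill": True,
     "rheumatologist_visit": True},    # different block, never scored
]
matches = link(rct, claims)
```

An RCT patient with exactly one surviving candidate corresponds to a 1:1 match; two or more surviving candidates correspond to the 1:2, 1:3, and higher ratios reported in the Results.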

Results: A total of 245 patients from 49 US clinical trial sites were eligible for the study, and 78 (31.8%) of these patients consented to participate. Of the 78 consented patients, 69 (88%) were successfully matched on age, gender, and three-digit ZIP code of the provider. Of these 69 patients, 44 (63.8%) had at least one sufficient match using the point scoring system. Of these 44, 23 (52.3%) patients matched at a ratio of one RCT patient to one Dx/LRx patient, 11 (25.0%) at a ratio of 1:2, 7 (15.9%) at a ratio of 1:3, and 3 (6.8%) at a ratio of 1:4 or greater. To further improve match ratios, a variable hierarchy was applied to the 18 RCT patients with 2-3 matches. Overall, 38 of the 78 (48.7%) consented RCT patients were successfully matched 1:1 to claims database patients.
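The counts reported in the Results form a simple linkage funnel, and the arithmetic can be checked with a short script. The counts are taken from the text; the variable names are illustrative, and the final figure (how many of the 18 patients the variable hierarchy resolved to 1:1) is an inference that assumes the 23 original 1:1 matches were all retained, which the abstract does not state explicitly.

```python
# Counts as reported in the Results.
eligible = 245
consented = 78
matched_on_demographics = 69      # age, gender, three-digit provider ZIP
sufficient_score_match = 44       # at least one match under the point system
ratio_counts = {"1:1": 23, "1:2": 11, "1:3": 7, "1:4+": 3}
final_one_to_one = 38


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# The match-ratio categories should partition the 44 scored patients.
ratio_total = sum(ratio_counts.values())

# Patients eligible for the variable hierarchy: those with 2-3 matches.
hierarchy_pool = ratio_counts["1:2"] + ratio_counts["1:3"]

# Inferred (not stated in the abstract): additional 1:1 matches gained
# through the hierarchy, assuming the original 23 were all retained.
resolved_by_hierarchy = final_one_to_one - ratio_counts["1:1"]

funnel = {
    "consent rate (%)": pct(consented, eligible),
    "score match among demographic matches (%)": pct(
        sufficient_score_match, matched_on_demographics
    ),
    "final 1:1 among consented (%)": pct(final_one_to_one, consented),
}
for label, value in funnel.items():
    print(f"{label}: {value}")
```

Under that assumption, the hierarchy resolved 15 of the 18 patients with 2-3 candidate matches, which accounts for the rise from 23 to 38 one-to-one matches.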

Conclusions: This probabilistic linkage methodology demonstrates the feasibility, at a moderate linkage rate, of linking patients from RCTs to real-world data, which can provide a means to assess additional information not usually collected within or following a clinical trial.

Keywords: Administrative claims data; Clinical trial data; Probabilistic linkage; Rheumatoid arthritis.