Completeness and representativeness of small area socioeconomic data linked with the UK Clinical Practice Research Datalink (CPRD)

J Epidemiol Community Health. 2022 Jul 28;76(10):880-886. doi: 10.1136/jech-2022-219200. Online ahead of print.

Abstract

Background: The Clinical Practice Research Datalink (CPRD) holds primary care electronic healthcare records for 25% of the UK population. CPRD data can be linked via practice postcode in the UK, and additionally via patient postcode in England, to area-level socioeconomic status (SES) data, including the Index of Multiple Deprivation (IMD), the Carstairs Index and the Townsend Deprivation Index, as well as to rural-urban classification (RUC). This study aims to describe the completeness and representativeness of CPRD-linked SES and RUC data.

Methods: Patients currently registered at general practices contributing data to the May 2021 snapshots of CPRD GOLD (n=445 587) and CPRD Aurum (n=13 278 825) were used to assess the completeness and representativeness of CPRD-linked SES and RUC data against the UK general population.

Results: All currently registered patients had complete SES and RUC data at practice level across the UK. Most patients in England had SES and RUC data at patient level: 78% in CPRD GOLD, 94% in CPRD Aurum and 93% in the two databases combined. Patient-level SES data in CPRD for England were comparable to England's general population (average IMD decile in CPRD 5.52±0.00 vs 5.50±0.02). CPRD UK practices were on average in more deprived areas than the UK general population (6.06±0.07 vs 5.50±0.02). A slightly higher proportion of CPRD patients and practices were from urban areas (85%) compared with the UK general population (82%).
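
As an illustration only, the two kinds of summary reported above (the share of patients with linked data, and an average IMD decile with a ± term) could be computed along the following lines. This is a minimal sketch, not the authors' code: the function names and the toy deciles are hypothetical, and it assumes the ± values are standard errors, which the abstract does not state.

```python
from math import sqrt
from statistics import mean, stdev


def completeness(values):
    """Share of records with a non-missing linked value (None = not linked)."""
    return sum(v is not None for v in values) / len(values)


def mean_and_se(values):
    """Mean and standard error of the non-missing values.

    Assumption: the +/- figures are standard errors (sample SD / sqrt(n)).
    """
    present = [v for v in values if v is not None]
    return mean(present), stdev(present) / sqrt(len(present))


# Hypothetical patient-level IMD deciles; None marks a patient with no linkage.
toy_deciles = [1, 3, None, 5, 7, 10, None, 4, 6, 8]

share = completeness(toy_deciles)
avg, se = mean_and_se(toy_deciles)
print(f"complete: {share:.0%}, mean decile: {avg:.2f} ± {se:.2f}")
```

With samples as large as those in CPRD Aurum, a standard error computed this way would round to 0.00, which would be consistent with the 5.52±0.00 figure if that assumption holds.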

Conclusion: Completeness of CPRD-linked SES and RUC data is high. The CPRD populations were broadly representative of the general populations in the UK in terms of SES and RUC.

Keywords: biostatistics; epidemiology; public health; social class.