IncidencePrevalence: An R package to calculate population-level incidence rates and prevalence using the OMOP common data model

Pharmacoepidemiol Drug Saf. 2024 Jan;33(1):e5717. doi: 10.1002/pds.5717. Epub 2023 Oct 25.

Abstract

Purpose: Real-world data (RWD) offers a valuable resource for generating population-level disease epidemiology metrics. We aimed to develop a well-tested and user-friendly R package to compute incidence rates and prevalence in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model (CDM).

Materials and methods: We created IncidencePrevalence, an R package to support the analysis of population-level incidence rates and point- and period-prevalence in OMOP-formatted data. On top of unit testing, we assessed the face validity of the package. To do so, we calculated incidence rates of COVID-19 using RWD from Spain (SIDIAP) and the United Kingdom (CPRD Aurum), and replicated two previously published studies using data from the Netherlands (IPCI) and the United Kingdom (CPRD Gold). We compared the obtained results to those previously published, and measured execution times by running a benchmark analysis across databases.
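
As a rough illustration of the quantities named above, the following minimal sketch computes an incidence rate and a point prevalence from a toy cohort. It is written in plain Python and is not the package's R interface or its implementation. The data, the function names (incidence_rate, point_prevalence), the 365.25-day year, and the treatment of the outcome as chronic (a person remains a case once the outcome is recorded) are all assumptions made for the example.

```python
from datetime import date

# Hypothetical toy cohort, one tuple per person:
# (observation start, observation end, first outcome date or None)
people = [
    (date(2020, 1, 1), date(2020, 12, 31), None),
    (date(2020, 1, 1), date(2020, 12, 31), date(2020, 7, 1)),
    (date(2020, 4, 1), date(2020, 12, 31), date(2020, 10, 1)),
    (date(2020, 1, 1), date(2020, 6, 30), None),
]


def incidence_rate(people, per=100_000):
    """Events per `per` person-years; time at risk stops at the first event."""
    events = 0
    days_at_risk = 0
    for start, end, outcome in people:
        if outcome is not None and start <= outcome <= end:
            # Person contributes time up to and including the event day.
            events += 1
            days_at_risk += (outcome - start).days + 1
        else:
            # No event in observation: the full observation period counts.
            days_at_risk += (end - start).days + 1
    person_years = days_at_risk / 365.25  # assumed year length
    return events / person_years * per


def point_prevalence(people, on):
    """Share of people under observation on `on` who already have the outcome.

    Assumes a chronic outcome: once recorded, the person stays a case.
    """
    observed = [p for p in people if p[0] <= on <= p[1]]
    cases = [p for p in observed if p[2] is not None and p[2] <= on]
    return len(cases) / len(observed)


rate = incidence_rate(people)
prevalence = point_prevalence(people, date(2020, 11, 1))
```

Period prevalence follows the same pattern, with the single date replaced by an interval: the denominator counts people observed at any point in the interval and the numerator counts those who are cases at any point in it.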

Results: IncidencePrevalence achieved high agreement with previously published data in CPRD Gold and IPCI, and showed good performance across databases. For COVID-19, incidence calculated by the package was similar to public data after the first wave of the pandemic.

Conclusion: For data mapped to the OMOP CDM, the IncidencePrevalence R package can support descriptive epidemiological research. It enables reliable estimation of incidence and prevalence from large real-world data sets. It represents a simple, but extendable, analytical framework to generate estimates in a reproducible and timely manner.

Keywords: OMOP; R package; common data model; incidence; prevalence.

MeSH terms

  • COVID-19* / epidemiology
  • Data Management*
  • Databases, Factual
  • Humans
  • Incidence
  • Prevalence

Grants and funding