Analysing cluster randomised controlled trials using GLMM, GEE1, GEE2, and QIF: results from four case studies

BMC Med Res Methodol. 2023 Dec 13;23(1):293. doi: 10.1186/s12874-023-02107-z.

Abstract

Background: Using four case studies, we aim to provide practical guidance and recommendations for the analysis of cluster randomised controlled trials.

Methods: We systematically searched the online bibliographic databases MEDLINE, EMBASE, PsycINFO (via OVID), CINAHL (via EBSCO), and SCOPUS. From this review of the literature, we identified four modelling approaches for analysing correlated individual participant level outcomes in cluster randomised controlled trials: Generalized Linear Mixed Models (GLMM) with parameters estimated by maximum likelihood or restricted maximum likelihood, and Generalized Linear Models with parameters estimated by first-order Generalized Estimating Equations (GEE1), second-order Generalized Estimating Equations (GEE2), or the Quadratic Inference Function (QIF). We applied these four approaches to four case studies of cluster randomised controlled trials, with the number of clusters ranging from 10 to 100 and the number of individual participants ranging from 748 to 9,207. Results were obtained for both continuous and binary outcomes using the R and SAS statistical packages.

Results: The intracluster correlation coefficient (ICC) estimates for the case studies were less than 0.05, consistent with the ICC values commonly reported in primary care and community-based cluster randomised controlled trials. In most cases, the four methods produced similar results. However, in a few analyses, the quadratic inference function produced results that differed from those of the generalized linear mixed model, first-order generalized estimating equations, and second-order generalized estimating equations, especially in trials with small to moderate numbers of clusters.
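
As an illustration of the ICC discussed above, the following is a minimal sketch of the one-way ANOVA estimator for a continuous outcome, together with the usual design effect formula 1 + (m - 1) × ICC. The abstract does not state which ICC estimator the authors used, so the choice of estimator, the function names `anova_icc` and `design_effect`, and the toy data are all assumptions made for illustration only.

```python
from statistics import mean


def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC for a continuous outcome.

    `clusters` is a list of lists: one inner list of outcomes per cluster.
    This estimator is an assumption for illustration; it is not necessarily
    the one used in the paper. The estimate can be negative.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or min(sizes) < 1 or n_total <= k:
        raise ValueError("need at least 2 non-empty clusters and n_total > k")

    grand_mean = sum(sum(c) for c in clusters) / n_total
    cluster_means = [mean(c) for c in clusters]

    # Between-cluster and within-cluster mean squares.
    msb = sum(n * (m - grand_mean) ** 2
              for n, m in zip(sizes, cluster_means)) / (k - 1)
    msw = sum((y - m) ** 2
              for c, m in zip(clusters, cluster_means)
              for y in c) / (n_total - k)

    # Adjusted mean cluster size, which allows for unequal cluster sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denom = msb + (n0 - 1) * msw
    if denom == 0:
        raise ValueError("ICC is undefined when all outcomes are identical")
    return (msb - msw) / denom


def design_effect(icc, mean_cluster_size):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


if __name__ == "__main__":
    # Hypothetical toy data: three clusters of three participants each.
    toy = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    print("ICC estimate:", anova_icc(toy))
    print("Design effect:", design_effect(0.05, 50))
```

Even an ICC at the upper end of the range reported here is not negligible: with a hypothetical mean cluster size of 50, an ICC of 0.05 gives a design effect of 1 + 49 × 0.05 = 3.45, which is why all four modelling approaches account for within-cluster correlation.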

Conclusion: This paper demonstrates the analysis of cluster randomised controlled trials with four modelling approaches. The results obtained were similar in most cases. However, for trials with few clusters, we recommend that the quadratic inference function be used with caution and that, where possible, a small-sample correction be applied. The generalisability of our results is limited to studies with features similar to those of our case studies, for example, studies with a similar-sized ICC. Simulation studies are needed to comprehensively evaluate the performance of the four modelling approaches.

Keywords: Cluster randomised controlled trial; Intracluster correlation coefficient; SAS; Statistical methods; Statistical models.

Publication types

  • Systematic Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Computer Simulation
  • Humans
  • Linear Models
  • Randomized Controlled Trials as Topic
  • Research Design*
  • Sample Size