Subgroup analyses in randomized controlled trials frequently categorized continuous subgroup information

S Faye Williamson; Michael J Grayling; Adrian P Mander; Nurulamin M Noor; Joshua S Savage; Christina Yap; James M S Wason

doi:10.1016/j.jclinepi.2022.06.017

J Clin Epidemiol. 2022 Oct:150:72-79. doi: 10.1016/j.jclinepi.2022.06.017. Epub 2022 Jul 2.

Authors

S Faye Williamson¹, Michael J Grayling¹, Adrian P Mander², Nurulamin M Noor³, Joshua S Savage⁴, Christina Yap⁵, James M S Wason⁶

Affiliations

¹ Biostatistics Research Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK.
² Centre for Trials Research, Cardiff University, Cardiff, UK.
³ Medical Research Council Clinical Trials Unit at University College London (MRC CTU at UCL), London, UK.
⁴ Cancer Research UK Clinical Trials Unit (CRCTU), Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK.
⁵ Clinical Trials and Statistics Unit, The Institute of Cancer Research, London, UK.
⁶ Biostatistics Research Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK. Electronic address: james.wason@newcastle.ac.uk.

PMID: 35788399
DOI: 10.1016/j.jclinepi.2022.06.017

Abstract

Background and objectives: To investigate how subgroup analyses in published randomized controlled trials (RCTs) are performed when subgroups are created from continuous variables.

Methods: We carried out a review of RCTs published in 2016-2021 that included subgroup analyses. Information was extracted on whether any of the subgroups were based on continuous variables and, if so, how they were analyzed.

Results: Of 428 reviewed papers, 258 (60.4%) reported RCTs with a subgroup analysis. Of these, 178/258 (69.0%) had at least one subgroup formed from a continuous variable, and in 14/258 (5.4%) it was unclear whether any subgroup was. The vast majority (169/178, 94.9%) dichotomized the continuous variable and treated the subgroup as categorical. The most common approach to dichotomization was a pre-specified cutpoint (129/169, 76.3%), followed by a data-driven cutpoint such as the median (26/169, 15.4%).

Conclusion: Subgroups in RCT subgroup analyses are commonly defined using continuous variables. The vast majority of these analyses dichotomize the continuous variable and, consequently, may lose substantial amounts of statistical information (equivalent to reducing the sample size by at least a third). More advanced methods that can improve efficiency, either by choosing cutpoints optimally or by using the continuous information directly, are rarely used.
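The "at least a third" figure can be illustrated with a small calculation. The sketch below is not the authors' method; it assumes a standard normally distributed subgroup variable with a linear effect, and uses the squared correlation between the variable and its dichotomized version as a rough measure of the effective sample size retained. The function name `retained_efficiency` is hypothetical.

```python
from statistics import NormalDist


def retained_efficiency(quantile: float = 0.5) -> float:
    """Squared correlation between a standard normal variable X and the
    indicator 1[X > cut], where cut is the given quantile of X.

    Assumption (not from the paper): X is standard normal and its
    effect is linear, so this squared correlation approximates the
    fraction of the sample size effectively retained after
    dichotomizing X at the cut.
    """
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(quantile)
    # For a standard normal X, Cov(X, 1[X > cut]) equals the density at cut.
    covariance = std_normal.pdf(cut)
    # Standard deviation of a Bernoulli indicator with P(X > cut) = 1 - quantile.
    sd_indicator = (quantile * (1 - quantile)) ** 0.5
    correlation = covariance / sd_indicator  # SD of X is 1
    return correlation ** 2


median_split = retained_efficiency(0.5)
print(f"retained: {median_split:.3f}, lost: {1 - median_split:.3f}")
# retained: 0.637, lost: 0.363
```

Under these assumptions a median split retains about 64% of the information, a loss of roughly 36%. Cutpoints away from the median retain less, which is in line with the loss being at least a third.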

Keywords: Categorization; Continuous variables; Dichotomization; Moderator analysis; Randomized controlled trials; Subgroup analysis.
