Multiple imputation for handling missing outcome data in randomized trials involving a mixture of independent and paired data

Thomas R Sullivan; Lisa N Yelland; Margarita Moreno-Betancur; Katherine J Lee

doi:10.1002/sim.9166

Stat Med. 2021 Nov 30;40(27):6008-6020. doi: 10.1002/sim.9166. Epub 2021 Aug 15.

Authors

Thomas R Sullivan¹ ², Lisa N Yelland¹ ², Margarita Moreno-Betancur³ ⁴, Katherine J Lee³ ⁴

Affiliations

¹ SAHMRI Women & Kids, South Australian Health & Medical Research Institute, Adelaide, South Australia, Australia.
² School of Public Health, The University of Adelaide, Adelaide, South Australia, Australia.
³ Department of Paediatrics, The University of Melbourne, Melbourne, Victoria, Australia.
⁴ Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Melbourne, Victoria, Australia.

PMID: 34396577
DOI: 10.1002/sim.9166

Abstract

Randomized trials involving independent and paired observations occur in many areas of health research, for example in paediatrics, where studies can include infants from both single and twin births. Multiple imputation (MI) is often used to address missing outcome data in randomized trials, yet its performance in trials with independent and paired observations, where design effects can be less than or greater than one, remains to be explored. Using simulated data and through application to a trial dataset, we investigated the performance of different methods of MI for a continuous or binary outcome when followed by analysis using generalized estimating equations to account for clustering due to the pairs. We found that imputing data separately for independent and paired data, with paired data imputed in wide format, was the best performing MI method, producing unbiased point and standard error estimates for the treatment effect throughout. Ignoring clustering in the imputation model performed well in settings where the design effect due to the inclusion of paired data was close to one, but otherwise led to moderately biased variance estimates. Including a random cluster effect in the imputation model led to slightly biased point estimates for binary outcome data and variance estimates that were too small in some settings. Based on these results, we recommend researchers impute independent and paired data separately where feasible to do so. The exception is if the design effect due to the inclusion of paired data is close to one, where ignoring clustering may be appropriate.
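The recommended strategy, imputing independent and paired observations separately with the pairs held in wide format, can be illustrated with a minimal sketch. This is not the authors' code: the simulated data, the function names (`draw_linear_imputations`, `impute_singles`, `impute_pairs_wide`), the normal linear imputation model, and the bootstrap step used to approximate parameter uncertainty are all assumptions made for illustration, and only a continuous outcome is covered. In wide format each pair occupies one row, so a missing outcome can be imputed conditional on the co-twin's outcome, which is how the within-pair correlation enters the imputation model.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_linear_imputations(X_obs, y_obs, X_mis, rng):
    """Impute from a normal linear model fitted to a bootstrap resample.

    The bootstrap is an assumed, simplified stand-in for drawing the model
    parameters from their posterior distribution.
    """
    n = len(y_obs)
    idx = rng.integers(0, n, size=n)
    Xb, yb = X_obs[idx], y_obs[idx]
    beta, *_ = np.linalg.lstsq(Xb, yb, rcond=None)
    resid = yb - Xb @ beta
    sigma = np.sqrt(resid @ resid / max(n - Xb.shape[1], 1))
    return X_mis @ beta + rng.normal(0.0, sigma, size=len(X_mis))

def impute_singles(y, t, rng):
    """Impute missing outcomes for independent observations from treatment."""
    y = y.copy()
    mis = np.isnan(y)
    if mis.any():
        X = np.column_stack([np.ones(len(y)), t])
        y[mis] = draw_linear_imputations(X[~mis], y[~mis], X[mis], rng)
    return y

def impute_pairs_wide(y, t, rng):
    """Impute paired data in wide format: one row per pair, columns = members."""
    y = y.copy()
    ones = np.ones(len(y))
    both = np.isnan(y).all(axis=1)
    complete = ~np.isnan(y).any(axis=1)
    # Pairs with both outcomes missing: impute member 0 from treatment only.
    if both.any():
        obs0 = ~np.isnan(y[:, 0])
        X = np.column_stack([ones, t[:, 0], t[:, 1]])
        y[both, 0] = draw_linear_imputations(X[obs0], y[obs0, 0], X[both], rng)
    # Any remaining missing value is imputed conditional on the co-twin's outcome,
    # using a model fitted to the originally complete pairs.
    for j in (0, 1):
        k = 1 - j
        mis = np.isnan(y[:, j])
        if mis.any():
            X = np.column_stack([ones, t[:, j], t[:, k], y[:, k]])
            y[mis, j] = draw_linear_imputations(X[complete], y[complete, j], X[mis], rng)
    return y

# Hypothetical trial: 200 singletons and 100 pairs, individually randomized,
# with roughly 20% of outcomes missing completely at random.
n_single, n_pair = 200, 100
t_s = rng.integers(0, 2, n_single).astype(float)
y_s = 1.0 + 0.5 * t_s + rng.normal(size=n_single)
t_p = rng.integers(0, 2, (n_pair, 2)).astype(float)
pair_effect = rng.normal(scale=0.7, size=(n_pair, 1))  # shared within a pair
y_p = 1.0 + 0.5 * t_p + pair_effect + rng.normal(scale=0.7, size=(n_pair, 2))
y_s_obs = np.where(rng.random(n_single) < 0.2, np.nan, y_s)
y_p_obs = np.where(rng.random((n_pair, 2)) < 0.2, np.nan, y_p)

# Create m completed datasets, imputing singles and pairs separately.
m = 5
completed = [
    (impute_singles(y_s_obs, t_s, rng), impute_pairs_wide(y_p_obs, t_p, rng))
    for _ in range(m)
]
```

In a full analysis, each completed dataset would be reshaped back to one row per participant, analysed with generalized estimating equations clustered on pair, and the m treatment effect estimates combined using Rubin's rules. Those steps are omitted here to keep the sketch dependent on NumPy alone.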

Keywords: clinical trials; clustered data; missing outcome data; multiple imputation.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Cluster Analysis
Computer Simulation
Data Interpretation, Statistical*
Humans
Randomized Controlled Trials as Topic*