Adaptive selection of the optimal strategy to improve precision and power in randomized trials

Laura B Balzer; Erica Cai; Lucas Godoy Garraza; Pracheta Amaranath

doi:10.1093/biomtc/ujad034

Biometrics. 2024 Jan 29;80(1):ujad034. doi: 10.1093/biomtc/ujad034.

Authors

Laura B Balzer¹, Erica Cai², Lucas Godoy Garraza³, Pracheta Amaranath²

Affiliations

¹ Division of Biostatistics, University of California Berkeley, Berkeley, CA 94720, United States.
² Manning College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, MA 01003, United States.
³ Department of Biostatistics, University of Massachusetts Amherst, Amherst, MA 01003, United States.

Abstract

Benkeser et al. demonstrate how adjustment for baseline covariates in randomized trials can meaningfully improve precision for a variety of outcome types. Their findings build on a long history, starting in 1932 with R.A. Fisher and including more recent endorsements by the U.S. Food and Drug Administration and the European Medicines Agency. Here, we address an important practical consideration: how to select the adjustment approach (which variables and in which form) to maximize precision, while maintaining Type-I error control. Balzer et al. previously proposed Adaptive Pre-specification within TMLE to flexibly and automatically select, from a prespecified set, the approach that maximizes empirical efficiency in small trials (N < 40). To avoid overfitting with few randomized units, selection was previously limited to working generalized linear models, adjusting for a single covariate. Now, we tailor Adaptive Pre-specification to trials with many randomized units. Using V-fold cross-validation and the estimated influence curve-squared as the loss function, we select from an expanded set of candidates, including modern machine learning methods adjusting for multiple covariates. As assessed in simulations exploring a variety of data-generating processes, our approach maintains Type-I error control (under the null) and offers substantial gains in precision, equivalent to 20%-43% reductions in sample size for the same statistical power. When applied to real data from ACTG Study 175, we also see meaningful efficiency improvements overall and within subgroups.
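The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-arm trial with a known randomization probability `g = 0.5`, a continuous outcome, and candidates restricted to linear working models that differ only in which covariates they adjust for. The function names (`cv_ic_squared_risk`, `select_adjustment`) and the candidate dictionary are hypothetical. The influence curve is approximated by a centered augmented inverse-probability-weighted term rather than by a full TMLE update.

```python
import numpy as np

def design(A, W, cols):
    # Design matrix: intercept, treatment indicator, chosen covariates.
    parts = [np.ones(len(A)), A] + [W[:, j] for j in cols]
    return np.column_stack(parts)

def cv_ic_squared_risk(Y, A, W, cols, V=5, g=0.5, seed=0):
    """Cross-validated mean of the squared estimated influence curve
    for the average treatment effect under one candidate adjustment set."""
    n = len(Y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % V          # balanced fold labels 0..V-1
    losses = np.empty(n)
    for v in range(V):
        train, valid = folds != v, folds == v
        # Fit the working outcome regression on the training folds only.
        beta, *_ = np.linalg.lstsq(design(A[train], W[train], cols),
                                   Y[train], rcond=None)
        Av, Wv, Yv = A[valid], W[valid], Y[valid]
        QA = design(Av, Wv, cols) @ beta                  # Q(A, W)
        Q1 = design(np.ones_like(Av), Wv, cols) @ beta    # Q(1, W)
        Q0 = design(np.zeros_like(Av), Wv, cols) @ beta   # Q(0, W)
        H = Av / g - (1 - Av) / (1 - g)                   # known-g weights
        D = H * (Yv - QA) + Q1 - Q0
        # Center within the validation fold (a simplification).
        losses[valid] = (D - D.mean()) ** 2
    return losses.mean()

def select_adjustment(Y, A, W, candidates, V=5, g=0.5, seed=0):
    # Pick the candidate with the smallest cross-validated IC-squared risk.
    risks = {name: cv_ic_squared_risk(Y, A, W, cols, V, g, seed)
             for name, cols in candidates.items()}
    return min(risks, key=risks.get), risks
```

For example, with `candidates = {"unadjusted": [], "W1": [0], "W1+W2": [0, 1]}`, calling `select_adjustment(Y, A, W, candidates)` returns the name of the lowest-risk candidate together with all estimated risks. Because the variance of the estimator scales roughly as 1/N, the ratio of a candidate's risk to the unadjusted risk gives a rough sense of the relative sample size needed for the same precision. Including the unadjusted estimator among the candidates is a design choice in this sketch so that selection can fall back on it when covariates are uninformative.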

Keywords: TMLE; covariate adjustment; efficiency; machine learning; pre-specification; randomized trials.

MeSH terms

Linear Models
Machine Learning*
Randomized Controlled Trials as Topic
Research Design*
Sample Size
United States
