Leveraging historical data to optimize the number of covariates and their explained variance in the analysis of randomized clinical trials

Stat Methods Med Res. 2022 Feb;31(2):240-252. doi: 10.1177/09622802211065246. Epub 2021 Dec 13.

Abstract

The amount of data collected from patients involved in clinical trials is continuously growing. All baseline patient characteristics are potential covariates that could be used to improve clinical trial analysis and power. However, the limited number of patients in phase I and II studies restricts the number of covariates that can be included in the analyses. In this paper, we investigate the benefit/cost ratio of including covariates in the analysis of clinical trials with a continuous outcome. Within this context, we address the long-running question "What is the optimum number of covariates to include in a clinical trial?" To further improve the benefit/cost ratio of covariates, historical data can be leveraged to pre-specify the covariate weights, which can be viewed as the definition of a new composite covariate. Here we analyze the use of a composite covariate to improve the estimated treatment effect in small clinical trials. A composite covariate limits the loss of degrees of freedom and the risk of overfitting.
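
The composite-covariate idea can be illustrated with a minimal simulation sketch. It is not the authors' code or their exact method: the sample sizes, the coefficient vector `beta`, the unit-variance noise, and the helper names `simulate` and `ols` are all assumptions chosen for illustration. Covariate weights are estimated once from simulated historical control data, then applied to a small trial so that the adjusted analysis spends one degree of freedom on covariates instead of one per covariate.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, beta, effect, rng, randomize=True):
    """Hypothetical data generator: covariates, treatment indicator, continuous outcome."""
    X = rng.normal(size=(n, len(beta)))
    # Balanced 1:1 allocation when randomizing; all-control otherwise (historical data).
    treat = rng.permutation(np.arange(n) % 2) if randomize else np.zeros(n, dtype=int)
    y = X @ beta + effect * treat + rng.normal(size=n)
    return X, treat, y

def ols(design, y):
    """Least-squares fit; returns coefficients and residual degrees of freedom
    (assumes the design matrix has full column rank)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, len(y) - design.shape[1]

# Assumed true covariate effects (illustrative values only).
beta = np.array([0.8, 0.5, 0.3, 0.2, 0.1])

# Step 1: pre-specify covariate weights from historical control data.
X_hist, _, y_hist = simulate(500, beta, 0.0, rng, randomize=False)
hist_coef, _ = ols(np.column_stack([np.ones(len(y_hist)), X_hist]), y_hist)
weights = hist_coef[1:]  # drop the intercept

# Step 2: small randomized trial with an assumed treatment effect of 1.0.
X, treat, y = simulate(20, beta, 1.0, rng)
ones = np.ones(len(y))

# Composite covariate: a single column built from the fixed historical weights.
composite = X @ weights
coef_composite, df_composite = ols(np.column_stack([ones, treat, composite]), y)

# Full adjustment: every covariate enters the trial model separately.
coef_full, df_full = ols(np.column_stack([ones, treat, X]), y)

print("composite adjustment: effect estimate", coef_composite[1], "residual df", df_composite)
print("full adjustment:      effect estimate", coef_full[1], "residual df", df_full)
```

In this sketch the composite analysis fits three parameters (intercept, treatment, composite) while the full analysis fits seven, so with 20 patients the residual degrees of freedom are 17 and 13 respectively. A single simulated trial says nothing about relative efficiency; that would require repeating the trial many times and comparing the variance of the two estimators.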

Keywords: Clinical trial; covariance analysis; placebo effect; regression; relative efficiency.

MeSH terms

  • Computer Simulation*
  • Cost-Benefit Analysis
  • Humans
  • Randomized Controlled Trials as Topic