Bayesian Dose-Finding in Two Treatment Cycles Based on the Joint Utility of Efficacy and Toxicity

J Am Stat Assoc. 2015 Jun 1;110(510):711-722. doi: 10.1080/01621459.2014.926815.

Abstract

A phase I/II clinical trial design is proposed for adaptively and dynamically optimizing each patient's dose in each of two cycles of therapy based on the joint binary efficacy and toxicity outcomes in each cycle. A dose-outcome model is assumed that includes a Bayesian hierarchical latent variable structure to induce association among the outcomes and also facilitate posterior computation. Doses are chosen in each cycle based on posteriors of a model-based objective function, similar to a reinforcement learning or Q-learning function, defined in terms of numerical utilities of the joint outcomes in each cycle. For each patient, the procedure outputs a sequence of two actions, one for each cycle, with each action being the decision to either treat the patient at a chosen dose or not to treat. The cycle 2 action depends on the individual patient's cycle 1 dose and outcomes. In addition, decisions are based on posterior inference using other patients' data, and therefore the proposed method is adaptive both within and between patients. A simulation study of the method is presented, including comparison to two-cycle extensions of the conventional 3+3 algorithm, continual reassessment method, and a Bayesian model-based design, and evaluation of robustness.

Keywords: Adaptive Design; Bayesian Design; Dynamic Treatment Regime; Latent Probit Model; Phase I-II Clinical Trial; Q-Learning.