Tools for the Precision Medicine Era: How to Develop Highly Personalized Treatment Recommendations From Cohort and Registry Data Using Q-Learning

Elizabeth F Krakow; Michael Hemmer; Tao Wang; Brent Logan; Mukta Arora; Stephen Spellman; Daniel Couriel; Amin Alousi; Joseph Pidala; Michael Last; Silvy Lachance; Erica E M Moodie

doi:10.1093/aje/kwx027

Am J Epidemiol. 2017 Jul 15;186(2):160-172. doi: 10.1093/aje/kwx027.

Abstract

Q-learning is a method of reinforcement learning that employs backwards stagewise estimation to identify sequences of actions that maximize some long-term reward. The method can be applied to sequential multiple-assignment randomized trials to develop personalized adaptive treatment strategies (ATSs), that is, longitudinal practice guidelines highly tailored to time-varying attributes of individual patients. Sometimes, the basis for choosing which ATSs to include in a sequential multiple-assignment randomized trial (or randomized controlled trial) may be inadequate. Nonrandomized data sources may inform the initial design of ATSs, which could later be prospectively validated. In this paper, we illustrate challenges involved in using nonrandomized data for this purpose with a case study from the Center for International Blood and Marrow Transplant Research registry (1995-2007) aimed at 1) determining whether the sequence of therapeutic classes used in graft-versus-host disease prophylaxis and in refractory graft-versus-host disease is associated with improved survival and 2) identifying donor and patient factors with which to guide individualized immunosuppressant selections over time. We discuss how to communicate the potential benefit derived from following an ATS at the population and subgroup levels and how to evaluate its robustness to modeling assumptions. This worked example may serve as a model for developing ATSs from registries and cohorts in oncology and other fields requiring sequential treatment decisions.
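
To make the idea of backwards stagewise estimation concrete, the following is a minimal sketch of two-stage Q-learning with linear models. It is not the authors' analysis: the data are synthetic, the treatments are binary and randomized (registry data would additionally require adjustment for confounders), and every name here (`simulate`, `q_learning`, `recommend_stage1`, `recommend_stage2`, the covariates `x1` and `x2`) is a hypothetical choice made for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def design_stage2(x1, a1, x2, a2):
    # Columns: intercept, x1, a1, a1*x1, x2, a2, a2*x2
    return np.column_stack([np.ones_like(x1), x1, a1, a1 * x1, x2, a2, a2 * x2])


def design_stage1(x1, a1):
    # Columns: intercept, x1, a1, a1*x1
    return np.column_stack([np.ones_like(x1), x1, a1, a1 * x1])


def q_learning(x1, a1, x2, a2, y):
    """Backwards stagewise estimation for two decision points."""
    # Stage 2 (last decision): regress the outcome on full history and treatment.
    beta2 = fit_ols(design_stage2(x1, a1, x2, a2), y)

    # Pseudo-outcome: predicted outcome if the best stage-2 treatment were given.
    q2_untreated = design_stage2(x1, a1, x2, np.zeros_like(x2)) @ beta2
    gain2 = beta2[5] + beta2[6] * x2  # estimated benefit of a2 = 1 over a2 = 0
    pseudo = q2_untreated + np.maximum(gain2, 0.0)

    # Stage 1: regress the pseudo-outcome on stage-1 history and treatment.
    beta1 = fit_ols(design_stage1(x1, a1), pseudo)
    return beta1, beta2


def recommend_stage1(beta1, x1):
    """Recommend a1 = 1 where the estimated stage-1 benefit is positive."""
    return (beta1[2] + beta1[3] * np.asarray(x1, dtype=float) > 0).astype(int)


def recommend_stage2(beta2, x2):
    """Recommend a2 = 1 where the estimated stage-2 benefit is positive."""
    return (beta2[5] + beta2[6] * np.asarray(x2, dtype=float) > 0).astype(int)


def simulate(n=5000, seed=0):
    """Synthetic data (assumption): a1 helps when x1 < 0.5, a2 helps when x2 < 0.5."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    a1 = rng.integers(0, 2, size=n).astype(float)
    x2 = 0.5 * x1 + rng.normal(size=n)
    a2 = rng.integers(0, 2, size=n).astype(float)
    y = x1 + a1 * (0.5 - x1) + a2 * (1.0 - 2.0 * x2) + rng.normal(size=n)
    return x1, a1, x2, a2, y


x1, a1, x2, a2, y = simulate()
beta1, beta2 = q_learning(x1, a1, x2, a2, y)
print("stage-1 recommendations for x1 = -1, 2:", recommend_stage1(beta1, [-1.0, 2.0]))
print("stage-2 recommendations for x2 = -1, 2:", recommend_stage2(beta2, [-1.0, 2.0]))
```

The stage-2 model is fitted first, and its predicted outcome under the best stage-2 choice becomes the response for the stage-1 model. That ordering is what lets the stage-1 recommendation account for optimal decisions made later in the sequence.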

Keywords: Q-learning; adaptive treatment strategies; dynamic treatment regimes; graft-versus-host disease; machine learning; personalized medicine; prediction; registry data.

© The Author(s) 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Publication types

Research Support, U.S. Gov't, P.H.S.
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, Non-U.S. Gov't

MeSH terms

Adolescent
Adult
Child
Data Interpretation, Statistical
Disease-Free Survival
Endpoint Determination
Graft vs Host Disease / etiology
Graft vs Host Disease / prevention & control*
Graft vs Host Disease / therapy
Humans
Karnofsky Performance Status
Linear Models
Precision Medicine / methods*
Randomized Controlled Trials as Topic / methods
Randomized Controlled Trials as Topic / statistics & numerical data
Registries
Transplantation, Homologous / statistics & numerical data
Young Adult
