Reward Design For An Online Reinforcement Learning Algorithm Supporting Oral Self-Care

Anna L Trella; Kelly W Zhang; Inbal Nahum-Shani; Vivek Shetty; Finale Doshi-Velez; Susan A Murphy

doi:10.1609/aaai.v37i13.26866

Proc Innov Appl Artif Intell Conf. 2023 Jun 27;37(13):15724-15730. doi: 10.1609/aaai.v37i13.26866.

Authors

Anna L Trella¹, Kelly W Zhang¹, Inbal Nahum-Shani², Vivek Shetty³, Finale Doshi-Velez¹, Susan A Murphy¹

Affiliations

¹ Department of Computer Science, Harvard University.
² Institute for Social Research, University of Michigan.
³ Schools of Dentistry & Engineering, University of California, Los Angeles.

Abstract

While dental disease is largely preventable, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore, patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of current actions on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been designed to run stably and autonomously in a constrained, real-world setting characterized by highly noisy, sparse data. We address this challenge by designing a quality reward that maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics. To the best of our knowledge, Oralytics is the first mobile health study utilizing an RL algorithm designed to prevent dental disease by optimizing the delivery of motivational messages supporting oral self-care behaviors.
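
To make the idea of a reward that trades off brushing quality against user burden concrete, the following is a minimal sketch only. It is not the reward defined in the paper: the function names, the exponentially weighted "recent dose" of prompts, the threshold, and the weight `xi` are all assumptions introduced here for illustration. The weight `xi` stands in for the kind of reward hyperparameter that the abstract says is tuned in a simulation test bed.

```python
def update_recent_dose(recent_dose: float, action: int, gamma: float = 0.9) -> float:
    """Hypothetical running average of how often prompts were sent recently.

    action is 1 if a prompt was sent at this decision point, else 0.
    gamma is an assumed discount factor, not a value from the paper.
    """
    return gamma * recent_dose + (1.0 - gamma) * action


def surrogate_reward(
    brushing_quality: float,
    action: int,
    recent_dose: float,
    xi: float = 10.0,
    dose_threshold: float = 0.5,
) -> float:
    """Hypothetical reward: observed brushing quality minus a burden penalty.

    The penalty applies only when a prompt is sent (action == 1) and the
    user has already been prompted often (recent_dose above the threshold).
    xi and dose_threshold are illustrative hyperparameters.
    """
    burden = xi * action * max(0.0, recent_dose - dose_threshold)
    return brushing_quality - burden


if __name__ == "__main__":
    dose = 0.0
    for action, quality in [(1, 110.0), (1, 95.0), (0, 120.0)]:
        reward = surrogate_reward(quality, action, dose)
        dose = update_recent_dose(dose, action)
        print(f"action={action} quality={quality} reward={reward:.2f} dose={dose:.3f}")
```

Under these assumptions, not sending a prompt never incurs a penalty, while sending one to a heavily prompted user lowers the reward, which is one simple way to let the learner account for the delayed cost of over-prompting.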
