Supervised Optimal Chemotherapy Regimen Based on Offline Reinforcement Learning

IEEE J Biomed Health Inform. 2022 Sep;26(9):4763-4772. doi: 10.1109/JBHI.2022.3183854. Epub 2022 Sep 9.

Abstract

In recent years, reinforcement learning (RL) has achieved remarkable success and has attracted researchers' attention for modeling real-life scenarios, expanding its reach beyond conventional complex games. Predicting optimal treatment regimens from real observational clinical data is becoming increasingly popular, and more advanced RL algorithms are appearing in the literature. However, RL-generated medication schedules still require careful supervision by domain experts or doctors in healthcare. Hence, this paper presented a Supervised Optimal Chemotherapy Regimen (SOCR) approach that uses offline reinforcement learning to investigate optimal chemotherapy-dosing schedules for cancer patients. The optimal policy suggested by the RL approach was supervised by incorporating oncologists' previous treatment decisions, adding clinical expertise to the algorithmic results. The presented SOCR approach followed a model-based architecture using the conservative Q-learning (CQL) algorithm. The developed model was tested on a manually constructed database of forty stage-IV colon cancer patients receiving first-line chemotherapy, clinically classified as 'bevacizumab-based' or 'cetuximab-based' patients. Experimental results revealed that supervision from the oncologists had a stabilizing effect on the chemotherapy regimen, suggesting that the proposed framework could serve as a supportive model for oncologists when making treatment decisions.
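
To illustrate the conservative Q-learning update at the core of the approach, the following minimal sketch applies tabular CQL to a toy offline dataset. The state/action space sizes, the synthetic transitions, and all hyperparameters are illustrative assumptions only; the paper itself uses a model-based architecture trained on clinical treatment records, not this toy setup.

    import numpy as np

    # Hedged sketch: tabular conservative Q-learning (CQL) on a toy
    # offline dataset. Problem size and hyperparameters are assumptions
    # for illustration, not values from the paper.
    n_states, n_actions = 5, 3          # assumed toy problem size
    gamma, lr, alpha = 0.95, 0.1, 1.0   # discount, step size, CQL penalty weight

    rng = np.random.default_rng(0)
    Q = np.zeros((n_states, n_actions))

    # Synthetic logged transitions (s, a, r, s') standing in for the
    # observational treatment records used in the paper.
    dataset = [(rng.integers(n_states), rng.integers(n_actions),
                rng.normal(), rng.integers(n_states)) for _ in range(1000)]

    for epoch in range(200):
        for s, a, r, s2 in dataset:
            # Standard Q-learning TD update on the logged transition.
            td_error = r + gamma * Q[s2].max() - Q[s, a]
            Q[s, a] += lr * td_error

            # Conservative penalty: gradient step on
            # alpha * (logsumexp_a Q(s, a) - Q(s, a_data)),
            # which pushes Q-values down under the soft-max action
            # distribution and up on the logged (clinician) action.
            soft = np.exp(Q[s] - Q[s].max())
            soft /= soft.sum()
            Q[s] -= lr * alpha * soft
            Q[s, a] += lr * alpha

    greedy_policy = Q.argmax(axis=1)    # per-state recommendation
    print(greedy_policy)

The penalty term is what keeps the learned policy close to actions actually taken in the logged data, which mirrors the paper's motivation for supervising RL-generated regimens with oncologists' prior decisions.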

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Humans
  • Neoplasms* / drug therapy
  • Reinforcement, Psychology*