Effective Treatment Recommendations for Type 2 Diabetes Management Using Reinforcement Learning: Treatment Recommendation Model Development and Validation

Xingzhi Sun; Yong Mong Bee; Shao Wei Lam; Zhuo Liu; Wei Zhao; Sing Yi Chia; Hanis Abdul Kadir; Jun Tian Wu; Boon Yew Ang; Nan Liu; Zuo Lei; Zhuoyang Xu; Tingting Zhao; Gang Hu; Guotong Xie

doi:10.2196/27858

J Med Internet Res. 2021 Jul 22;23(7):e27858. doi: 10.2196/27858.

Authors

Xingzhi Sun^#¹, Yong Mong Bee^#^{2,3}, Shao Wei Lam^{4,5}, Zhuo Liu¹, Wei Zhao¹, Sing Yi Chia⁶, Hanis Abdul Kadir^{5,6}, Jun Tian Wu⁴, Boon Yew Ang⁴, Nan Liu^{4,5}, Zuo Lei¹, Zhuoyang Xu¹, Tingting Zhao¹, Gang Hu¹, Guotong Xie^{1,7,8}

Affiliations

¹ Ping An Healthcare Technology, Beijing, China.
² Department of Endocrinology, Singapore General Hospital, Singapore, Singapore.
³ SingHealth Duke-NUS Diabetes Centre, Singapore Health Services, Singapore, Singapore.
⁴ Health Services Research Centre, Singapore Health Services, Singapore, Singapore.
⁵ Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore.
⁶ Health Services Research Unit, Singapore General Hospital, Singapore, Singapore.
⁷ Ping An Healthcare and Technology Co, Ltd, Shanghai, China.
⁸ Ping An International Smart City Technology Co, Ltd, Shenzhen, China.

^# Contributed equally.

PMID: 34292166
PMCID: PMC8367185
DOI: 10.2196/27858

Abstract

Background: Type 2 diabetes mellitus (T2DM) and its related complications represent a growing economic burden for many countries and health systems. Diabetes complications can be prevented through better disease control, but there is a large gap between the recommended treatment and the treatment that patients actually receive. Treating T2DM can be challenging because several therapeutic targets must be managed together and patients vary individually, which creates a need for precise, personalized treatment.

Objective: The aim of this study was to develop treatment recommendation models for T2DM based on deep reinforcement learning. A retrospective analysis was then performed to evaluate the reliability and effectiveness of the models.

Methods: The data used in our study were collected from the Singapore Health Services Diabetes Registry, encompassing 189,520 patients with T2DM and 6,407,958 outpatient visits from 2013 to 2018. The treatment recommendation models were built on 80% of the dataset, and their effectiveness was evaluated with the remaining 20%. Three treatment recommendation models were developed for antiglycemic, antihypertensive, and lipid-lowering treatments by combining a knowledge-driven model and a data-driven model. The knowledge-driven model, based on clinical guidelines and expert experience, was first applied to select the candidate medications. The data-driven model, based on deep reinforcement learning, was then used to rank the candidates according to the expected clinical outcomes. To evaluate the models, short-term outcomes were compared between model-concordant and model-nonconcordant treatments, with confounder adjustment by stratification, propensity score weighting, and multivariate regression. For long-term outcomes, model-concordant rates were included as independent variables in a patient-level multivariate logistic regression to evaluate whether the combined antiglycemic, antihypertensive, and lipid-lowering treatments were associated with a reduction in the occurrence of long-term complications or death.
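The two-stage design described in the Methods (a knowledge-driven filter followed by a data-driven ranking) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the patient fields, the single exclusion rule, the three-drug formulary, and toy_q_value, which merely stands in for a trained deep reinforcement learning model and returns made-up scores. The abstract does not list the actual rules, features, or network, so this shows only the shape of the pipeline, not the authors' implementation.

```python
from typing import Callable, Dict, List

# Hypothetical patient state; the field names are illustrative only.
Patient = Dict[str, float]


def candidate_medications(patient: Patient, formulary: List[str]) -> List[str]:
    """Knowledge-driven step: keep only options that no rule excludes.

    The single rule below (no metformin when eGFR < 30) is a placeholder
    for the guideline- and expert-derived rules, which the abstract does
    not enumerate.
    """
    candidates = []
    for drug in formulary:
        if drug == "metformin" and patient["egfr"] < 30:
            continue  # excluded by the placeholder rule
        candidates.append(drug)
    return candidates


def rank_candidates(
    patient: Patient,
    candidates: List[str],
    q_value: Callable[[Patient, str], float],
) -> List[str]:
    """Data-driven step: order candidates by expected outcome, best first."""
    return sorted(candidates, key=lambda drug: q_value(patient, drug), reverse=True)


def toy_q_value(patient: Patient, drug: str) -> float:
    """Stand-in for a trained Q-network; the scores are invented."""
    base = {"metformin": 0.8, "sulfonylurea": 0.5, "insulin": 0.6}
    bonus = 0.3 if drug == "insulin" and patient["hba1c"] > 9.0 else 0.0
    return base[drug] + bonus


def recommend(
    patient: Patient,
    formulary: List[str],
    q_value: Callable[[Patient, str], float],
) -> List[str]:
    """Filter with the rules first, then rank what remains."""
    return rank_candidates(patient, candidate_medications(patient, formulary), q_value)


formulary = ["metformin", "sulfonylurea", "insulin"]
patient = {"egfr": 25.0, "hba1c": 9.5}
ranking = recommend(patient, formulary, toy_q_value)
print(ranking)
```

Filtering before ranking means the learned model can never surface an option that the rules exclude, which is one plausible reason to combine the two models in this order.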

Results: The test data consisted of 36,993 patients for evaluating the effectiveness of the three treatment recommendation models. In 43.3% of patient visits, the antiglycemic medications recommended by the model were concordant with the actual prescriptions of the physicians. The concordant rates for antihypertensive medications and lipid-lowering medications were 51.3% and 58.9%, respectively. The evaluation results also showed that model-concordant treatments were associated with better glycemic control (odds ratio [OR] 1.73, 95% CI 1.69-1.76), blood pressure control (OR 1.26, 95% CI 1.23-1.29), and blood lipid control (OR 1.28, 95% CI 1.22-1.35). We also found that patients with more model-concordant treatments had a lower risk of diabetes complications (including 3 macrovascular and 2 microvascular complications) and death, suggesting that the models have the potential to achieve better outcomes in the long term.
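For readers less familiar with how an odds ratio and its 95% CI are read, the following Python sketch computes an unadjusted OR with a Wald (Woolf) interval from a 2x2 table. It is not the study's analysis: the reported ORs were adjusted for confounders by stratification, propensity score weighting, and multivariate regression, and the counts below are invented purely for illustration.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    a: concordant visits with the target controlled
    b: concordant visits with the target not controlled
    c: nonconcordant visits with the target controlled
    d: nonconcordant visits with the target not controlled
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts, not taken from the study.
or_value, ci_lower, ci_upper = odds_ratio_ci(a=600, b=400, c=450, d=550)
print(f"OR {or_value:.2f}, 95% CI {ci_lower:.2f}-{ci_upper:.2f}")
```

An interval that lies entirely above 1, as in each of the three reported results, indicates that concordant treatment was associated with higher odds of achieving control.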

Conclusions: Comprehensive management that combines knowledge-driven and data-driven models has good potential to help physicians improve the clinical outcomes of patients with T2DM by achieving good control of blood glucose, blood pressure, and blood lipids and by reducing the risk of diabetes complications in the long term.

Keywords: long-term outcome; model concordance; reinforcement learning; short-term outcome; type 2 diabetes.

©Xingzhi Sun, Yong Mong Bee, Shao Wei Lam, Zhuo Liu, Wei Zhao, Sing Yi Chia, Hanis Abdul Kadir, Jun Tian Wu, Boon Yew Ang, Nan Liu, Zuo Lei, Zhuoyang Xu, Tingting Zhao, Gang Hu, Guotong Xie. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 22.07.2021.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Blood Glucose
Diabetes Mellitus, Type 2* / drug therapy
Humans
Reproducibility of Results
Retrospective Studies
Treatment Outcome

Substances

Blood Glucose