Application of Machine Learning Models to Evaluate Hypoglycemia Risk in Type 2 Diabetes

Diabetes Ther. 2020 Mar;11(3):681-699. doi: 10.1007/s13300-020-00759-4. Epub 2020 Feb 3.

Abstract

Introduction: This study aimed to identify predictors of hypoglycemia and five other clinical and economic outcomes among treated patients with type 2 diabetes (T2D), using machine learning and structured data from a large, geographically diverse administrative claims database.

Methods: A retrospective cohort study design was applied to Optum Clinformatics claims data indexed on first antidiabetic prescription date. A hypothesis-free, Bayesian machine learning analytics platform (GNS Healthcare REFS™: Reverse Engineering and Forward Simulation) was used to build ensembles of generalized linear models to predict six outcomes defined in patients' 1-year post-index claims history, including hypoglycemia, antidiabetic class persistence, glycated hemoglobin (HbA1c) target attainment, HbA1c change, T2D-related inpatient admissions, and T2D-related medical costs. A unified set of 388 variables defined in patients' 1-year pre-index claims history constituted the set of predictors for all REFS models.
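The idea of an ensemble of generalized linear models can be illustrated with a minimal sketch. REFS is a proprietary Bayesian platform and its algorithm is not described here, so the sketch swaps in a plain bootstrap ensemble of logistic regressions (binomial GLMs) and averages their predictions. The function names, the two predictors (prior hypoglycemia, insulin use), and the synthetic cohort with its coefficients are all hypothetical and are not taken from the study.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_cohort(n=300, seed=1):
    # Hypothetical data: two binary pre-index predictors and one binary outcome.
    # The generating coefficients below are made up for illustration only.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        prior_hypo = 1 if rng.random() < 0.2 else 0
        insulin = 1 if rng.random() < 0.3 else 0
        logit = -2.0 + 2.5 * prior_hypo + 1.0 * insulin
        X.append([prior_hypo, insulin])
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=200):
    # Fit one logistic regression by batch gradient descent.
    # Returns the weights with the intercept in position 0.
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_one(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_one(w, x):
    # Predicted event probability from a single model.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_ensemble(X, y, n_models=10, seed=0):
    # Assumption: bootstrap resampling stands in for the platform's
    # own (undisclosed) way of generating the model ensemble.
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx]))
    return models


def ensemble_predict(models, x):
    # Average the predicted probabilities across ensemble members.
    return sum(predict_one(w, x) for w in models) / len(models)


def ensemble_odds_ratios(models):
    # Odds ratio per predictor: exp of the mean coefficient (intercept skipped).
    p = len(models[0]) - 1
    return [
        math.exp(sum(w[j + 1] for w in models) / len(models)) for j in range(p)
    ]
```

Calling `fit_ensemble(*make_synthetic_cohort())` and then `ensemble_odds_ratios(...)` yields one odds ratio per predictor, which is the form in which the risk factors are reported in the Results.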

Results: The derivation cohort comprised 453,487 patients with a T2D diagnosis between 2014 and 2017. Patients with comorbid conditions had the highest risk of hypoglycemia, including those with prior hypoglycemia (odds ratio [OR] = 25.61) and anemia (OR = 1.29). Other identified risk factors included insulin use (OR = 2.84) and sulfonylurea use (OR = 1.80). Biguanide use (OR = 0.75), high blood glucose (> 125 mg/dL vs. < 100 mg/dL, OR = 0.47; 100-125 mg/dL vs. < 100 mg/dL, OR = 0.53), and a missing blood glucose test (OR = 0.40) were associated with reduced risk of hypoglycemia. The area under the curve (AUC) of the hypoglycemia model in held-out testing data was 0.77. Patients in the top 15% of predicted hypoglycemia risk accounted for 50% of observed hypoglycemic events, 26% of T2D-related inpatient admissions, and 24% of all T2D-related medical costs.
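The two evaluation quantities reported above, the AUC and the share of events or costs concentrated in the highest-risk patients, can be computed as in the following sketch. It is an illustration under assumptions, not the study's evaluation code: the function names are hypothetical, and the AUC uses the standard rank-based definition (the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it).

```python
def auc(scores, labels):
    # Rank-based AUC over all event / non-event pairs; ties count as half.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def share_captured(scores, outcome, top_fraction=0.15):
    # Share of the total outcome (event counts, admissions, or costs)
    # that falls on patients in the top fraction of predicted risk.
    n = len(scores)
    k = max(1, round(n * top_fraction))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    total = sum(outcome)
    if total == 0:
        raise ValueError("outcome sums to zero")
    return sum(outcome[i] for i in order[:k]) / total
```

Passing a list of 0/1 event indicators as `outcome` gives the share of observed events in the top-risk group; passing per-patient costs gives the corresponding share of costs.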

Conclusions: Machine learning models built within high-dimensional, real-world data can identify patients at risk of clinical outcomes with a high degree of accuracy, while uncovering important factors associated with outcomes that can guide clinical practice. Interventions targeted at these patients may help reduce hypoglycemia risk and thereby favorably impact associated economic outcomes relevant to key stakeholders.

Keywords: Healthcare costs; Hypoglycemia; Machine learning; Resource utilization; Type 2 diabetes; Value-based.