Simplified Machine Learning Models Can Accurately Identify High-Need High-Cost Patients With Inflammatory Bowel Disease

Clin Transl Gastroenterol. 2022 Jul 1;13(7):e00507. doi: 10.14309/ctg.0000000000000507. Epub 2022 Jun 7.

Abstract

Introduction: Hospitalization is the primary driver of inflammatory bowel disease (IBD)-related healthcare costs and morbidity. Traditional prediction models perform poorly at identifying patients at highest risk of unplanned healthcare utilization. Identifying patients who are high-need and high-cost (HNHC) could reduce unplanned healthcare utilization and healthcare costs.

Methods: We conducted a retrospective cohort study in adult patients hospitalized with IBD using the Nationwide Readmissions Database (model derivation in the 2013 Nationwide Readmissions Database and validation in the 2017 Nationwide Readmissions Database). We built 2 tree-based algorithms (a decision tree classifier and a decision tree using a gradient boosting framework [XGBoost]) and compared them with traditional logistic regression for identifying patients at risk of becoming HNHC (patients in the highest decile of total days spent in hospital in a calendar year).
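
The HNHC definition above (highest decile of total days spent in hospital in a calendar year) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `label_hnhc`, the `(patient_id, length_of_stay_days)` record layout, the toy data, and the tie-breaking rule at the decile cutoff are all assumptions made for the example.

```python
from collections import defaultdict


def label_hnhc(admissions, top_fraction=0.10):
    """Flag patients in the top fraction of total hospital days.

    admissions: iterable of (patient_id, length_of_stay_days) pairs,
    assumed to cover a single calendar year (hypothetical layout).
    Returns (total days per patient, set of flagged patient ids).
    """
    # Sum length of stay over all admissions of each patient.
    total_days = defaultdict(int)
    for patient_id, los_days in admissions:
        total_days[patient_id] += los_days

    # Rank patients by total days, longest first. Ties are broken by
    # patient id, which is an arbitrary choice made for this sketch.
    ranked = sorted(total_days, key=lambda pid: (-total_days[pid], pid))

    # Keep the top decile (at least one patient).
    n_flagged = max(1, int(len(ranked) * top_fraction))
    return dict(total_days), set(ranked[:n_flagged])


# Toy data: 10 patients, one of them with two admissions.
admissions = [
    ("p01", 3), ("p01", 12), ("p02", 2), ("p03", 4), ("p04", 1),
    ("p05", 5), ("p06", 2), ("p07", 9), ("p08", 3), ("p09", 1),
    ("p10", 4),
]
total_days, hnhc_ids = label_hnhc(admissions)
```

With 10 patients, the top decile is a single patient, so only the patient with the most cumulative hospital days is flagged. The cohort sizes reported in the Results (4,717 HNHC patients of 47,402) correspond to roughly 10.0% of patients, which is consistent with a decile cutoff.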

Results: Of 47,402 adult patients hospitalized with IBD, we identified 4,717 HNHC patients. The decision tree classifier model (length of stay, Charlson Comorbidity Index, procedure, Frailty Risk Score, and age) had a mean area under the receiver operating characteristic curve (AUC) of 0.78 ± 0.01 in the derivation data set and 0.78 ± 0.02 in the validation data set. XGBoost (length of stay, procedure, chronic pain, drug abuse, and diabetic complication) had a mean AUC of 0.79 ± 0.01 and 0.75 ± 0.02 in the derivation and validation data sets, respectively. By comparison, traditional logistic regression (peptic ulcer disease, paresthesia, admission for osteomyelitis, renal failure, and lymphoma) had an AUC of 0.55 ± 0.01 and 0.56 ± 0.01 in the derivation and validation data sets, respectively.
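
For readers less familiar with the metric, the AUC values above can be read as the probability that a randomly chosen HNHC patient receives a higher risk score than a randomly chosen non-HNHC patient, so 0.5 corresponds to chance and 1.0 to perfect ranking. The sketch below computes AUC directly from that pairwise definition. It is an illustration only: the function name `roc_auc`, the labels, and the scores are made up, and it does not reproduce the study's models or data. How the ± spread was obtained is not stated in the abstract, so it is not modeled here.

```python
def roc_auc(labels, scores):
    """AUC from its pairwise definition.

    labels: 1 for HNHC, 0 otherwise. scores: predicted risk.
    A tied positive/negative pair counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and risk scores for five patients.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 5 of 6 positive/negative pairs ranked correctly

# A model that gives every patient the same score cannot rank at all.
auc_constant = roc_auc(labels, [0.5] * 5)
```

The quadratic loop is fine for a toy example; on a cohort of this study's size, a rank-based or library implementation would be the practical choice. On this scale, the logistic regression AUC of about 0.55 reported above is only slightly better than chance.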

Discussion: In hospitalized patients with IBD, simplified tree-based machine learning algorithms using administrative claims data can accurately predict patients at risk of progressing to HNHC.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Chronic Disease
  • Hospitalization
  • Humans
  • Inflammatory Bowel Diseases* / complications
  • Inflammatory Bowel Diseases* / diagnosis
  • Inflammatory Bowel Diseases* / therapy
  • Machine Learning*
  • Retrospective Studies
  • Risk Factors