Improving risk prediction for depression via Elastic Net regression - Results from Korea National Health Insurance Services Data

AMIA Annu Symp Proc. 2017 Feb 10;2016:1860-1869. eCollection 2016.

Abstract

Depression, despite its high prevalence, remains severely under-diagnosed across the healthcare system. This calls for data-driven approaches that can help screen patients who are at high risk of depression. In this work, we develop depression risk prediction models that incorporate disease co-morbidities using logistic regression with Elastic Net regularization. Using data from the one-million-person, twelve-year longitudinal cohort of the Korean National Health Insurance Services (KNHIS), our model achieved an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) of 0.7818, compared with 0.6992 for a traditional logistic regression model without co-morbidity analysis. We also report co-morbidity-adjusted Odds Ratios (ORs), which may be more accurate independent estimates of the effect of each predictor variable. In conclusion, including co-morbidity analysis improved the performance of depression risk prediction models.
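
As a rough illustration of the technique named in the abstract, the following minimal sketch fits a logistic regression with an Elastic Net penalty by proximal gradient descent and reports odds ratios and an AUC. It is not the authors' implementation: the abstract does not state the solver, features, or hyperparameters, so the function names (`fit_elastic_net_logistic`, `soft_threshold`, `auc`), the penalty settings (`lam`, `alpha`), and the synthetic binary co-morbidity indicators are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logistic(X, y, lam=0.02, alpha=0.5, lr=0.5, n_iter=300):
    """Proximal gradient descent on
    mean log-loss + lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2).
    The intercept b is left unpenalized. All defaults are illustrative."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(p):
            # Gradient step on the smooth part (log-loss + L2), then L1 shrinkage.
            stepped = w[j] - lr * (grad_w[j] / n + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(stepped, lr * lam * alpha)
    return w, b


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (NOT the KNHIS cohort): six binary co-morbidity indicators,
# of which only the first two truly affect the outcome.
random.seed(0)
X = [[random.randint(0, 1) for _ in range(6)] for _ in range(500)]
y = [1 if random.random() < sigmoid(-1.0 + 1.5 * x[0] + 1.5 * x[1]) else 0 for x in X]

w, b = fit_elastic_net_logistic(X, y)
scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]
odds_ratios = [math.exp(wj) for wj in w]  # adjusted OR per indicator
print("weights:", [round(wj, 3) for wj in w])
print("odds ratios:", [round(r, 3) for r in odds_ratios])
print("training AUC:", round(auc(scores, y), 4))
```

Because all indicators enter the model jointly, each exponentiated coefficient is an odds ratio adjusted for the other indicators, which is the sense in which co-morbidity-adjusted ORs can be read here. The AUC in this sketch is computed on the training data for brevity; a real evaluation would use held-out data.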

Keywords: Chronic Conditions Data Warehouse (CCW) Condition Algorithms; Co-morbidity; Depression; Elastic Net; Korea National Health Insurance Services Longitudinal Cohort Data; Least Absolute Shrinkage And Selection Operator (LASSO); Logistic Regression; Risk Prediction Model.

MeSH terms

  • Adult
  • Area Under Curve
  • Comorbidity*
  • Depression / epidemiology*
  • Female
  • Humans
  • Logistic Models
  • Male
  • Middle Aged
  • National Health Programs*
  • Prevalence
  • ROC Curve
  • Republic of Korea / epidemiology
  • Risk