Machine Learning-Based Identification of Obesity from Positive and Unlabelled Electronic Health Records

Stud Health Technol Inform. 2020 Jun 16;270:864-868. doi: 10.3233/SHTI200284.

Abstract

Introduction: The prevalence of overweight and obesity has been increasing in recent decades, and with it the prevalence of diseases and health conditions such as diabetes, hypertension, and cardiovascular disease. However, hospital databases usually record neither such conditions in adults nor the anthropometric measures that would facilitate their identification.

Methods: We implemented a machine learning method based on PU (Positive and Unlabelled) Learning to identify obese patients who have no diagnosis code for obesity in their health records.
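The abstract does not state which PU Learning variant was used. As an illustration of the general idea only, the following is a minimal sketch of one classic approach (the Elkan and Noto estimator), assuming that coded obesity diagnoses are a random sample of the truly obese patients. The function names, the plain gradient-descent logistic model, and all hyperparameters are hypothetical choices made for this sketch, not the authors' implementation.

```python
import numpy as np


def _add_bias(X):
    # Append a constant column so the model learns an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_logistic(X, s, lr=0.1, epochs=500):
    """Fit logistic regression by gradient descent to predict s
    (1 = has an obesity code, 0 = unlabelled)."""
    Xb = _add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - s) / len(s)
    return w


def predict_labelled_proba(w, X):
    """g(x) = P(s = 1 | x): probability that a patient carries the code."""
    return 1.0 / (1.0 + np.exp(-_add_bias(X) @ w))


def elkan_noto_pu(X, s, holdout_frac=0.2, seed=0):
    """Train g(x) on labelled-vs-unlabelled data and estimate
    c = P(s = 1 | y = 1) on held-out labelled positives."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(s))
    n_hold = int(len(s) * holdout_frac)
    hold, train = idx[:n_hold], idx[n_hold:]
    held_out_positives = X[hold][s[hold] == 1]
    if len(held_out_positives) == 0:
        raise ValueError("hold-out split contains no labelled positives")
    w = fit_logistic(X[train], s[train])
    c = predict_labelled_proba(w, held_out_positives).mean()
    return w, c


def posterior_positive(w, c, X):
    """P(y = 1 | x) = g(x) / c, clipped to the valid range [0, 1]."""
    return np.clip(predict_labelled_proba(w, X) / c, 0.0, 1.0)
```

Here `X` would be a numeric feature matrix derived from the health records and `s` a 0/1 vector marking patients with an obesity code. Thresholding `posterior_positive` on the unlabelled patients gives the candidates predicted to be obese.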

Results: The algorithm achieved a high sensitivity (98%) and predicted that around 18% of the patients without a diagnosis were obese. This result is consistent with the report of the WHO.
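For readers unfamiliar with the two quantities reported here, the sketch below shows how they are typically computed from binary predictions. It assumes that sensitivity is measured on patients known to be obese and that the second figure is the fraction of unlabelled patients predicted positive. The function names are hypothetical, and the abstract does not describe the authors' exact evaluation procedure.

```python
def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN), computed over known positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


def predicted_positive_fraction(y_pred):
    """Share of a (for example unlabelled) group predicted as positive."""
    if len(y_pred) == 0:
        raise ValueError("empty prediction list")
    return sum(y_pred) / len(y_pred)
```

Applied to the study's setting, `sensitivity` would be evaluated on patients with a coded obesity diagnosis, and `predicted_positive_fraction` on the predictions for patients without one.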

Keywords: Identification; Machine Learning; Obesity; Overweight; PU Learning.

MeSH terms

  • Diabetes Mellitus
  • Electronic Health Records*
  • Humans
  • Machine Learning*
  • Obesity*