Why so GLUMM? Detecting depression clusters through graphing lifestyle-environs using machine-learning methods (GLUMM)

Eur Psychiatry. 2017 Jan:39:40-50. doi: 10.1016/j.eurpsy.2016.06.003. Epub 2016 Nov 1.

Abstract

Background: Key lifestyle-environ risk factors are operative for depression, but it is unclear how these risk factors cluster. Machine-learning (ML) algorithms can learn, extract, identify and map underlying patterns, which allows groupings of depressed individuals to be identified without imposed constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through "Graphing lifestyle-environs using machine-learning methods" (GLUMM).

Methods: Two ML algorithms were implemented: an unsupervised self-organised mapping (SOM) algorithm to create GLUMM clusters, and a supervised boosted regression algorithm to describe the clusters. Ninety-six "lifestyle-environ" variables were drawn from the National Health and Nutrition Examination Survey (NHANES, 2009-2010). Multivariate logistic regression was used to validate the clusters while controlling for possible sociodemographic confounders.
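
The unsupervised step described above can be illustrated with a minimal self-organising map. This is a sketch only, not the authors' implementation: the 1x2 grid, the linear decay of learning rate and neighbourhood width, the function names `train_som` and `best_matching_unit`, and the two-variable toy data are all assumptions made for illustration. The study itself used ninety-six variables, and the supervised boosted regression step is not shown here.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM and return {(row, col): weight vector}."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each node from a distinct, randomly chosen sample.
    starts = rng.sample(data, len(nodes))
    weights = {node: list(start) for node, start in zip(nodes, starts)}
    order = list(range(len(data)))
    for epoch in range(epochs):
        # Learning rate and neighbourhood width shrink linearly (an assumption).
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.1)
        rng.shuffle(order)
        for i in order:
            x = data[i]
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Nodes near the winner on the grid are pulled towards x as well.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Hypothetical toy data: two well-separated groups in two standardised variables.
group_a = [(0.01 * i, 0.01 * (i % 3)) for i in range(10)]
group_b = [(1.0 - 0.01 * i, 1.0 - 0.01 * (i % 3)) for i in range(10)]
data = group_a + group_b

som = train_som(data, rows=1, cols=2)
# Each individual's cluster is the map node that wins for that individual.
labels = [best_matching_unit(som, x) for x in data]
```

In this sketch each individual is assigned to the map node that best matches their variable profile, so the node positions in `labels` play the role of cluster memberships. A real analysis would use a larger map and standardised survey variables.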

Results: The SOM identified two GLUMM cluster solutions, each containing one dominant depressed cluster (GLUMM5-1 and GLUMM7-1). In each of these clusters, the same proportion of members (17%) rated as highly depressed. Alcohol consumption and demographics validated the clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; eat unhealthily; have lived ≤2 years in their home; live in an old home; perceive themselves as underweight; be exposed to work fumes; have experienced sex at ≤14 years; and not perform moderate recreational activities. Both GLUMM5-1 (OR: 7.50, P<0.001) and GLUMM7-1 (OR: 7.88, P<0.001) were positively associated with depression, with significant interactions with being married or living with a partner (P=0.001).
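
As a reading aid for the odds ratios reported above, the following sketch shows how an odds ratio relates to a 2x2 table and to a logistic regression coefficient. The counts are hypothetical and were chosen only so that the crude ratio comes out at 7.5; they are not the study's data. The published ORs come from multivariate models adjusted for sociodemographic confounders, which a crude 2x2 calculation does not reproduce. The function names are illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table.

    a: in cluster, depressed        b: in cluster, not depressed
    c: not in cluster, depressed    d: not in cluster, not depressed
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) by the usual large-sample formula.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def coef_to_odds_ratio(beta):
    """A logistic regression coefficient is a log odds ratio."""
    return math.exp(beta)


# Hypothetical counts, NOT taken from the study.
crude_or, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=10, d=175)
```

An OR of 7.50 therefore corresponds to a logistic regression coefficient of ln(7.50), roughly 2.01, for cluster membership.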

Conclusion: Using ML-based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of heterogeneous data and a better understanding of the relationships between complex mental health factors.

Keywords: Boosted regression; Cluster; Depression; Lifestyle; Machine learning; Psychiatry.

MeSH terms

  • Adult
  • Algorithms*
  • Cluster Analysis
  • Computer Simulation*
  • Depression / diagnosis*
  • Depressive Disorder / diagnosis
  • Female
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Mental Health*
  • Middle Aged
  • Risk Factors