Uncovering psychiatric phenotypes using unsupervised machine learning: A data-driven symptoms approach

Eur Psychiatry. 2023 Feb 21;66(1):e27. doi: 10.1192/j.eurpsy.2023.13.

Abstract

Background: Current categorical classification systems of psychiatric diagnoses lead to heterogeneity of symptoms within disorders and common co-occurrence of disorders. We investigated the heterogeneous and overlapping nature of symptom endorsement in a population-based sample across three of the most common categories of psychiatric disorders: depressive disorders, anxiety disorders, and sleep-wake disorders using unsupervised machine learning approaches.

Methods: We assessed a total of 43 symptoms in a discovery sample of 6,602 participants of the population-based Rotterdam Study between 2009 and 2013, and in a replication sample of 3,005 participants between 2016 and 2020. Symptoms were assessed using the Center for Epidemiologic Studies Depression Scale, the Hospital Anxiety and Depression Scale, and the Pittsburgh Sleep Quality Index. Hierarchical clustering analysis was applied to test items and participants to investigate common patterns of symptom co-occurrence, and these patterns were further investigated quantitatively with clustering methods to find groups that may represent similar psychiatric phenotypes.
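As an illustration of the item-level step, the following is a minimal sketch of hierarchical clustering of questionnaire items, assuming a correlation-based distance and average linkage. The abstract does not state which distance or linkage the authors used, so both are assumptions, and the data are synthetic rather than Rotterdam Study responses. The function name `cluster_items` and the variables below are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def cluster_items(responses, n_clusters):
    """Group questionnaire items (columns) that participants endorse similarly.

    Assumption: distance = 1 - Pearson correlation, average linkage.
    """
    corr = np.corrcoef(responses, rowvar=False)  # item-by-item correlations
    dist = 1.0 - corr                            # highly correlated items are close
    np.fill_diagonal(dist, 0.0)                  # remove floating-point noise on the diagonal
    condensed = squareform(dist, checks=False)   # condensed form expected by linkage
    tree = linkage(condensed, method="average")
    # Cut the tree so that at most n_clusters groups remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic example: 12 items driven by 3 latent factors (not real study data).
rng = np.random.default_rng(0)
n_participants, items_per_factor = 500, 4
latent = rng.normal(size=(n_participants, 3))
true_factor = np.repeat(np.arange(3), items_per_factor)
noise = 0.5 * rng.normal(size=(n_participants, true_factor.size))
responses = latent[:, true_factor] + noise

item_labels = cluster_items(responses, n_clusters=3)
print(item_labels)
```

Clustering participants instead of items follows the same pattern with rows and columns swapped, for example by passing `responses.T`, although a different distance may be more appropriate for participants.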

Results: First, clustering analyses of the questionnaire items suggested a three-cluster solution representing clusters of "mixed" symptoms, "depressed affect and nervousness", and "troubled sleep and interpersonal problems". A highly similar clustering solution was independently established in the replication sample. Second, four groups of participants could be separated, and these groups scored differently on the item clusters.
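One way to quantify how similar two item clusterings are, such as a discovery and a replication solution over the same items, is the adjusted Rand index. The sketch below is an assumption for illustration only: the abstract does not say which agreement measure was used, and the label vectors are hypothetical, not the study's results.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: trivially identical
    # Item pairs placed together in both clusterings, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in one cluster in both solutions
    return (together_both - expected) / (maximum - expected)


# Hypothetical cluster labels for eight items (not the study's data).
discovery = [1, 1, 1, 2, 2, 3, 3, 3]
relabeled = [2, 2, 2, 3, 3, 1, 1, 1]    # same grouping, different label names
one_moved = [2, 2, 3, 3, 3, 1, 1, 1]    # one item assigned to another cluster

ari_same = adjusted_rand_index(discovery, relabeled)
ari_moved = adjusted_rand_index(discovery, one_moved)
print(ari_same, ari_moved)
```

The index ignores the names of the cluster labels, so a replication solution that groups items identically but numbers the clusters differently still scores as a perfect match.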

Conclusions: We identified three clusters of psychiatric symptoms that most commonly co-occur in a population-based sample. These symptoms clustered stably across samples, but the clusters cut across the topics of depression, anxiety, and poor sleep. We identified four groups of participants that share (sub)clinical symptoms and might benefit from similar prevention or treatment strategies, despite potentially diverging, or absent, diagnoses.

Keywords: Anxiety disorders; depression; machine learning; sleep–wake disorders.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anxiety
  • Anxiety Disorders / diagnosis
  • Anxiety Disorders / epidemiology
  • Depression
  • Humans
  • Sleep Initiation and Maintenance Disorders*
  • Unsupervised Machine Learning*