Unsupervised Classifications of Depression Levels Based on Machine Learning Algorithms Perform Well as Compared to Traditional Norm-Based Classifications

Front Psychiatry. 2020 Feb 14:11:45. doi: 10.3389/fpsyt.2020.00045. eCollection 2020.

Abstract

Large-scale screening for depression has been using norms developed based on a given population at a given time. Researchers have attempted to adjust the cutoff scores over time and for different populations, but such efforts are too few and far in between to be sensitive to temporal and regional variations. In this study, we proposed an unsupervised machine learning approach to constructing depression classifications to overcome the limitations of the traditional norm-based method. Data were collected from 8,063 Chinese middle and high school students. Using k-means clustering, we generated four levels of depressive symptoms to match the norm-based classifications. We then evaluated the validity of the classifications by comparing them with the norm-based method (and its variations) in terms of their robustness, model performance (accuracy, AUC, and sensitivity), and convergent construct validity (i.e., associations with known correlates). The results showed that our automatic classification system performed well as compared to the norm-based method.

Keywords: clustering; depression; norm; scale data; unsupervised classification.