A Machine Learning Analysis of Big Metabolomics Data for Classifying Depression: Model Development and Validation

Biol Psychiatry. 2023 Dec 22:S0006-3223(23)01792-4. doi: 10.1016/j.biopsych.2023.12.015. Online ahead of print.

Abstract

Background: Many metabolomics studies of depression have been performed, but these have been limited by their small scale. A comprehensive in silico analysis of global metabolite levels in large populations could provide robust insights into the pathological mechanisms underlying depression and identify candidate clinical biomarkers.

Methods: Depression-associated metabolomics was studied in 2 datasets from the UK Biobank database: participants with lifetime depression (N = 123,459) and participants with current depression (N = 94,921). The Whitehall II cohort (N = 4744) was used for external validation. CatBoost machine learning was used for modeling, and Shapley additive explanations were used to interpret the model. Model performance was validated with fivefold cross-validation: the model was trained on 3 of the 5 sets, with 1 of the remaining sets used for validation and the other for testing. Diagnostic performance was assessed using the area under the receiver operating characteristic curve.
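
The 3/1/1 fold scheme described in the Methods can be illustrated with a minimal sketch. It uses only the Python standard library and makes several assumptions that the abstract does not state: the function name `rotate_folds` is hypothetical, samples are assigned to folds round-robin (a real analysis would likely shuffle and stratify by outcome), and the validation fold is taken to be the one following the test fold in each rotation. It is not the authors' code, and no model is fitted here.

```python
def rotate_folds(n_samples, n_folds=5):
    """Yield (train, val, test) index lists, one triple per rotation.

    Assumption: in each rotation one fold is held out for testing, the
    next fold is used for validation, and the other three are for training.
    """
    # Round-robin fold assignment; shuffling/stratification is omitted
    # to keep the sketch deterministic.
    folds = [list(range(k, n_samples, n_folds)) for k in range(n_folds)]
    for r in range(n_folds):
        test_fold = r
        val_fold = (r + 1) % n_folds
        train = sorted(
            i
            for k, fold in enumerate(folds)
            if k not in (test_fold, val_fold)
            for i in fold
        )
        yield train, folds[val_fold], folds[test_fold]


# Toy example with 20 sample indices (not study data).
splits = list(rotate_folds(20))
for train, val, test in splits:
    print(len(train), len(val), len(test))
```

In each rotation the three index lists are disjoint, and every sample is used for testing exactly once across the five rotations.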

Results: In the lifetime depression and current depression datasets and sex-specific analyses, 24 significantly associated metabolic biomarkers were identified, 12 of which overlapped in the 2 datasets. The addition of metabolic features slightly improved the performance of a diagnostic model using traditional (nonmetabolomics) risk factors alone (lifetime depression: area under the curve 0.655 vs. 0.658 with metabolomics; current depression: area under the curve 0.711 vs. 0.716 with metabolomics).
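
To make the size of the reported improvement concrete, the following sketch computes the area under the curve from its rank-based definition (the probability that a randomly chosen case receives a higher score than a randomly chosen control) and then the absolute gains implied by the values quoted above. The function name `auc` and the toy labels and scores are illustrative assumptions and are not study data; only the four area-under-the-curve values are taken from the abstract.

```python
def auc(labels, scores):
    """Rank-based AUC: P(score of a case > score of a control), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented purely for illustration.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)

# AUC values reported in the abstract: (risk factors alone, with metabolomics).
reported = {
    "lifetime depression": (0.655, 0.658),
    "current depression": (0.711, 0.716),
}
gains = {
    name: round(with_metab - base, 3)
    for name, (base, with_metab) in reported.items()
}
print(toy_auc, gains)
```

The absolute gains are 0.003 for lifetime depression and 0.005 for current depression, which is consistent with the abstract's description of a slight improvement.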

Conclusions: The machine learning model identified 24 metabolic biomarkers associated with depression. If validated, metabolic biomarkers may have future clinical applications as supplementary information to guide early and population-based depression detection.

Keywords: Biomarkers; Depression; Machine learning; Metabolomics; UK Biobank; Whitehall II cohort.