Discrimination between healthy participants and people with panic disorder based on polygenic scores for psychiatric disorders and for intermediate phenotypes using machine learning

Kazutaka Ohi; Yuta Tanaka; Takeshi Otowa; Mihoko Shimada; Hisanobu Kaiya; Fumichika Nishimura; Tsukasa Sasaki; Hisashi Tanii; Toshiki Shioiri; Takeshi Hara

doi:10.1177/00048674241242936


Aust N Z J Psychiatry. 2024 Apr 6:48674241242936. doi: 10.1177/00048674241242936. Online ahead of print.

Authors

Kazutaka Ohi¹ ², Yuta Tanaka³, Takeshi Otowa⁴, Mihoko Shimada⁵, Hisanobu Kaiya⁶, Fumichika Nishimura⁷, Tsukasa Sasaki⁸, Hisashi Tanii⁹ ¹⁰, Toshiki Shioiri¹, Takeshi Hara³

Affiliations

¹ Department of Psychiatry, Gifu University Graduate School of Medicine, Gifu, Japan.
² Department of General Internal Medicine, Kanazawa Medical University, Ishikawa, Japan.
³ Department of Intelligence Science and Engineering, Gifu University Graduate School of Natural Science and Technology, Gifu, Japan.
⁴ Department of Psychiatry, East Medical Center, Nagoya City University, Nagoya, Japan.
⁵ Genome Medical Science Project (Toyama), National Center for Global Health and Medicine (NCGM), Tokyo, Japan.
⁶ Panic Disorder Research Center, Warakukai Medical Corporation, Tokyo, Japan.
⁷ Center for Research on Counseling and Support Services, The University of Tokyo, Tokyo, Japan.
⁸ Department of Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan.
⁹ Center for Physical and Mental Health, Mie University, Mie, Japan.
¹⁰ Graduate School of Medicine, Department of Health Promotion and Disease Prevention, Mie University, Mie, Japan.

PMID: 38581251

Abstract

Objective: Panic disorder is a modestly heritable condition. Currently, diagnosis is based only on clinical symptoms; identifying objective biomarkers and a more reliable diagnostic procedure is desirable. We investigated whether people with panic disorder can be reliably diagnosed utilizing combinations of multiple polygenic scores for psychiatric disorders and their intermediate phenotypes, compared with single polygenic score approaches, by applying specific machine learning techniques.

Methods: Polygenic scores for 48 psychiatric disorders and intermediate phenotypes, based on large-scale genome-wide association studies (n = 7,556 to 1,131,881), were calculated for people with panic disorder (n = 718) and healthy controls (n = 1,717). People with panic disorder were discriminated from healthy controls on the basis of the 48 polygenic scores using five classification methods: logistic regression, neural networks, quadratic discriminant analysis, random forests and a support vector machine. We investigated how discrimination accuracy (area under the curve) changed as the number of polygenic scores in a combination increased, and how accuracy differed across the five classifiers.
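To make the accuracy measure concrete, the following is a minimal sketch of how an area under the curve can be computed from a single discrimination score per participant. It is not the authors' pipeline: the data are synthetic, the function names `auc` and `simulate_scores` are hypothetical, and averaging ten standardized scores with equal weight is an assumption made purely for illustration (none of the five classifiers is implemented here). The area under the curve is taken as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half.

```python
import random


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the Mann-Whitney U statistic divided by the number of
    case-control pairs; ties contribute one half.
    """
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def simulate_scores(n, shift, n_scores, rng):
    """Synthetic data (assumption): each participant's discrimination
    score is the equal-weight mean of n_scores standard-normal
    'polygenic scores', shifted by `shift` in cases."""
    return [
        sum(rng.gauss(shift, 1.0) for _ in range(n_scores)) / n_scores
        for _ in range(n)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
# Group sizes follow the abstract; the effect size (0.1 SD) is invented.
cases = simulate_scores(718, 0.1, 10, rng)
controls = simulate_scores(1717, 0.0, 10, rng)

print(f"AUC on synthetic data: {auc(cases, controls):.3f}")
```

An area under the curve of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the values reported in the Results should be read.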

Results: All five classifiers distinguished people with panic disorder from healthy controls relatively well as the number of polygenic scores increased. Of the 48 polygenic scores, the polygenic score for anxiety (UK Biobank) was the most useful for discrimination by the classifiers. In combinations of two or three polygenic scores, the polygenic score for anxiety (UK Biobank) was included as one of the polygenic scores in all classifiers. When all 48 polygenic scores were used in combination, the areas under the curve differed significantly among the five classifiers: support vector machine and logistic regression had higher accuracy than quadratic discriminant analysis and random forests. For each classifier, the greatest area under the curve was 0.600 ± 0.030 for logistic regression (number of polygenic scores in the combination, N = 14), 0.591 ± 0.039 for neural networks (N = 9), 0.603 ± 0.033 for quadratic discriminant analysis (N = 10), 0.572 ± 0.039 for random forests (N = 25) and 0.617 ± 0.041 for support vector machine (N = 11). The greatest areas under the curve at the best polygenic score combination also differed significantly among the five classifiers: random forests had the lowest accuracy among the classifiers, and support vector machine had higher accuracy than neural networks.

Conclusions: These findings suggest that increasing the number of polygenic scores in a combination up to approximately 10 effectively improved the discrimination accuracy and that the support vector machine exhibited the highest accuracy among the classifiers. However, the discrimination accuracy for panic disorder, when based solely on polygenic score combinations, was modest.

Keywords: Panic disorder; classifier; intermediate phenotype; machine learning; polygenic score.