Stratified polygenic risk prediction model with application to CAGI bipolar disorder sequencing data

Maggie Haitian Wang; Billy Chang; Rui Sun; Inchi Hu; Xiaoxuan Xia; William Ka Kei Wu; Ka Chun Chong; Benny Chung-Ying Zee

doi:10.1002/humu.23229

Hum Mutat. 2017 Sep;38(9):1235-1239. doi: 10.1002/humu.23229. Epub 2017 Jun 13.

Authors

Maggie Haitian Wang¹ ², Billy Chang¹, Rui Sun¹ ², Inchi Hu³, Xiaoxuan Xia¹, William Ka Kei Wu⁴, Ka Chun Chong¹ ², Benny Chung-Ying Zee¹ ²

Affiliations

¹ Division of Biostatistics and Centre for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong SAR, China.
² CUHK Shenzhen Research Institute, Shenzhen, China.
³ ISOM Department and Biomedical Engineering Division, The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, China.
⁴ Department of Anaesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong SAR, China.

Abstract

Genetic data consist of a wide range of marker types, including common, low-frequency, and rare variants. Multiple genetic markers and their interactions play central roles in the heritability of complex disease. In this study, we propose an algorithm that uses a variable selection design stratified by genetic architecture and interaction effects, achieved by a dataset-adaptive W-test. The polygenic sets in all strata were integrated to form a classification rule. The algorithm was applied to the Critical Assessment of Genome Interpretation 4 bipolar challenge sequencing data. The prediction accuracy was 60% using genetic markers on an independent test set. We found that epistasis among common genetic variants contributed most substantially to prediction precision. However, the sample size was not large enough to draw conclusions about the lack of predictability of low-frequency variants and their epistasis.

Keywords: W-test; bipolar; classification of complex disorder; disease prediction; epistasis; interaction effect; mutation; polygenic risk stratification.
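
The stratified selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frequency cutoffs (5% and 1%), the function names, and the scoring statistic are all assumptions made for illustration. In particular, a simple carrier-versus-non-carrier log odds ratio statistic stands in for the dataset-adaptive W-test, whose exact form is not given in this record, and interaction (epistasis) terms are left out. Genotypes are assumed to be coded as minor-allele counts (0, 1, 2) and labels as 1 for cases, 0 for controls.

```python
import math
from collections import defaultdict


def minor_allele_frequency(genotypes):
    """Frequency of the minor allele, given 0/1/2 minor-allele counts."""
    return sum(genotypes) / (2 * len(genotypes))


def stratum_of(maf, common=0.05, low=0.01):
    """Assign a variant to a frequency stratum (cutoffs are assumptions)."""
    if maf >= common:
        return "common"
    if maf >= low:
        return "low_frequency"
    return "rare"


def association_score(genotypes, labels):
    """Stand-in statistic (NOT the W-test): squared log odds ratio over its
    variance, from a carrier/non-carrier 2x2 table with 0.5 added per cell."""
    a = b = c = d = 0.5  # carrier cases, carrier controls, other cases, other controls
    for g, y in zip(genotypes, labels):
        if g > 0 and y == 1:
            a += 1
        elif g > 0:
            b += 1
        elif y == 1:
            c += 1
        else:
            d += 1
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or ** 2 / variance, log_or


def select_by_stratum(variants, labels, top_k=2):
    """Rank variants within each stratum and keep the top_k of each.
    Returns {stratum: [(variant_name, log_odds_ratio_weight), ...]}."""
    by_stratum = defaultdict(list)
    for name, genotypes in variants.items():
        score, log_or = association_score(genotypes, labels)
        stratum = stratum_of(minor_allele_frequency(genotypes))
        by_stratum[stratum].append((score, name, log_or))
    return {
        stratum: [(name, w) for _, name, w in sorted(items, reverse=True)[:top_k]]
        for stratum, items in by_stratum.items()
    }


def predict(selected, sample, threshold=0.0):
    """Integrate the selected sets from all strata into one rule:
    sum the weights of carried variants and compare with a threshold."""
    total = sum(
        weight * (sample[name] > 0)
        for markers in selected.values()
        for name, weight in markers
    )
    return 1 if total > threshold else 0


# Toy data: 4 cases followed by 4 controls (illustrative values only).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
variants = {
    "v1": [1, 2, 1, 1, 0, 0, 0, 0],  # carried only by cases
    "v2": [0, 1, 0, 1, 1, 0, 1, 0],  # carried equally by cases and controls
}
selected = select_by_stratum(variants, labels, top_k=1)
```

Selecting within each stratum separately keeps strongly associated common variants from crowding out low-frequency or rare ones, which is the motivation for stratifying before the sets are combined into a single rule.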

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Bipolar Disorder / genetics*
Epistasis, Genetic
Genetic Predisposition to Disease
Humans
Models, Genetic
Polymorphism, Single Nucleotide*
Sequence Analysis, DNA / methods*
