AI-based multi-PRS models outperform classical single-PRS models

Front Genet. 2023 Jun 27;14:1217860. doi: 10.3389/fgene.2023.1217860. eCollection 2023.

Abstract

Polygenic risk scores (PRS) estimate the risk for a specific disease as a weighted sum of disease-associated alleles across different germline genetic loci, with the weights estimated by regression models. Recent advances in genetics have made it possible to create polygenic predictors of complex human traits, including risks for many important complex diseases, such as cancer, diabetes, or cardiovascular diseases, which are typically influenced by many genetic variants, each of which individually has only a negligible effect on overall risk. In the current study, we analyzed whether adding PRS from other diseases to the prediction models and replacing the regressions with machine learning models can improve overall predictive performance. Results showed that multi-PRS models significantly outperform single-PRS models across different diseases. Moreover, replacing regression models with machine learning models, specifically deep learning, can also improve overall accuracy.
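
The weighted-sum definition of a PRS, and the step from one score to several, can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotype dosages, the effect-size weights, and the names `polygenic_risk_score`, `weights_target`, and `weights_other` are hypothetical values chosen for illustration. The sketch only builds the input features; any regression or deep learning model would be fitted on top of them.

```python
import numpy as np


def polygenic_risk_score(dosages, weights):
    """Return one PRS per individual: the weighted sum of allele dosages.

    dosages: (n_individuals, n_variants) array of risk-allele counts (0, 1, 2).
    weights: (n_variants,) array of per-variant effect sizes.
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return dosages @ weights


# Hypothetical toy genotypes: 3 individuals x 4 variants.
dosages = np.array([
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
])

# Hypothetical effect sizes for the target disease and for one other disease.
weights_target = np.array([0.10, 0.05, 0.20, 0.02])
weights_other = np.array([0.03, 0.15, 0.01, 0.08])

prs_target = polygenic_risk_score(dosages, weights_target)
prs_other = polygenic_risk_score(dosages, weights_other)

# Single-PRS input: one feature column per individual.
single_prs_features = prs_target.reshape(-1, 1)

# Multi-PRS input: the target PRS plus the PRS of another disease.
multi_prs_features = np.column_stack([prs_target, prs_other])

print(single_prs_features.shape, multi_prs_features.shape)
```

In this sketch the only difference between the single-PRS and multi-PRS settings is the number of feature columns passed to the downstream classifier.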

Keywords: breast cancer; deep learning; machine learning; polygenic risk score; regression.

Grants and funding

This work was financially supported by the German Federal Ministry of Education and Research (BMBF) [031L0267A] (Deep Insight).