Inferring disease architecture and predictive ability with LDpred2-auto

Am J Hum Genet. 2023 Dec 7;110(12):2042-2055. doi: 10.1016/j.ajhg.2023.10.010. Epub 2023 Nov 8.

Abstract

LDpred2 is a widely used Bayesian method for building polygenic scores (PGSs). LDpred2-auto can infer the two parameters of the LDpred model, the SNP heritability h² and the polygenicity p, so that it does not require an additional validation dataset to choose the best-performing parameters. The main aim of this paper is to properly validate the use of LDpred2-auto for inferring multiple genetic parameters. Here, we present a new version of LDpred2-auto that adds an optional third parameter α to its model, for modeling negative selection. We then validate the inference of these three parameters (or two, when using the previous model). We also show that LDpred2-auto provides per-variant probabilities of being causal that are well calibrated and can therefore be used for fine-mapping purposes. We also introduce a formula to infer the out-of-sample predictive performance r² of the resulting PGS directly from the Gibbs sampler of LDpred2-auto. Finally, we extend the set of HapMap3 variants recommended for use with LDpred2 by 37% to improve the coverage of this set, and we show that this new set of variants captures 12% more heritability and provides 6% more predictive performance, on average, in UK Biobank analyses.
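
To illustrate what "well calibrated" means for per-variant causal probabilities, the following minimal Python sketch bins variants by their assigned probability and compares the mean probability in each bin with the fraction of variants that are truly causal. It is not the paper's code or analysis: the function name `calibration_table`, the number of bins, and the simulated data are all assumptions made for illustration, and the probabilities here are drawn at random rather than produced by LDpred2-auto.

```python
import numpy as np


def calibration_table(prob, is_causal, n_bins=10):
    """Per-bin (mean predicted probability, observed causal fraction, count).

    Hypothetical helper for illustration; not part of LDpred2 or bigsnpr.
    """
    prob = np.asarray(prob, dtype=float)
    is_causal = np.asarray(is_causal, dtype=bool)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bin_idx = np.digitize(prob, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            rows.append(
                (prob[mask].mean(), is_causal[mask].mean(), int(mask.sum()))
            )
    return rows


# Simulated stand-in data (assumption): each variant is causal with exactly
# the probability assigned to it, so calibration is perfect by construction.
rng = np.random.default_rng(0)
sim_prob = rng.uniform(0.0, 1.0, size=100_000)
sim_causal = rng.uniform(size=sim_prob.size) < sim_prob

table = calibration_table(sim_prob, sim_causal)
for mean_p, frac_causal, n in table:
    print(f"mean prob {mean_p:.3f}  observed causal {frac_causal:.3f}  n={n}")
```

In a real check, `sim_prob` would be replaced by the per-variant probabilities reported by the method and `sim_causal` by the known causal status from a simulation; probabilities are well calibrated when the two per-bin columns agree closely.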

Keywords: LDpred2; inference.

MeSH terms

  • Bayes Theorem
  • Genome-Wide Association Study* / methods
  • Humans
  • Multifactorial Inheritance* / genetics
  • Polymorphism, Single Nucleotide / genetics