Identifying individual risk rare variants using protein structure guided local tests (POINT)

PLoS Comput Biol. 2019 Feb 19;15(2):e1006722. doi: 10.1371/journal.pcbi.1006722. eCollection 2019 Feb.

Abstract

Rare variants are of increasing interest to genetic association studies because of their etiological contributions to human complex diseases. Due to the rarity of the mutant events, rare variants are routinely analyzed on an aggregate level. While aggregation analyses improve the detection of global-level signal, they are not able to pinpoint causal variants within a variant set. To perform inference on a localized level, additional information, e.g., biological annotation, is often needed to boost the information content of a rare variant. Following the observation that important variants are likely to cluster together on functional domains, we propose a protein structure guided local test (POINT) to provide variant-specific association information using structure-guided aggregation of signal. Constructed under a kernel machine framework, POINT performs local association testing by borrowing information from neighboring variants in the 3-dimensional protein space in a data-adaptive fashion. Besides merely providing a list of promising variants, POINT assigns each variant a p-value to permit variant ranking and prioritization. We assess the selection performance of POINT using simulations and illustrate how it can be used to prioritize individual rare variants in PCSK9, ANGPTL4 and CETP in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) clinical trial data.
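The idea of borrowing signal from structural neighbors can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian weight on pairwise 3-dimensional distances, a rank-one local kernel K_m = G w_m w_m' G' for each variant m, a score-type statistic on the centered phenotype, and permutation p-values as a simple stand-in for whatever p-value calculation the paper uses. The function names, the `scale` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def local_weights(coords, scale):
    """Gaussian weights from pairwise 3D distances (assumed form).

    Row m holds the weights that variant m places on every variant,
    so nearby variants in protein space contribute more.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-(dist / scale) ** 2)

def local_score_stats(G, y, W):
    """Score-type statistic per variant: Q_m = (w_m' G' r)^2.

    This equals r' K_m r with K_m = G w_m w_m' G', where r is the
    centered phenotype and w_m is row m of W.
    """
    r = y - y.mean()
    s = G.T @ r              # marginal score of each variant
    return (W @ s) ** 2      # neighbor-weighted score, squared

def local_pvalues(G, y, coords, scale, n_perm=999, seed=0):
    """Permutation p-value for each variant (illustrative only)."""
    rng = np.random.default_rng(seed)
    W = local_weights(coords, scale)
    obs = local_score_stats(G, y, W)
    exceed = np.zeros_like(obs)
    for _ in range(n_perm):
        perm = rng.permutation(y)
        exceed += local_score_stats(G, perm, W) >= obs
    return (exceed + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Toy data: 200 subjects, 6 rare variants with made-up 3D positions.
    rng = np.random.default_rng(1)
    G = rng.binomial(1, 0.05, size=(200, 6)).astype(float)
    coords = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
                       [10, 10, 10], [11, 10, 10], [30, 0, 0]], dtype=float)
    y = rng.normal(size=200)
    print(local_pvalues(G, y, coords, scale=2.0, n_perm=199))
```

In this sketch a very small `scale` reduces the weight matrix to the identity, so each variant is tested on its own, while a large `scale` approaches a burden-like test over the whole set. Choosing `scale` in a data-adaptive way, as the abstract describes, is left out here.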

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Angiopoietin-Like Protein 4 / genetics
  • Cholesterol Ester Transfer Proteins / genetics
  • Computational Biology / methods*
  • Computer Simulation
  • Genetic Association Studies / methods*
  • Genetic Predisposition to Disease / genetics
  • Genetic Variation / genetics
  • Humans
  • Models, Genetic
  • Proprotein Convertase 9 / genetics
  • Protein Structure, Tertiary
  • Risk Factors
  • Sequence Analysis, DNA / methods*

Substances

  • ANGPTL4 protein, human
  • Angiopoietin-Like Protein 4
  • CETP protein, human
  • Cholesterol Ester Transfer Proteins
  • PCSK9 protein, human
  • Proprotein Convertase 9