A Multilayered Post-Genome-Wide Association Study Analysis Pipeline Defines Functional Variants and Target Genes for Systemic Lupus Erythematosus

Arthritis Rheumatol. 2024 Feb 19. doi: 10.1002/art.42829. Online ahead of print.

Abstract

Objective: Systemic lupus erythematosus (SLE), an autoimmune disease with incompletely understood etiology, has a strong genetic component. Although genome-wide association studies (GWASs) have revealed multiple SLE susceptibility loci and associated single-nucleotide polymorphisms (SNPs), the precise causal variants, target genes, cell types, tissues, and mechanisms of action remain largely unknown.

Methods: Here, we report a comprehensive post-GWAS analysis using extensive bioinformatics, molecular modeling, and integrative functional genomic and epigenomic analyses to optimize fine-mapping. We compile and cross-reference immune cell-specific cis- and trans-expression quantitative trait loci (eQTLs) with promoter capture high-throughput chromosome conformation capture (PCHi-C), allele-specific chromatin accessibility, and massively parallel reporter assay data to define predisposing variants and target genes. We experimentally validate a predicted locus using CRISPR/Cas9 genome editing, quantitative polymerase chain reaction, and Western blot.
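
The cross-referencing step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the class `VariantEvidence`, the function `prioritize`, the rule that target genes are the union of eQTL and PCHi-C genes, and all SNP and gene names are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VariantEvidence:
    """Evidence layers for one candidate SNP (hypothetical structure)."""
    snp_id: str
    eqtl_genes: set = field(default_factory=set)    # genes linked by cis-/trans-eQTLs
    pchic_genes: set = field(default_factory=set)   # genes whose promoters contact the SNP
    allele_specific_accessibility: bool = False     # allelic imbalance in open chromatin
    mpra_enhancer_activity: bool = False            # allelic effect in a reporter assay

    def target_genes(self):
        # Assumption: a gene counts as a target if either layer links it to the SNP.
        return self.eqtl_genes | self.pchic_genes

    def has_regulatory_support(self):
        # "and/or": either functional readout is treated as sufficient.
        return self.allele_specific_accessibility or self.mpra_enhancer_activity


def prioritize(variants):
    """Keep SNPs with at least one target gene and one regulatory readout."""
    return [v for v in variants if v.target_genes() and v.has_regulatory_support()]


# Toy input with placeholder identifiers (not real data).
candidates = [
    VariantEvidence("snp_1", eqtl_genes={"GENE_A"}, pchic_genes={"GENE_B"},
                    mpra_enhancer_activity=True),
    VariantEvidence("snp_2", eqtl_genes={"GENE_C"}),          # no regulatory readout
    VariantEvidence("snp_3", allele_specific_accessibility=True),  # no target gene
]

prioritized = prioritize(candidates)
prioritized_ids = [v.snp_id for v in prioritized]
print(prioritized_ids)
```

In this toy input only `snp_1` carries both a target gene and a functional readout, so it is the only variant retained.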

Results: Anchoring on 452 index SNPs, we selected 9,931 high linkage disequilibrium (r² > 0.8) SNPs and defined 182 independent non-human leukocyte antigen (HLA) SLE loci. A total of 3,746 SNPs from 143 loci were identified as regulating 564 unique genes. Target genes are enriched in lupus-related tissues and associated with other autoimmune diseases. Of these SNPs, 329 (106 loci) showed significant allele-specific chromatin accessibility and/or enhancer activity, indicating regulatory potential. Using CRISPR/Cas9, we validated rs57668933 as a functional variant regulating multiple targets, including the SLE-risk gene ELF1, in B cells.
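
The linkage disequilibrium selection above (r² > 0.8 relative to an index SNP) can be sketched as a simple threshold filter. This is an illustrative assumption about the form of the input, not the authors' code: the function `select_ld_proxies`, the dictionary of pairwise r² values, and every identifier and value in the toy data are hypothetical.

```python
R2_THRESHOLD = 0.8  # threshold stated in the Results (strictly greater than)


def select_ld_proxies(index_snps, pairwise_r2, threshold=R2_THRESHOLD):
    """Map each index SNP to the SNPs in high LD with it.

    pairwise_r2 is assumed to map (index_snp, candidate_snp) -> r² value,
    for example as parsed from an LD calculation on a reference panel.
    """
    index_set = set(index_snps)
    proxies = {snp: set() for snp in index_snps}
    for (index_snp, candidate), r2 in pairwise_r2.items():
        if index_snp in index_set and r2 > threshold:
            proxies[index_snp].add(candidate)
    return proxies


# Toy r² values with placeholder identifiers (not real data).
toy_r2 = {
    ("index_1", "snp_a"): 0.95,
    ("index_1", "snp_b"): 0.80,  # excluded: not strictly above the threshold
    ("index_1", "snp_c"): 0.42,
    ("index_2", "snp_d"): 0.81,
}

ld_proxies = select_ld_proxies(["index_1", "index_2"], toy_r2)
print(ld_proxies)
```

Whether the index SNPs themselves are counted among the selected SNPs is not stated in the abstract, so the sketch returns proxies only and leaves that choice to the caller.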

Conclusion: We demonstrate and validate post-GWAS strategies for using multidimensional data to prioritize likely causal variants with cognate gene targets underlying SLE pathogenesis. Our results provide a catalog of significantly SLE-associated SNPs and loci, target genes, and likely biochemical mechanisms to guide experimental characterization.