Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci

Nat Commun. 2022 Aug 24;13(1):4659. doi: 10.1038/s41467-022-32358-1.

Abstract

Splicing quantitative trait loci (sQTLs) are one of the major causal mechanisms in genome-wide association study (GWAS) loci, but their role in disease pathogenesis is poorly understood. One reason is the complexity of alternative splicing events, which produce many unknown isoforms. Here, we propose two approaches to address this complexity, namely integration and selection, both of which focus on the protein structure of isoforms. First, we integrate isoforms with the same coding sequence (CDS) and identify 369-601 integrated-isoform ratio QTLs (i2-rQTLs), which alter protein structure, in six immune subsets. Second, we select CDS-incomplete isoforms annotated in GENCODE and identify 175-337 isoform-ratio QTLs (i-rQTLs). By comprehensive long-read capture RNA-sequencing of these incomplete isoforms, we reveal 29 full-length isoforms with unannotated CDSs associated with GWAS traits. Furthermore, we show that disease-causal sQTL genes can be identified by evaluating their trans-eQTL effects. Our approaches highlight the understudied role of protein-altering sQTLs and are broadly applicable to other tissues and diseases.
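
The integration step can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that per-sample isoform expression (for example TPM) is available, that each isoform has been assigned a key identifying its CDS, and that the integrated-isoform ratio is the summed expression of isoforms sharing a CDS divided by the total expression of the gene. The function name, the input dictionaries, and the toy identifiers are all hypothetical.

```python
from collections import defaultdict


def integrated_isoform_ratios(isoform_expr, isoform_to_cds, isoform_to_gene):
    """Sum expression of isoforms sharing a CDS, then divide by the gene total.

    isoform_expr:    {isoform_id: expression in one sample, e.g. TPM}
    isoform_to_cds:  {isoform_id: CDS key, e.g. the coding sequence or its hash}
    isoform_to_gene: {isoform_id: gene_id}
    Returns {(gene_id, cds_key): ratio}. Genes with zero total expression
    are skipped because their ratio is undefined.
    """
    cds_sum = defaultdict(float)   # expression per (gene, CDS) group
    gene_sum = defaultdict(float)  # total expression per gene
    for iso, expr in isoform_expr.items():
        gene = isoform_to_gene[iso]
        cds_sum[(gene, isoform_to_cds[iso])] += expr
        gene_sum[gene] += expr
    return {
        key: value / gene_sum[key[0]]
        for key, value in cds_sum.items()
        if gene_sum[key[0]] > 0
    }


# Toy data (assumed, not from the study): tx1 and tx2 differ only outside
# the CDS, so they collapse into one integrated isoform.
expr = {"tx1": 6.0, "tx2": 2.0, "tx3": 2.0}
cds = {"tx1": "cdsA", "tx2": "cdsA", "tx3": "cdsB"}
gene = {"tx1": "G1", "tx2": "G1", "tx3": "G1"}
ratios = integrated_isoform_ratios(expr, cds, gene)
```

Under these assumptions, a ratio computed per sample would then serve as the phenotype in a standard QTL association test against genotype, so that only variants shifting the balance between distinct protein-coding products are detected.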

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing / genetics
  • Disease Susceptibility
  • Genome-Wide Association Study*
  • Humans
  • Polymorphism, Single Nucleotide
  • Protein Isoforms / genetics
  • Quantitative Trait Loci* / genetics

Substances

  • Protein Isoforms