Predicting functional consequences of SNPs on mRNA translation via machine learning

Nucleic Acids Res. 2023 Aug 25;51(15):7868-7881. doi: 10.1093/nar/gkad576.

Abstract

The functional impact of single nucleotide polymorphisms (SNPs) on translation has yet to be considered when prioritizing disease-causing SNPs from genome-wide association studies (GWAS). Here we apply machine learning models to genome-wide ribosome profiling data to predict SNP function by forecasting ribosome collisions during mRNA translation. SNPs causing remarkable ribosome occupancy changes are named RibOc-SNPs (Ribosome-Occupancy-SNPs). We found that disease-related SNPs tend to cause notable changes in ribosome occupancy, suggesting translational regulation as an essential pathogenesis step. Nucleotide conversions, such as 'G → T', 'T → G' and 'C → A', are enriched in RibOc-SNPs, with the most significant impact on ribosome occupancy, while 'A → G' (or 'A → I' RNA editing) and 'G → A' are less deterministic. Among amino acid conversions, 'Glu → stop (codon)' shows the most significant enrichment in RibOc-SNPs. Interestingly, there is selection pressure on stop codons with a lower collision likelihood. RibOc-SNPs are enriched at the 5'-coding sequence regions, implying hot spots of translation initiation regulation. Strikingly, ∼22.1% of the RibOc-SNPs lead to opposite changes in ribosome occupancy on alternative transcript isoforms, suggesting that SNPs can amplify the differences between splicing isoforms by oppositely regulating their translation efficiency.
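The isoform-level observation above can be illustrated with a minimal sketch. It assumes a table of per-isoform predicted occupancy changes is already available; the function name `opposite_effect_snps`, the `(snp_id, isoform_id, delta)` record layout, the `threshold` parameter, and the toy values are all hypothetical and are not taken from the paper's pipeline. The sketch only shows how SNPs with opposite-signed effects on different isoforms might be flagged and counted.

```python
from collections import defaultdict


def opposite_effect_snps(records, threshold=0.0):
    """Return sorted SNP IDs with opposite-signed effects across isoforms.

    records: iterable of (snp_id, isoform_id, delta) tuples, where delta is a
        hypothetical predicted change in ribosome occupancy (alt minus ref).
    threshold: minimum absolute delta treated as a change (assumed cutoff).
    """
    signs = defaultdict(set)
    for snp_id, _isoform_id, delta in records:
        if delta > threshold:
            signs[snp_id].add(1)    # occupancy increases on this isoform
        elif delta < -threshold:
            signs[snp_id].add(-1)   # occupancy decreases on this isoform
    # Keep only SNPs seen with both an increase and a decrease
    return sorted(snp for snp, seen in signs.items() if seen == {1, -1})


# Toy, made-up values for illustration only (not data from the study)
toy = [
    ("snpA", "tx1", 0.8), ("snpA", "tx2", -0.5),
    ("snpB", "tx1", 0.4), ("snpB", "tx2", 0.6),
    ("snpC", "tx1", -0.3),
]
flagged = opposite_effect_snps(toy)
fraction = len(flagged) / len({snp_id for snp_id, _, _ in toy})
```

In this toy input only `snpA` changes direction between isoforms, so `fraction` is the share of SNPs flagged. How the study itself defined a change in occupancy and arrived at the ∼22.1% figure is not specified in this abstract.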

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Codon, Terminator
  • Genome-Wide Association Study*
  • Machine Learning*
  • Polymorphism, Single Nucleotide*
  • Protein Biosynthesis*
  • RNA, Messenger* / genetics

Substances

  • Codon, Terminator
  • RNA, Messenger