Towards in silico CLIP-seq: predicting protein-RNA interaction via sequence-to-signal learning

Genome Biol. 2023 Aug 4;24(1):180. doi: 10.1186/s13059-023-03015-7.

Abstract

We present RBPNet, a novel deep learning method that predicts the distribution of CLIP-seq crosslink counts from RNA sequence at single-nucleotide resolution. Trained on up to a million regions, RBPNet generalizes well across eCLIP, iCLIP and miCLIP assays and outperforms state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of a protein-specific signal and a background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs, and it enables variant-impact scoring via in silico mutagenesis. Together, these capabilities improve both the imputation of protein-RNA interactions and the mechanistic interpretation of predictions.
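
The mixture idea behind the bias correction can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`mix_profiles`, `multinomial_nll`), the scalar mixing weight, the softmax parameterization of each per-position distribution, and the multinomial likelihood are all assumptions made here for illustration, since the abstract states only that the raw signal is modeled as a mixture of a protein-specific and a background component.

```python
import numpy as np


def softmax(logits):
    """Turn per-position logits into a probability distribution."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def mix_profiles(target_logits, background_logits, mix_logit):
    """Combine two per-position distributions with one mixing weight.

    Hypothetical helper: a single scalar weight is an assumption of this sketch.
    """
    pi = 1.0 / (1.0 + np.exp(-mix_logit))  # sigmoid keeps the weight in (0, 1)
    p_target = softmax(target_logits)
    p_background = softmax(background_logits)
    # A convex combination of two distributions is again a distribution.
    return pi * p_target + (1.0 - pi) * p_background


def multinomial_nll(counts, probs, eps=1e-12):
    """Negative log-likelihood of observed counts, omitting the constant
    multinomial coefficient (assumed loss, for illustration only)."""
    return -np.sum(counts * np.log(probs + eps))


# Toy data: an 8-nucleotide window with random logits and made-up counts.
rng = np.random.default_rng(0)
length = 8
target_logits = rng.normal(size=length)
background_logits = rng.normal(size=length)
counts = np.array([0, 3, 10, 2, 0, 0, 1, 0])

p_total = mix_profiles(target_logits, background_logits, mix_logit=0.5)
loss = multinomial_nll(counts, p_total)
```

In a setup like this, the observed counts would be fitted against the mixed distribution `p_total`, while the protein-specific component alone (`softmax(target_logits)`) would be read out as the bias-corrected prediction.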

Keywords: CLIP-seq; Computational biology; Deep learning; Protein-RNA interaction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Base Sequence*
  • Bias
  • Binding Sites
  • Computer Simulation*
  • Consensus Sequence
  • Datasets as Topic
  • Deep Learning*
  • Humans
  • Internet
  • Mutation
  • Nucleotide Motifs
  • Nucleotides / metabolism
  • RNA Splice Sites
  • RNA* / chemistry
  • RNA* / genetics
  • RNA* / metabolism
  • RNA, Messenger / chemistry
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • RNA, Viral / chemistry
  • RNA, Viral / genetics
  • RNA, Viral / metabolism
  • RNA-Binding Proteins* / chemistry
  • RNA-Binding Proteins* / metabolism

Substances

  • Nucleotides
  • RNA
  • RNA Splice Sites
  • RNA, Messenger
  • RNA, Viral
  • RNA-Binding Proteins