Improved discovery of RNA-binding protein binding sites in eCLIP data using DEWSeq

Nucleic Acids Res. 2024 Jan 11;52(1):e1. doi: 10.1093/nar/gkad998.

Abstract

Enhanced crosslinking and immunoprecipitation (eCLIP) sequencing is a method for transcriptome-wide detection of binding sites of RNA-binding proteins (RBPs). However, identified crosslink sites can deviate from experimentally established functional elements of even well-studied RBPs. Current peak-calling strategies result in low replication and high false positive rates. Here, we present the R/Bioconductor package DEWSeq, which makes use of replicate information and size-matched input controls. We benchmarked DEWSeq on 107 RBPs for which both eCLIP data and RNA sequence motifs are available and were able to more than double the number of motif-containing binding regions relative to standard eCLIP processing. The improvement relates not only to the number of binding sites (3.1-fold with known motifs for RBFOX2), but also to their subcellular localization (1.9-fold increase in mitochondrial genes for FASTKD2) and structural targets (2.2-fold increase in stem-loop regions for SLBP). On several orthogonal CLIP-seq datasets, DEWSeq recovers a larger number of motif-containing binding sites (3.3-fold). DEWSeq is a well-documented R/Bioconductor package, scalable to adequate numbers of replicates, and tends to substantially increase the proportion and total number of RBP binding sites containing biologically relevant features.

MeSH terms

  • Binding Sites
  • Immunoprecipitation
  • Protein Binding
  • RNA / chemistry
  • RNA-Binding Proteins* / genetics
  • RNA-Binding Proteins* / metabolism
  • Software*

Substances

  • RNA
  • RNA-Binding Proteins