Computational analysis of CLIP-seq data

Methods. 2017 Apr 15:118-119:60-72. doi: 10.1016/j.ymeth.2017.02.006. Epub 2017 Feb 22.

Abstract

CLIP-seq experiments are currently the most important means for determining the binding sites of RNA-binding proteins on a genome-wide level. The computational analysis can be divided into three steps. In the first step, pre-processing, raw reads have to be trimmed and mapped to the genome. This step has to be adapted specifically to each CLIP-seq protocol. The next step is peak calling, which is required to remove nonspecific signals and to determine bona fide protein binding sites on target RNAs. Here, both protocol-specific approaches and generic peak callers are available. Although some peak callers are more widely used than others, each has its own strengths and drawbacks, so it can be advantageous to compare the results of several methods. Although peak calling is the final step in many CLIP-seq publications, an important follow-up task is the determination of binding models from CLIP-seq data. This is central because CLIP-seq experiments are highly dependent on the transcriptional state of the cell in which the experiment was performed: a transcript that is not expressed in the assayed cells cannot yield a detectable binding site. Thus, relying solely on binding sites determined by CLIP-seq in different cells or conditions can lead to a high false negative rate. This shortcoming can, however, be circumvented by applying models that predict additional putative binding sites.
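To make the peak-calling step more concrete, the following is a minimal Python sketch of one generic idea: count mapped read starts in fixed-size windows along a transcript and flag windows whose count is unlikely under a uniform Poisson background. This is an illustrative assumption, not the method of any particular peak caller covered by the review. The function names (`poisson_sf`, `call_peaks`), the window size, and the significance threshold are all hypothetical, and real tools additionally model protocol-specific signals, transcript abundance, and multiple testing.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam) by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_peaks(read_starts, transcript_length, window=10, alpha=1e-3):
    """Toy peak caller (hypothetical): report windows enriched in read starts.

    read_starts: 0-based start positions of mapped reads on one transcript.
    Returns a list of (start, end, count, p_value) tuples, end exclusive.
    """
    counts = Counter(pos // window for pos in read_starts)
    n_windows = math.ceil(transcript_length / window)
    # Assumed background: reads spread uniformly over all windows.
    lam = len(read_starts) / n_windows
    peaks = []
    for w in sorted(counts):
        p = poisson_sf(counts[w], lam)
        if p < alpha:
            end = min((w + 1) * window, transcript_length)
            peaks.append((w * window, end, counts[w], p))
    return peaks


if __name__ == "__main__":
    # 30 scattered background reads plus 20 reads piled up at 500-509.
    background = [i * 33 for i in range(30)]
    cluster = [500 + (i % 10) for i in range(20)]
    for start, end, count, p in call_peaks(background + cluster, 1000):
        print(start, end, count, p)
```

With the synthetic input above, only the window covering positions 500 to 510 is reported, since every background window holds a single read. Comparing the output of such a simple baseline with that of established peak callers is one way to see where their protocol-specific modelling changes the result.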

Keywords: CLIP-seq data analysis; Peak calling; RBP binding models; RBP binding site prediction.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antibodies / chemistry
  • Base Sequence
  • Binding Sites
  • Cell Line
  • Gene Library
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Immunoprecipitation / methods*
  • Nucleic Acid Conformation
  • Protein Binding
  • RNA / chemistry*
  • RNA / genetics
  • RNA / metabolism
  • RNA-Binding Proteins / genetics*
  • RNA-Binding Proteins / metabolism
  • Sequence Analysis, RNA / methods
  • Sequence Analysis, RNA / statistics & numerical data*
  • Software*
  • Transcriptome

Substances

  • Antibodies
  • RNA-Binding Proteins
  • RNA