Integration of ENCODE RNAseq and eCLIP Data Sets

Methods Mol Biol. 2018:1720:111-129. doi: 10.1007/978-1-4939-7540-2_8.

Abstract

During the last decade, the study of mRNA decay has benefited greatly from a growing number of high-throughput assays that emerged from developments in next-generation sequencing (NGS) technologies and mass spectrometry. While assay-specific data analysis is often reported and the software made available, many researchers struggle with the challenge of integrating data from diverse assays, from different sources, and in different formats. Here we use Python, R, and bash to analyze and integrate RNAseq and eCLIP data publicly available from ENCODE. Annotation is performed with biomart, motif analysis with MEME, and functional enrichment analysis with DAVID. The analysis is centered on KHSRP eCLIP data from K562 cells as well as RNAseq data from KHSRP knockdown and the respective mock controls.
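
As a rough illustration of what integrating the two data types can look like, the Python sketch below joins gene-level eCLIP peak assignments with differential expression results from a knockdown versus mock comparison. It is not the chapter's actual pipeline: the record layouts, the gene names, the `padj_cutoff` value, and the choice to flag bound genes whose mRNA increases upon knockdown are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical toy records. Real inputs would be ENCODE eCLIP peak files
# annotated with gene IDs and a differential expression table from the
# knockdown vs. mock RNAseq comparison.
Peak = namedtuple("Peak", ["chrom", "start", "end", "gene_id"])
DiffExpr = namedtuple("DiffExpr", ["gene_id", "log2_fold_change", "padj"])


def bound_genes(peaks):
    """Return the set of gene IDs carrying at least one eCLIP peak."""
    return {p.gene_id for p in peaks}


def integrate(peaks, diff_expr, padj_cutoff=0.05):
    """Label each gene as bound or unbound and as significantly changed or not.

    padj_cutoff is an assumed significance threshold, not one from the source.
    """
    targets = bound_genes(peaks)
    return [
        {
            "gene_id": d.gene_id,
            "bound": d.gene_id in targets,
            "log2_fold_change": d.log2_fold_change,
            "significant": d.padj < padj_cutoff,
        }
        for d in diff_expr
    ]


def bound_and_upregulated(rows):
    """Sorted IDs of bound genes whose mRNA level rises significantly on knockdown."""
    return sorted(
        r["gene_id"]
        for r in rows
        if r["bound"] and r["significant"] and r["log2_fold_change"] > 0
    )


# Made-up example data: GENE_A has two peaks, GENE_B one, GENE_C none.
peaks = [
    Peak("chr1", 100, 150, "GENE_A"),
    Peak("chr1", 400, 460, "GENE_A"),
    Peak("chr2", 10, 60, "GENE_B"),
]
diff_expr = [
    DiffExpr("GENE_A", 1.2, 0.001),
    DiffExpr("GENE_B", -0.3, 0.40),
    DiffExpr("GENE_C", 2.0, 0.01),
]

rows = integrate(peaks, diff_expr)
print(bound_and_upregulated(rows))  # ['GENE_A']
```

Only GENE_A is reported here, because it is the only gene that both carries a peak and changes significantly in the positive direction. In a real analysis the resulting gene list could then be passed on to annotation, motif analysis, or functional enrichment steps.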

Keywords: Bioinformatics; ENCODE; Python; R; RNAseq; bash; eCLIP; mRNA decay.

MeSH terms

  • Computational Biology / instrumentation
  • Computational Biology / methods*
  • Databases, Nucleic Acid*
  • Datasets as Topic
  • Gene Knockdown Techniques
  • High-Throughput Nucleotide Sequencing
  • Humans
  • K562 Cells
  • RNA Stability*
  • RNA, Messenger / genetics*
  • RNA, Messenger / metabolism
  • RNA-Binding Proteins / genetics
  • Sequence Analysis, RNA
  • Software*
  • Trans-Activators / genetics

Substances

  • KHSRP protein, human
  • RNA, Messenger
  • RNA-Binding Proteins
  • Trans-Activators