microTSS: accurate microRNA transcription start site identification reveals a significant number of divergent pri-miRNAs

Nat Commun. 2014 Dec 10;5:5700. doi: 10.1038/ncomms6700.

Abstract

A large fraction of microRNAs (miRNAs) are derived from intergenic non-coding loci and the identification of their promoters remains 'elusive'. Here, we present microTSS, a machine-learning algorithm that provides highly accurate, single-nucleotide resolution predictions for intergenic miRNA transcription start sites (TSSs). MicroTSS integrates high-resolution RNA-sequencing data with active transcription marks derived from chromatin immunoprecipitation and DNase-sequencing to enable the characterization of tissue-specific promoters. MicroTSS is validated with a specifically designed Drosha-null/conditional-null mouse model, generated using the conditional by inversion (COIN) methodology. Analyses of global run-on sequencing data revealed numerous pri-miRNAs in human and mouse either originating from divergent transcription at promoters of active genes or partially overlapping with annotated long non-coding RNAs. MicroTSS is readily applicable to any cell or tissue samples and constitutes the missing part towards integrating the regulation of miRNA transcription into the modelling of tissue-specific regulatory networks.
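
The integration step described in the abstract can be illustrated with a toy sketch. This is not the microTSS implementation: the feature names (`rna_seq`, `h3k4me3`, `pol2`, `dnase`), the weights, and the bias are hypothetical values chosen for illustration, and a fixed linear decision function stands in for a trained support vector machine. It only shows the general idea of scoring RNA-seq-derived candidate sites with active transcription marks and keeping the best-supported one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A candidate TSS upstream of a miRNA (all fields are illustrative)."""
    position: int    # genomic coordinate of the candidate
    rna_seq: float   # RNA-seq signal at the candidate
    h3k4me3: float   # ChIP-seq signal for an active promoter mark (assumed)
    pol2: float      # ChIP-seq signal for RNA Polymerase II (assumed)
    dnase: float     # DNase-seq accessibility signal


# Hypothetical weights and bias; a real model would learn these from
# labelled training data rather than use hand-picked constants.
WEIGHTS = {"rna_seq": 1.0, "h3k4me3": 0.8, "pol2": 0.6, "dnase": 0.5}
BIAS = -2.0


def score(c: Candidate) -> float:
    """Linear decision value: weighted sum of the features plus a bias."""
    return (
        WEIGHTS["rna_seq"] * c.rna_seq
        + WEIGHTS["h3k4me3"] * c.h3k4me3
        + WEIGHTS["pol2"] * c.pol2
        + WEIGHTS["dnase"] * c.dnase
        + BIAS
    )


def predict_tss(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the highest-scoring candidate with a positive decision value,
    or None if no candidate is classified as a TSS."""
    positives = [c for c in candidates if score(c) > 0]
    return max(positives, key=score, default=None)


candidates = [
    Candidate(position=1000, rna_seq=0.5, h3k4me3=0.2, pol2=0.1, dnase=0.3),
    Candidate(position=1500, rna_seq=2.0, h3k4me3=1.5, pol2=1.0, dnase=1.2),
    Candidate(position=2100, rna_seq=1.8, h3k4me3=0.1, pol2=0.0, dnase=0.2),
]

best = predict_tss(candidates)
print(best.position if best else "no TSS predicted")
```

In this toy input, the candidate at position 2100 has strong RNA-seq signal but almost no chromatin support, so it falls below the decision threshold. That is the point of combining expression evidence with active transcription marks rather than relying on either alone.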

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Chromatin Immunoprecipitation
  • Cluster Analysis
  • Computational Biology
  • Embryonic Stem Cells / cytology
  • Humans
  • Mice
  • Mice, Transgenic
  • MicroRNAs / genetics*
  • Models, Genetic
  • Oligonucleotides, Antisense / genetics
  • Promoter Regions, Genetic
  • RNA Polymerase II / metabolism
  • RNA, Messenger / metabolism
  • RNA, Untranslated / metabolism
  • Ribonuclease III / genetics
  • Sequence Analysis, RNA
  • Support Vector Machine
  • Transcription Initiation Site*

Substances

  • MicroRNAs
  • Oligonucleotides, Antisense
  • RNA, Messenger
  • RNA, Untranslated
  • RNA Polymerase II
  • Drosha protein, mouse
  • Ribonuclease III

Associated data

  • GEO/GSE55735