Enhancers can be transcribed into RNAs; however, most of these transcripts are neglected in traditional RNA-seq analysis workflows. Here, we developed a Pipeline for Enhancer Transcription (PET, http://fun-science.club/PET) for quantifying enhancer RNAs (eRNAs) from RNA-seq data. By applying this pipeline to lung cancer samples and cell lines, we show that transcribed enhancers are enriched with histone marks and transcription factor motifs (JUNB, Hand1-Tcf3 and GATA4). By training a machine learning model, we demonstrate that enhancers predict prognosis better than their nearby genes. Integrating Hi-C, ChIP-seq and RNA-seq data, we observe that transcribed enhancers are associated with cancer hallmarks or oncogenes, among which LcsMYC-1 (Lung cancer-specific MYC eRNA-1) potentially supports MYC expression. Surprisingly, a significant proportion of transcribed enhancers contain small protein-coding open reading frames (sORFs) and can be translated into microproteins. Our study provides a computational method for eRNA quantification and deepens our understanding of the DNA, RNA and protein nature of enhancers.
Keywords: eRNA pipeline; enhancer RNA; sORF; transcription factor.
© 2020 UICC.