DeepSE: Detecting super-enhancers among typical enhancers using only sequence feature embeddings

Genomics. 2021 Nov;113(6):4052-4060. doi: 10.1016/j.ygeno.2021.10.007. Epub 2021 Oct 16.

Abstract

A super-enhancer (SE) is a cluster of active typical enhancers (TEs) with high levels of the Mediator complex, master transcription factors, and chromatin regulators. SEs play a key role in the control of cell identity and disease. Traditionally, scientists have used a variety of high-throughput data on different transcription factors or chromatin marks to distinguish SEs from TEs. Such experimental methods are usually costly and time-consuming. In this paper, we propose DeepSE, a model based on a deep convolutional neural network, to distinguish SEs from TEs. DeepSE represents DNA sequences using dna2vec feature embeddings. With only the DNA sequence information, DeepSE outperformed all state-of-the-art methods. In addition, DeepSE generalizes well across different cell lines, which implies that cell-type-specific SEs may share hidden sequence patterns across different cell lines. The source code and data are available on GitHub (https://github.com/QiaoyingJi/DeepSE).
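
As an illustration of the sequence representation described above, the following minimal sketch splits a DNA sequence into overlapping k-mers and maps each k-mer to a vector, giving a matrix that a convolutional network could take as input. It is not the DeepSE implementation: the function names, k = 3, the embedding dimension of 8, and the randomly initialized lookup table (standing in for pretrained dna2vec vectors) are all assumptions made for this example.

```python
import random
from itertools import product

def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1).

    k-mers containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    tokens = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            tokens.append(kmer)
    return tokens

def placeholder_embeddings(k=3, dim=8, seed=0):
    """Build a random lookup table for all 4**k k-mers.

    Assumption: a real pipeline would load pretrained dna2vec vectors
    here; random vectors only illustrate the shape of the data.
    """
    rng = random.Random(seed)
    return {
        "".join(p): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for p in product("ACGT", repeat=k)
    }

def embed_sequence(seq, table, k=3):
    """Return one embedding vector per k-mer, in sequence order."""
    return [table[t] for t in kmer_tokens(seq, k)]

table = placeholder_embeddings(k=3, dim=8)
matrix = embed_sequence("ACGTAC", table, k=3)
print(len(matrix), len(matrix[0]))  # number of k-mers, embedding dimension
```

The resulting list of vectors (one row per k-mer) is the kind of two-dimensional input over which a convolutional layer can slide its filters.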

Keywords: Convolutional neural network; Super-enhancers; dna2vec.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Chromatin* / genetics
  • Enhancer Elements, Genetic*
  • Neural Networks, Computer
  • Transcription Factors / genetics
  • Transcription Factors / metabolism

Substances

  • Chromatin
  • Transcription Factors