iDHS-FFLG: Identifying DNase I Hypersensitive Sites by Feature Fusion and Local-Global Feature Extraction Network

Interdiscip Sci. 2023 Jun;15(2):155-170. doi: 10.1007/s12539-022-00538-8. Epub 2022 Sep 27.

Abstract

DNase I hypersensitive sites (DHSs) are active chromatin regions that are highly sensitive to DNase I. These regions contain various cis-regulatory elements, including promoters, enhancers and silencers. Accurate identification of DHSs helps researchers better understand the transcriptional machinery of DNA and deepens knowledge of functional DNA elements in non-coding sequences. Many methods based on traditional experiments and machine learning have been developed to identify DHSs, but their limited prediction accuracy and robustness restrict their application in genetics research. This paper proposes iDHS-FFLG, a deep learning approach that combines feature fusion with a local-global feature extraction network to identify DHSs in mouse. First, multiple binary features of nucleotides are fused to better express sequence information. Then, a network consisting of a convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM) and a self-attention mechanism extracts local features and global contextual associations. Finally, a prediction module distinguishes DHSs from non-DHSs. The results of several experiments demonstrate the superior performance of iDHS-FFLG compared with the latest methods.
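
The feature fusion step can be illustrated with a minimal sketch. The abstract does not say which binary features are fused, so the two encodings used here (one-hot and nucleotide chemical property, NCP) are assumptions chosen because both are common binary nucleotide encodings. The function name fuse_binary_features and the all-zero handling of ambiguous bases are also hypothetical, not taken from the paper.

```python
# Assumed encodings: the abstract only says "multiple binary features".
# One-hot: one bit per nucleotide identity.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# NCP: (ring structure: purine=1, functional group: amino=1, hydrogen bond: weak=1).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
}

FUSED_WIDTH = 4 + 3  # bits per position after concatenation


def fuse_binary_features(sequence):
    """Return one fused 7-bit tuple per nucleotide (hypothetical helper).

    Ambiguous or unknown bases (e.g. "N") are mapped to all zeros,
    which is an assumption made for this sketch.
    """
    fused = []
    for base in sequence.upper():
        if base in ONE_HOT:
            # Fusion here is plain concatenation of the two encodings.
            fused.append(ONE_HOT[base] + NCP[base])
        else:
            fused.append((0,) * FUSED_WIDTH)
    return fused


if __name__ == "__main__":
    for base, vector in zip("ACGT", fuse_binary_features("ACGT")):
        print(base, vector)
```

A matrix of this shape (sequence length by 7) is the kind of input that a CNN, BiLSTM and self-attention stack would consume; the actual feature set and dimensions used by iDHS-FFLG may differ.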

Keywords: DNase I hypersensitivity sites; Deep learning; Feature fusion; Local–global feature.

MeSH terms

  • Algorithms*
  • Animals
  • DNA
  • Deoxyribonuclease I* / genetics
  • Deoxyribonuclease I* / metabolism
  • Mice
  • Regulatory Sequences, Nucleic Acid
  • Sequence Analysis, DNA / methods

Substances

  • Deoxyribonuclease I
  • DNA