FCNGRU: Locating Transcription Factor Binding Sites by Combing Fully Convolutional Neural Network With Gated Recurrent Unit

IEEE J Biomed Health Inform. 2022 Apr;26(4):1883-1890. doi: 10.1109/JBHI.2021.3117616. Epub 2022 Apr 14.

Abstract

Deciphering the relationship between transcription factors (TFs) and DNA sequences aids the computational inference of gene regulation and a comprehensive understanding of gene regulatory mechanisms. Transcription factor binding sites (TFBSs) are short, specific DNA sequences that play a pivotal role in controlling gene expression through interaction with TF proteins. Although many computational and deep learning methods have recently been proposed to predict the sequence specificity of TF-DNA binding, effective methods for directly locating TFBSs are still lacking. To address this problem, we propose FCNGRU, which combines a fully convolutional neural network (FCN) with a gated recurrent unit (GRU) to locate TFBSs directly. Furthermore, we present a two-task framework (FCNGRU-double): one task is a nucleotide-level classification that predicts a probability for each nucleotide and locates TFBSs, and the other is a sequence-level regression that predicts the intensity of each sequence. A series of experiments is conducted on 45 in vitro datasets collected from the UniPROBE database, which are derived from universal protein binding microarrays (uPBMs). Compared with competing methods, FCNGRU-double achieves much better results on these datasets. Moreover, FCNGRU-double has an advantage over a single-task framework, FCNGRU-single, which contains only the TFBS-locating branch. In addition, we incorporate in vivo datasets for further analysis and discussion.
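The two-task idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the input encoding, the loss functions, the task weighting, or how probabilities are converted into site coordinates. The one-hot encoding, the binary cross-entropy plus squared-error loss, the `weight` parameter, the 0.5 threshold, and the function names `one_hot`, `locate_sites`, and `two_task_loss` are all assumptions made for illustration.

```python
import math

def one_hot(seq):
    """Encode a DNA string as rows over (A, C, G, T).
    Assumed encoding; unknown bases such as N become all zeros."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows

def locate_sites(probs, threshold=0.5):
    """Turn per-nucleotide probabilities into half-open (start, end)
    intervals of consecutive positions at or above the threshold.
    The threshold value is a hypothetical choice."""
    sites = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # a candidate site opens here
        elif p < threshold and start is not None:
            sites.append((start, i))       # the site closes before position i
            start = None
    if start is not None:                  # site runs to the end of the sequence
        sites.append((start, len(probs)))
    return sites

def two_task_loss(probs, labels, pred_intensity, true_intensity, weight=1.0):
    """Assumed joint objective: mean binary cross-entropy over nucleotides
    (classification task) plus weighted squared error on the sequence
    intensity (regression task)."""
    eps = 1e-12                            # guards against log(0)
    bce = -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(probs, labels)
    ) / len(probs)
    mse = (pred_intensity - true_intensity) ** 2
    return bce + weight * mse

if __name__ == "__main__":
    print(one_hot("ACGTN"))
    print(locate_sites([0.1, 0.9, 0.8, 0.2, 0.7]))
    print(two_task_loss([0.5, 0.5], [1, 0], 2.0, 1.0))
```

In this sketch the per-nucleotide probabilities stand in for the output of the classification branch, and the scalar intensity stands in for the output of the regression branch. Dropping the squared-error term would correspond to a single-task setup that keeps only the locating branch.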

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites / genetics
  • Computational Biology* / methods
  • DNA / chemistry
  • Humans
  • Neural Networks, Computer*
  • Nucleotides / metabolism
  • Protein Binding
  • Transcription Factors / genetics
  • Transcription Factors / metabolism

Substances

  • Nucleotides
  • Transcription Factors
  • DNA