Towards a better understanding of TF-DNA binding prediction from genomic features

Comput Biol Med. 2022 Oct:149:105993. doi: 10.1016/j.compbiomed.2022.105993. Epub 2022 Aug 17.

Abstract

Transcription factors (TFs) regulate gene expression by recognizing specific cis-regulatory elements in DNA sequences. Predicting TF-DNA binding has become a fundamental step in understanding the underlying cis-regulatory mechanisms. Because whether a particular genomic region is bound depends on multiple features, such as nucleotide arrangement, DNA shape, and epigenetic mechanisms, many researchers have developed computational methods to predict TF binding sites (TFBSs) from various genomic features. This paper provides a comprehensive compendium for better understanding TF-DNA binding from genomic features. We first summarize the commonly used datasets and data processing approaches. We then classify current deep learning methods for TFBS prediction according to the genomic features they use and analyze each technique's strengths and weaknesses. Furthermore, we illustrate how the functional consequences of TF-DNA binding can be characterized by prioritizing noncoding variants in identified motif instances. Finally, we discuss the challenges and opportunities of deep learning in TF-DNA binding prediction. This survey offers valuable insights for researchers modeling TF-DNA binding.
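
As an illustration only, and not taken from the survey itself, the sketch below shows two ideas the abstract touches on: turning a DNA sequence into a one-hot matrix (a common input representation for sequence-based deep learning models) and comparing a motif score between a reference and a variant sequence. The function names, the toy weight matrix, and the simple sum-of-weights scoring are all assumptions made for this example; they stand in for a learned model or a real position weight matrix.

```python
BASES = "ACGT"

def one_hot_encode(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Bases outside ACGT (for example N) are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0.0] * len(BASES)
        if base in index:
            row[index[base]] = 1.0
        rows.append(row)
    return rows

def motif_score(encoded, weights):
    """Sum the weights picked out by the one-hot rows, position by position."""
    return sum(
        x * w
        for row, weight_row in zip(encoded, weights)
        for x, w in zip(row, weight_row)
    )

# Hypothetical 4-position weight matrix with toy values (not a real TF motif).
TOY_WEIGHTS = [
    [1.0, -1.0, -1.0, -1.0],  # position 1 favors A
    [-1.0, 1.0, -1.0, -1.0],  # position 2 favors C
    [-1.0, -1.0, 1.0, -1.0],  # position 3 favors G
    [-1.0, -1.0, -1.0, 1.0],  # position 4 favors T
]

def variant_effect(ref_seq, alt_seq, weights):
    """Score difference (alt minus ref); negative means the variant weakens the match."""
    ref_score = motif_score(one_hot_encode(ref_seq), weights)
    alt_score = motif_score(one_hot_encode(alt_seq), weights)
    return alt_score - ref_score

if __name__ == "__main__":
    print(one_hot_encode("ACGN"))
    print(variant_effect("ACGT", "ACTT", TOY_WEIGHTS))
```

In this toy setting a G-to-T substitution at the third position lowers the score, which is the kind of reference-versus-alternate comparison used to rank noncoding variants inside a motif instance.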

Keywords: Deep learning; Genomic features; Motif discovery; Noncoding variant; TF-DNA binding.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Computational Biology* / methods
  • DNA / chemistry
  • DNA / genetics
  • Genomics*
  • Nucleotides / metabolism
  • Protein Binding
  • Transcription Factors / chemistry
  • Transcription Factors / genetics
  • Transcription Factors / metabolism

Substances

  • Nucleotides
  • Transcription Factors
  • DNA