ChromDL: a next-generation regulatory DNA classifier

Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i377-i385. doi: 10.1093/bioinformatics/btad217.

Abstract

Motivation: Predicting the regulatory function of non-coding DNA using only the DNA sequence remains a major challenge in genomics. With the advent of improved optimization algorithms, faster GPUs, and more sophisticated machine-learning libraries, hybrid convolutional and recurrent neural network architectures can be constructed and applied to extract crucial information from non-coding DNA.

Results: Using a comparative analysis of the performance of thousands of deep learning architectures, we developed ChromDL, a neural network architecture combining bidirectional gated recurrent units, convolutional neural networks, and bidirectional long short-term memory units. ChromDL significantly improves upon a range of prediction metrics compared to its predecessors in transcription factor binding site, histone modification, and DNase-I hypersensitive site detection. Combined with a secondary model, it can be used for accurate classification of gene regulatory elements. Compared with previously developed methods, the model can also detect weak transcription factor binding and has the potential to help delineate transcription factor binding motif specificities.
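
Sequence-only models of this kind typically take a numeric encoding of the DNA as input. The following is a minimal sketch of one common choice, one-hot encoding, written in plain Python. It is an illustration under stated assumptions, not code from the ChromDL repository: the function name `one_hot_encode`, the A/C/G/T column order, and the all-zero row for ambiguous bases are all hypothetical choices made here.

```python
# Illustrative sketch only: the encoding scheme and names below are
# assumptions, not taken from the ChromDL source code.

BASES = "ACGT"  # assumed column order for the one-hot matrix


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Each row has a 1 in the column of the matching base (A, C, G, T).
    Any other character, such as the ambiguity code N, maps to all zeros.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


if __name__ == "__main__":
    # Print each base of a short example sequence next to its encoded row.
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

A matrix of this shape (sequence length by 4) is the kind of input that convolutional and recurrent layers can consume directly; how ChromDL itself prepares its input should be checked against the linked source code.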

Availability and implementation: The ChromDL source code can be found at https://github.com/chrishil1/ChromDL.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Algorithms*
  • Benchmarking*
  • DNA
  • Deoxyribonuclease I
  • Transcription Factors

Substances

  • DNA
  • Deoxyribonuclease I
  • Transcription Factors