A comprehensive revisit of the machine-learning tools developed for the identification of enhancers in the human genome

Proteomics. 2023 Jul;23(13-14):e2200409. doi: 10.1002/pmic.202200409. Epub 2023 Jun 7.

Abstract

Enhancers are non-coding DNA elements that play a crucial role in increasing the transcription rate of specific genes in the genome. Experimental identification of enhancers is constrained by experimental conditions and involves complicated, time-consuming, laborious, and costly steps. To overcome these challenges, computational platforms that enable high-throughput identification of enhancers have been developed to complement experimental methods. Over the last few years, the development of various computational tools for enhancers has led to significant progress in predicting putative enhancers. As a result, researchers can now draw on a variety of strategies to advance the study of enhancers. This review provides an overview of machine learning (ML)-based prediction methods for enhancer identification and of related databases. The existing enhancer-prediction methods are also reviewed with regard to their algorithms, feature selection processes, validation techniques, and software utility. In addition, the advantages and drawbacks of these ML approaches are highlighted, along with guidelines for developing bioinformatic tools for more efficient enhancer prediction. This review will serve as a useful resource for experimentalists in selecting the appropriate ML tool for their study, and for bioinformaticians in developing more accurate and advanced ML-based predictors.

Keywords: bioinformatics; deep learning; enhancer; machine learning; sequence analysis; webserver.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Enhancer Elements, Genetic*
  • Genome, Human*
  • Humans
  • Machine Learning