iDRBP-ECHF: Identifying DNA- and RNA-binding proteins based on extensible cubic hybrid framework

Comput Biol Med. 2022 Oct:149:105940. doi: 10.1016/j.compbiomed.2022.105940. Epub 2022 Aug 13.

Abstract

Proteins interact with nucleic acids to regulate the life activities of organisms, so accurate and efficient identification of nucleic acid-binding proteins (NABPs) is important. Several sequence-based computational methods have been proposed to identify DNA- and RNA-binding proteins. However, the benchmark datasets used by these methods ignore the real-world proportion of NABPs, and some integration methods combine only traditional machine learning algorithms, which limits prediction performance. In this study, we propose a sequence-based method called iDRBP-ECHF to predict DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs). We constructed a benchmark dataset that reflects the real-world proportion of positive and negative samples, and used down-sampling to generate three relatively balanced datasets for training iDRBP-ECHF. In addition, we incorporated deep learning algorithms into the framework to obtain a more compact high-level feature representation of the input data. Results on two independent datasets show that iDRBP-ECHF achieves state-of-the-art performance and outperforms other existing sequence-based DBP and RBP prediction methods. A webserver for iDRBP-ECHF is available at http://bliulab.net/iDRBP-ECHF.
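The down-sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the sampling procedure, so everything here is an assumption made for illustration: the function name `downsample_balanced_subsets`, the `neg_per_pos` ratio, the fixed seed, the choice to keep all positives in every subset, and the toy identifiers. It is not the authors' implementation.

```python
import random


def downsample_balanced_subsets(positives, negatives, n_subsets=3,
                                neg_per_pos=1.0, seed=0):
    """Build n_subsets training sets from an imbalanced pool.

    Hypothetical helper: each subset keeps every positive sample and
    adds an independent random draw of negatives, so the subsets are
    roughly balanced while together covering more of the negative pool.
    """
    rng = random.Random(seed)
    # Number of negatives per subset, capped by how many are available.
    n_neg = min(len(negatives), round(len(positives) * neg_per_pos))
    subsets = []
    for _ in range(n_subsets):
        sampled = rng.sample(negatives, n_neg)  # draw without replacement
        subset = [(seq, 1) for seq in positives] + [(seq, 0) for seq in sampled]
        rng.shuffle(subset)
        subsets.append(subset)
    return subsets


# Toy placeholder identifiers standing in for protein sequences.
positives = [f"NABP_{i}" for i in range(5)]
negatives = [f"nonNABP_{i}" for i in range(50)]

subsets = downsample_balanced_subsets(positives, negatives)
for i, subset in enumerate(subsets):
    n_pos = sum(label for _, label in subset)
    print(f"subset {i}: {n_pos} positives, {len(subset) - n_pos} negatives")
```

In this sketch the negatives are drawn independently for each subset, so the same negative sample may appear in more than one subset. Whether the published method uses overlapping or disjoint draws is not stated in the abstract.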

Keywords: DNA- and RNA-binding proteins identification; Extensible cubic hybrid framework; Machine learning; Multi-label learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Binding Sites
  • Computational Biology / methods
  • DNA / genetics
  • DNA-Binding Proteins / genetics
  • DNA-Binding Proteins / metabolism
  • Machine Learning*
  • RNA-Binding Proteins* / chemistry
  • RNA-Binding Proteins* / genetics
  • RNA-Binding Proteins* / metabolism

Substances

  • DNA-Binding Proteins
  • RNA-Binding Proteins
  • DNA