DeepDRBP-2L: A New Genome Annotation Predictor for Identifying DNA-Binding Proteins and RNA-Binding Proteins Using Convolutional Neural Network and Long Short-Term Memory

IEEE/ACM Trans Comput Biol Bioinform. 2021 Jul-Aug;18(4):1451-1463. doi: 10.1109/TCBB.2019.2952338. Epub 2021 Aug 6.

Abstract

DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs) are two crucial classes of proteins associated with various cellular activities and some important diseases. Accurate identification of DBPs and RBPs facilitates both theoretical research and real-world applications. Existing sequence-based DBP predictors can accurately identify DBPs but incorrectly predict many RBPs as DBPs, and vice versa, resulting in low prediction precision. Moreover, some proteins that interact with both DNA and RNA (DRBPs) play important roles in gene expression and cannot be identified by existing computational methods. In this study, a two-level predictor named DeepDRBP-2L was proposed by combining a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. It is the first computational method that is able to identify DBPs, RBPs and DRBPs. Rigorous cross-validations and independent tests showed that DeepDRBP-2L is able to overcome the shortcomings of the existing methods and can go one step further to identify DRBPs. Application of DeepDRBP-2L to the tomato genome further demonstrated its performance. The webserver of DeepDRBP-2L is freely available at http://bliulab.net/DeepDRBP-2L.
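
The abstract does not say what each of the two levels decides, so the following is only a minimal sketch of how a two-level cascade of this kind could be wired together, not the authors' implementation. It assumes that level 1 separates nucleic-acid-binding proteins from non-binding ones and that level 2 scores DNA binding and RNA binding separately. The names `two_level_predict`, `level1`, `level2` and `threshold` are hypothetical, and the scorers stand in for trained CNN/LSTM models.

```python
from typing import Callable, Tuple

# Hypothetical scorer types; in practice these would wrap trained models.
Level1 = Callable[[str], float]                # P(protein binds any nucleic acid)
Level2 = Callable[[str], Tuple[float, float]]  # (P(binds DNA), P(binds RNA))


def two_level_predict(seq: str, level1: Level1, level2: Level2,
                      threshold: float = 0.5) -> str:
    """Assign one of 'non-binding', 'DBP', 'RBP' or 'DRBP' to a sequence.

    Assumed cascade (not taken from the paper): level 1 filters out
    non-binding proteins, level 2 decides which nucleic acid is bound.
    """
    # Level 1: stop early if the protein is unlikely to bind DNA or RNA.
    if level1(seq) < threshold:
        return "non-binding"

    # Level 2: independent DNA and RNA scores allow a dual-binding label.
    p_dna, p_rna = level2(seq)
    binds_dna = p_dna >= threshold
    binds_rna = p_rna >= threshold

    if binds_dna and binds_rna:
        return "DRBP"
    if binds_dna:
        return "DBP"
    if binds_rna:
        return "RBP"
    # Level 1 said "binding" but neither score passed: keep the larger one.
    return "DBP" if p_dna >= p_rna else "RBP"
```

Scoring DNA and RNA binding independently at the second level, rather than forcing a single choice, is what lets a cascade like this report a protein as a DRBP instead of assigning it to only one class.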

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • DNA-Binding Proteins* / chemistry
  • DNA-Binding Proteins* / genetics
  • DNA-Binding Proteins* / metabolism
  • Genomics / methods*
  • Molecular Sequence Annotation / methods*
  • Neural Networks, Computer*
  • RNA-Binding Proteins* / chemistry
  • RNA-Binding Proteins* / genetics
  • RNA-Binding Proteins* / metabolism

Substances

  • DNA-Binding Proteins
  • RNA-Binding Proteins