DeepCap-Kcr: accurate identification and investigation of protein lysine crotonylation sites based on capsule network

Brief Bioinform. 2022 Jan 17;23(1):bbab492. doi: 10.1093/bib/bbab492.

Abstract

Lysine crotonylation (Kcr) is a posttranslational modification widely detected in histone and nonhistone proteins. It plays a vital role in human disease progression and in various cellular processes, including the cell cycle, cell organization and chromatin remodeling, and it is a key mechanism for increasing proteomic diversity. Thus, accurate information on such sites is beneficial for both drug development and basic research. Existing computational methods for identifying Kcr sites in proteins leave room for improvement. In this study, we propose a deep learning model, DeepCap-Kcr, a capsule network (CapsNet) based on a convolutional neural network (CNN) and long short-term memory (LSTM), for robust prediction of Kcr sites on histone and nonhistone proteins (mammals). The proposed model outperformed the existing CNN architecture Deep-Kcr and other well-established tools in most cases and provided promising outcomes for practical use; in particular, the proposed model characterized the internal hierarchical representation as well as the important features from multiple levels of abstraction, learned automatically from a small number of samples. The trained model generalized well to another species (papaya). Moreover, we showed that the features and properties generated by the internal capsule layer can be used to explore the internal data distribution related to biological significance (acting as a motif detector). The source code and data are freely available at https://github.com/Jhabindra-bioinfo/DeepCap-Kcr.

Keywords: Capsule Network; Deep Learning; Lysine crotonylation (Kcr); Motifs.
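
As a rough illustration of two ingredients that a site predictor of this kind typically relies on, the sketch below extracts fixed-length sequence windows centered on each lysine (K) and applies the standard capsule "squash" nonlinearity to a vector. It is a minimal sketch under stated assumptions, not the DeepCap-Kcr implementation: the function names lysine_windows and squash, the flank size and the padding character are all hypothetical choices, since the abstract does not specify them.

```python
import math


def lysine_windows(sequence, flank=3, pad="-"):
    """Return (1-based position, window) for every lysine in `sequence`.

    `flank` and `pad` are illustrative assumptions; the abstract does not
    state the window size or padding scheme used by DeepCap-Kcr.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in `sequence` sits at i + flank in `padded`, so the
            # slice below is centered on the lysine.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def squash(vector):
    """Capsule squash: keep the direction, map the length into [0, 1)."""
    norm_sq = sum(x * x for x in vector)
    if norm_sq == 0.0:
        return [0.0 for _ in vector]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in vector]


if __name__ == "__main__":
    for position, window in lysine_windows("MKTAYK", flank=2):
        print(position, window)
    print(squash([3.0, 4.0]))
```

In a capsule network the length of a squashed vector is commonly read as the probability that the entity it represents is present, which is one reason capsule outputs lend themselves to inspection as motif detectors.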

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Histones / metabolism
  • Humans
  • Lysine* / metabolism
  • Mammals / metabolism
  • Neural Networks, Computer
  • Protein Processing, Post-Translational*
  • Proteomics*
  • Software

Substances

  • Histones
  • Lysine