Accurately Predicting Glutarylation Sites Using Sequential Bi-Peptide-Based Evolutionary Features

Genes (Basel). 2020 Aug 31;11(9):1023. doi: 10.3390/genes11091023.

Abstract

Post-translational modification (PTM) is defined as the alteration of a protein sequence upon interaction with other macromolecules after translation. Glutarylation is considered one of the most important PTMs and is associated with a wide range of cellular functions, including metabolism, translation, and specific subcellular localizations. Over the past few years, a wide range of computational approaches has been proposed to predict glutarylation sites. Despite these efforts, prediction performance for glutarylation sites has remained limited. One of the main challenges is extracting features that carry significant discriminatory information. To address this issue, we propose a new machine learning method called BiPepGlut, which uses bi-peptide-based evolutionary features. For classification we use the Extra-Trees (ET) classifier, which, to the best of our knowledge, has not previously been used for this task. Our results demonstrate that BiPepGlut significantly outperforms previously proposed models. BiPepGlut achieves 92.0% accuracy, 84.8% sensitivity, 95.6% specificity, a Matthews correlation coefficient of 0.82, and an F1-score of 0.88. BiPepGlut is implemented as a publicly available online predictor.
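
For readers unfamiliar with the five reported metrics, the following minimal sketch shows how each is conventionally derived from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive (e.g. glutarylated) sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not the paper's results).
example = binary_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because glutarylation datasets typically contain far fewer positive than negative sites, the Matthews correlation coefficient and F1-score are often more informative than accuracy alone, which is presumably why all five values are reported together.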

Keywords: bi-peptide evolutionary features; extra-trees classifier; lysine glutarylation; machine learning; post-translational modification.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Animals
  • Computational Biology
  • Evolution, Molecular*
  • Glutarates / chemistry*
  • Glutarates / metabolism
  • Lysine / chemistry*
  • Lysine / metabolism
  • Machine Learning
  • Mice
  • Mycobacterium tuberculosis / growth & development
  • Mycobacterium tuberculosis / metabolism*
  • Peptide Fragments / chemistry*
  • Peptide Fragments / metabolism
  • Protein Processing, Post-Translational*
  • Proteins / chemistry*
  • Proteins / metabolism
  • Support Vector Machine

Substances

  • Glutarates
  • Peptide Fragments
  • Proteins
  • glutaric acid
  • Lysine