GlyStruct: glycation prediction using structural properties of amino acid residues

BMC Bioinformatics. 2019 Feb 4;19(Suppl 13):547. doi: 10.1186/s12859-018-2547-x.

Abstract

Background: Glycation is one of the post-translational modifications (PTMs) in which sugar molecules are covalently bonded to residues in protein sequences. It has become one of the clinically important PTMs in recent times, as it has been implicated in many chronic and age-related complications. Because it is a non-enzymatic reaction, predicting it is a great challenge owing to the lack of significant bias in the sequence motifs.

Results: We developed a classifier, GlyStruct, based on a support vector machine, to predict glycated and non-glycated lysine residues using structural properties of amino acid residues. The features used were secondary structure, accessible surface area and the local backbone torsion angles. For this work, a benchmark dataset was extracted containing 235 glycated and 303 non-glycated lysine residues. GlyStruct demonstrated an improvement in performance of approximately 10% over the benchmark method, Gly-PseAAC. The sensitivity, specificity, accuracy and Matthews correlation coefficient of GlyStruct were 0.7013, 0.7989, 0.7562, and 0.5065, respectively, for 10-fold cross-validation.
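
For readers unfamiliar with the four reported metrics, the following minimal Python sketch shows how each is conventionally computed from a binary confusion matrix, with glycated lysines as the positive class. The function name binary_metrics and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the confusion matrix from the paper, and this is not the authors' evaluation code.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp / fn: positive (e.g. glycated) residues predicted correctly / missed.
    tn / fp: negative (e.g. non-glycated) residues predicted correctly / wrongly.
    """
    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)            # true positive rate
    specificity = safe_div(tn, tn + fp)            # true negative rate
    accuracy = safe_div(tp + tn, tp + fn + tn + fp)

    # Matthews correlation coefficient: ranges from -1 to +1, 0 is chance level.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, denom)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the paper).
    for name, value in binary_metrics(tp=80, fn=20, tn=90, fp=10).items():
        print(f"{name}: {value:.4f}")
```

In a k-fold cross-validation setting, such metrics are typically computed per fold and then averaged, which is one reason a reported average need not correspond exactly to a single pooled confusion matrix.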

Conclusion: Glycation has emerged as one of the clinically important PTMs of proteins in recent times. Therefore, the development of computational tools to predict glycation has become necessary, as such tools could help medical professionals administer drugs and manage patients more effectively. The proposed predictor classifies glycated and non-glycated lysine residues with consistently promising results across various cross-validation schemes and outperforms other state-of-the-art methods.

Keywords: Amino acids; Lysine glycation; Post-translational modification; Prediction; Protein sequences; Support vector machine.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Area Under Curve
  • Benchmarking
  • Computational Biology / methods*
  • Glycosylation
  • Humans
  • Peptides / chemistry
  • Support Vector Machine

Substances

  • Amino Acids
  • Peptides