Position-specific analysis and prediction for protein lysine acetylation based on multiple features

PLoS One. 2012;7(11):e49108. doi: 10.1371/journal.pone.0049108. Epub 2012 Nov 16.

Abstract

Protein lysine acetylation is a reversible post-translational modification that plays a vital role in many cellular processes, such as transcriptional regulation, apoptosis and cytokine signaling. To fully decipher the molecular mechanisms of acetylation-related biological processes, an initial but crucial step is the identification of acetylated substrates and their acetylation sites. In this study, we developed a position-specific method named PSKAcePred for lysine acetylation prediction based on support vector machines. The residues around the acetylation sites were selected or excluded based on their entropy values. We combined amino acid composition, evolutionary similarity and physicochemical properties as features to predict lysine acetylation sites. The prediction model achieved an accuracy of 79.84% and a Matthews correlation coefficient of 59.72% in 10-fold cross-validation on balanced positive and negative samples. A feature analysis showed that all features applied in this method contributed to the acetylation process. A position-specific analysis showed that the features derived from the critical neighboring residues contributed profoundly to acetylation site determination. The detailed analysis in this paper can improve understanding of the acetylation mechanism and can provide guidance for related experimental validation.
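Two quantities named in the abstract, the per-position entropy of the residues flanking a lysine and the accuracy / Matthews correlation coefficient pair, can be illustrated with a minimal sketch. This is not the PSKAcePred implementation: the abstract does not state the entropy threshold, the window length, or the direction of the selection rule, so the code below only computes Shannon entropies and leaves the selection step out. The function names and the toy peptide windows are hypothetical and are not data from the paper.

```python
import math
from collections import Counter


def position_entropies(windows):
    """Shannon entropy (in bits) of the residue distribution at each position.

    `windows` is a list of equal-length peptide strings, each assumed to be
    centered on the candidate lysine (an assumption of this sketch).
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    entropies = []
    for i in range(length):
        counts = Counter(w[i] for w in windows)
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


def accuracy_and_mcc(tp, tn, fp, fn):
    """Accuracy and Matthews correlation coefficient from a confusion matrix."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; return 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Hypothetical toy windows with the lysine (K) at the center position.
toy_windows = ["AGKRL", "AGKSL", "TGKRL", "AGKRV"]
entropies = position_entropies(toy_windows)

# Hypothetical balanced confusion matrix (50 positives, 50 negatives).
acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
```

The MCC returned here lies in [-1, 1]; the abstract reports it as a percentage, which corresponds to multiplying this value by 100.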

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acetylation
  • Amino Acid Sequence
  • Binding Sites
  • Chemical Phenomena
  • Computational Biology / methods*
  • Conserved Sequence
  • Evolution, Molecular
  • Lysine / metabolism*
  • Protein Processing, Post-Translational*
  • Proteins / chemistry*
  • Proteins / metabolism*
  • User-Computer Interface

Substances

  • Proteins
  • Lysine

Grants and funding

This work was supported by the National Natural Science Foundation of China (20605010 and 21175064) and the Program for New Century Excellent Talents in University (NCET-11-1002). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.