Analysis and Prediction of Myristoylation Sites Using the mRMR Method, the IFS Method and an Extreme Learning Machine Algorithm

Comb Chem High Throughput Screen. 2017;20(2):96-106. doi: 10.2174/1386207319666161220114424.

Abstract

Background: Myristoylation is an important hydrophobic post-translational modification in which a myristoyl group is covalently bound to the amino group of Gly residues at the N-terminus of proteins. The diverse functions of protein myristoylation, such as membrane targeting, signal pathway regulation and apoptosis, are largely due to this lipid modification, whereas abnormal or irregular myristoylation can lead to several pathological changes in the cell.

Objective: To better understand the function of myristoylated sites and to identify them correctly, this study conducted a novel computational investigation of myristoylation site identification in protein sequences.

Materials and methods: A training dataset with 196 positive and 84 negative peptide segments was obtained. Four types of features derived from the peptide segments following the myristoylation sites were used to represent myristoylated and non-myristoylated sites. Then, two feature selection methods, maximum relevance and minimum redundancy (mRMR) and incremental feature selection (IFS), were combined with a machine learning algorithm (extreme learning machine) to extract the optimal features for identifying myristoylation sites in protein sequences, thereby building an optimal prediction model.
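
The selection pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names (`mrmr_rank`, `elm_fit`, `elm_predict`, `cv_accuracy`, `ifs`) are hypothetical, absolute Pearson correlation stands in for the mutual information normally used by mRMR, and cross-validated accuracy is an assumed scoring criterion that the abstract does not specify.

```python
import numpy as np

def mrmr_rank(X, y):
    """Greedy mRMR-style ranking: relevance to y minus mean redundancy with
    already-picked features. ASSUMPTION: |Pearson r| replaces mutual information."""
    n = X.shape[1]
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    relevance, redundancy = corr[:n, n], corr[:n, :n]
    ranked = [int(np.argmax(relevance))]
    remaining = set(range(n)) - set(ranked)
    while remaining:
        cand = sorted(remaining)
        gain = [relevance[j] - redundancy[j, ranked].mean() for j in cand]
        best = cand[int(np.argmax(gain))]
        ranked.append(best)
        remaining.remove(best)
    return ranked

def elm_fit(X, y, n_hidden=30, seed=0):
    """Extreme learning machine: random fixed hidden layer, output weights
    solved in closed form by least squares (pseudoinverse)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta >= 0.5).astype(int)

def cv_accuracy(X, y, n_folds=5, seed=0):
    """k-fold cross-validated accuracy (ASSUMED metric, for illustration only)."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[m] for m in range(n_folds) if m != k])
        model = elm_fit(X[train], y[train])
        correct += int((elm_predict(model, X[test]) == y[test]).sum())
    return correct / len(y)

def ifs(X, y, ranked):
    """Incremental feature selection: score the top-k ranked features for
    every k and keep the k with the best score."""
    scores = [cv_accuracy(X[:, ranked[:k]], y) for k in range(1, len(ranked) + 1)]
    return int(np.argmax(scores)) + 1, scores

# Synthetic stand-in data: 10 features, only the first two carry signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

ranked = mrmr_rank(X, y)
best_k, scores = ifs(X, y, ranked)
print("selected features:", ranked[:best_k])
```

The ranking is computed once, and IFS then only decides how many of the top-ranked features to keep, which is what keeps the search linear in the number of features rather than exponential.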

Results: In total, 41 key features were extracted and used to build an optimal prediction model. The effectiveness of this model was further validated by its performance on a test dataset. Furthermore, detailed analyses were performed on the 41 extracted features to gain insight into the mechanism of myristoylation modification.

Conclusion: This study provided a new computational method for identifying myristoylation sites in protein sequences. We believe that it can be a useful tool to predict myristoylation sites from protein sequences.

Keywords: Post-translational modification; extreme learning machine; incremental feature selection; minimum redundancy maximum relevance; modified glycine residue; myristoylation site prediction.

MeSH terms

  • Algorithms
  • Binding Sites
  • Computational Biology / methods*
  • Glycine / metabolism*
  • Machine Learning
  • Myristic Acid / metabolism*
  • Protein Processing, Post-Translational*

Substances

  • Myristic Acid
  • Glycine