Discrimination of acidic and alkaline enzyme using Chou's pseudo amino acid composition in conjunction with probabilistic neural network model

J Theor Biol. 2015 Jan 21:365:197-203. doi: 10.1016/j.jtbi.2014.10.014. Epub 2014 Oct 22.

Abstract

Enzyme catalysis is one of the most essential and striking of all the complex processes that have evolved in living organisms. Enzymes are biological catalysts that play a significant role in industrial applications as well as in medicine, owing to their profound specificity, selectivity and catalytic efficiency. Refining the catalytic efficiency of enzymes has become the most challenging task in enzyme engineering, and it involves classifying enzymes as acidic or alkaline. Discriminating acidic from alkaline enzymes through experimental approaches is difficult, and sometimes impossible, because established structures are lacking. It is therefore highly desirable to develop a computational model that discriminates acidic and alkaline enzymes from their primary sequences. In this study, we have developed a robust, accurate and high-throughput computational model using two discrete sample representation methods: pseudo amino acid composition (PseAAC) and split amino acid composition (SAAC). Various classification algorithms, including probabilistic neural network (PNN), K-nearest neighbor, decision tree, multi-layer perceptron and support vector machine, are applied to predict acidic and alkaline enzymes. A 10-fold cross-validation test and several statistical measures, namely accuracy, F-measure and area under the ROC curve, are used to evaluate the performance of the proposed model. The model is examined on two benchmark datasets to demonstrate its effectiveness. The empirical results show that the performance of PNN in conjunction with PseAAC is quite promising compared with existing approaches in the literature so far: it achieved 96.3% accuracy on dataset1 and 99.2% on dataset2. The proposed model might be useful for basic research and drug-related application areas.
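To make the feature-plus-classifier idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes plain 20-dimensional amino acid composition in place of full PseAAC (which additionally appends sequence-order correlation factors), and a textbook PNN that averages a Gaussian kernel per class. The sequences in `TRAIN`, the smoothing value `sigma=0.1`, and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional residue-frequency vector of a sequence."""
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pnn_predict(train_vectors, train_labels, query, sigma=0.1):
    """Classify `query` with a PNN: average a Gaussian kernel over each
    class's training vectors and return the class with the largest score."""
    kernel_sums = {}
    class_sizes = {}
    for vec, label in zip(train_vectors, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        kernel = math.exp(-sq_dist / (2 * sigma ** 2))
        kernel_sums[label] = kernel_sums.get(label, 0.0) + kernel
        class_sizes[label] = class_sizes.get(label, 0) + 1
    scores = {c: kernel_sums[c] / class_sizes[c] for c in kernel_sums}
    return max(scores, key=scores.get)


# Hypothetical toy sequences, NOT taken from the paper's benchmark datasets.
TRAIN = [
    ("DDEEDDEEAG", "acidic"),
    ("EEDDEDGADE", "acidic"),
    ("KKRRKKRHAG", "alkaline"),
    ("RRKKHKGAKR", "alkaline"),
]

train_vectors = [amino_acid_composition(seq) for seq, _ in TRAIN]
train_labels = [label for _, label in TRAIN]

query_vector = amino_acid_composition("DEDEDEGAAD")
prediction = pnn_predict(train_vectors, train_labels, query_vector)
print(prediction)
```

In a real evaluation the kernel width would be tuned and the classifier scored with 10-fold cross-validation, as the abstract describes; this sketch only shows the prediction step on a single query.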

Keywords: DT; K-Nearest neighbor; PNN; SAAC; SVM.

MeSH terms

  • Algorithms*
  • Databases, Protein
  • Enzymes / chemistry*
  • Enzymes / genetics
  • Models, Chemical*
  • Neural Networks, Computer*
  • Protein Folding*
  • Sequence Analysis, Protein / methods*

Substances

  • Enzymes