iProtease-PseAAC(2L): A two-layer predictor for identifying proteases and their types using Chou's 5-step-rule and general PseAAC

Yaser Daanial Khan; Najm Amin; Waqar Hussain; Nouman Rasool; Sher Afzal Khan; Kuo-Chen Chou

doi:10.1016/j.ab.2019.113477

Anal Biochem. 2020 Jan 1:588:113477. doi: 10.1016/j.ab.2019.113477. Epub 2019 Oct 22.

Authors

Yaser Daanial Khan¹, Najm Amin², Waqar Hussain³, Nouman Rasool⁴, Sher Afzal Khan⁵, Kuo-Chen Chou⁶

Affiliations

¹ Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore, 54770, Pakistan. Electronic address: yaser.khan@umt.edu.pk.
² Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore, 54770, Pakistan.
³ National Center of Artificial Intelligence, Punjab University College of Information Technology, University of the Punjab, Lahore, Pakistan.
⁴ Dr Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, 75270, Pakistan.
⁵ Faculty of Computing and Information Technology in Rabigh, Jeddah, 21577, Saudi Arabia; Abdul Wali Khan University, Department of Computer Sciences, Mardan, Pakistan.
⁶ Gordon Life Science Institute, Boston, MA, 02478, USA.

PMID: 31654612
DOI: 10.1016/j.ab.2019.113477

Abstract

Proteases are enzymes that perform proteolysis, the degradation of proteins and peptides, which is crucial for the survival, growth, and wellbeing of a cell. Moreover, proteases are strongly associated with therapeutics and drug development. Proteases are classified into five types according to their nature and physicochemical characteristics. Most methods used to distinguish proteases from other proteins and to identify their class require a clinical test, which is usually time-consuming and operator-dependent. Herein, we report a classifier named iProtease-PseAAC(2L) for identifying proteases and their classes. The predictor is developed following the 5-step rule, starting with the collection of a benchmark dataset and ending with the development of the predictor. Rigorous verification and validation tests are performed, and metrics are collected to assess the reliability of the trained model. Self-consistency validation gives 98.32% accuracy, cross-validation gives 90.71%, and the jackknife test gives 96.07%. The average accuracy for level 2, i.e. protease classification, is 95.77%. Based on these results, it is concluded that iProtease-PseAAC(2L) has a strong ability to identify proteases and their classes from a given protein sequence.
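To illustrate how the self-consistency and jackknife tests named in the abstract differ, the following is a minimal sketch and not the authors' implementation. It assumes plain amino acid composition as the feature vector (the paper's PseAAC and statistical-moment features are not described in the abstract) and a nearest-centroid rule as a stand-in classifier. The toy sequences, labels, and all function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence.

    Assumption: plain composition stands in for the paper's PseAAC features.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_centroid(train, query):
    """Predict the label whose mean feature vector is closest to the query.

    Assumption: a stand-in classifier, chosen only to keep the sketch short.
    """
    sums, sizes = {}, Counter()
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        sizes[label] += 1

    def squared_distance(label):
        return sum((s / sizes[label] - q) ** 2
                   for s, q in zip(sums[label], query))

    return min(sums, key=squared_distance)


def self_consistency_accuracy(samples):
    """Train and test on the same samples (an optimistic estimate)."""
    hits = sum(nearest_centroid(samples, f) == y for f, y in samples)
    return hits / len(samples)


def jackknife_accuracy(samples):
    """Leave-one-out: hold out each sample once and train on the rest."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += nearest_centroid(rest, features) == label
    return hits / len(samples)


# Hypothetical toy data, invented for illustration only.
toy = [
    ("AAAAGAAA", "protease"), ("AAGAAAAA", "protease"), ("AAAAAAGA", "protease"),
    ("LLLLKLLL", "other"), ("LLKLLLLL", "other"), ("LKLLLLLL", "other"),
]
samples = [(composition(seq), label) for seq, label in toy]

print("self-consistency:", self_consistency_accuracy(samples))
print("jackknife:", jackknife_accuracy(samples))
```

Because the self-consistency test scores the model on the data it was trained on, it would generally be expected to give the highest figure of the three, which is in line with the ordering of the accuracies reported above. A two-layer predictor could reuse the same evaluation at each level: first protease versus non-protease, then protease class.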

Keywords: 5-step rule; Prediction; Protease; PseAAC; Statistical moments.

MeSH terms

Algorithms*
Computational Biology / methods*
Databases, Protein
Peptide Hydrolases / classification*
Proteins / classification*
Software*

Substances

Proteins
Peptide Hydrolases