iMem-2LSAAC: A two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into Chou's pseudo amino acid composition

J Theor Biol. 2018 Apr 7:442:11-21. doi: 10.1016/j.jtbi.2018.01.008. Epub 2018 Jan 11.

Abstract

Membrane proteins play significant roles in the cellular processes of living organisms, ranging from cell signaling to cell adhesion. Because they are a major component of the cell, identifying membrane proteins and their functional types has been a challenging task in bioinformatics and proteomics over the last few decades. Traditional experimental procedures are of limited applicability owing to the lack of recognized structures and the considerable time and space they demand. Consequently, the demand for fast, accurate, and intelligent computational methods is increasing day by day. In this paper, a two-tier intelligent automated predictor called iMem-2LSAAC has been developed: the first tier (phase 1) classifies a protein sequence as membrane or non-membrane, and, for membrane proteins, the second tier (phase 2) identifies the functional type. Quantitative attributes were extracted from protein sequences by applying three discrete feature extraction schemes, namely amino acid composition, pseudo amino acid composition (PseAAC), and split amino acid composition (SAAC). Various learning algorithms were investigated using the jackknife test to select the best one for the predictor. Experimental results showed that the highest predictive performance was achieved by a support vector machine (SVM) in conjunction with the SAAC feature space on all examined datasets. The true classification rate of the iMem-2LSAAC predictor is significantly higher than that of other state-of-the-art methods reported in the literature so far. Finally, it is expected that the proposed predictor will provide a solid framework for pharmaceutical drug discovery and might be useful for researchers and academia.
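To illustrate the SAAC idea named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common formulation in which a sequence is split into an N-terminal, an internal, and a C-terminal segment, and the 20 amino acid frequencies are computed for each segment and concatenated into a 60-dimensional vector. The segment lengths (25 residues at each terminus) and the function names `aac` and `saac` are hypothetical choices for this example; the abstract does not specify them.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines feature positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(segment):
    """Amino acid composition: frequency of each standard residue in a segment."""
    counts = Counter(segment)
    # Only standard residues count toward the total; others (e.g. 'X') are ignored.
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def saac(sequence, n_term=25, c_term=25):
    """Split amino acid composition (assumed form): AAC of three segments.

    n_term and c_term are illustrative segment lengths, not values from the paper.
    """
    seq = sequence.upper()
    if len(seq) < n_term + c_term:
        raise ValueError("sequence is shorter than the two terminal segments")
    head = seq[:n_term]
    middle = seq[n_term:len(seq) - c_term]
    tail = seq[len(seq) - c_term:]
    # Concatenate the three 20-dimensional vectors into one 60-dimensional vector.
    return aac(head) + aac(middle) + aac(tail)


# Toy sequence: each segment consists of a single residue type.
features = saac("A" * 25 + "C" * 10 + "D" * 25)
```

A vector like `features` could then be passed to a classifier such as an SVM, once for the membrane versus non-membrane decision and, for predicted membrane proteins, again for the functional type, mirroring the two-tier design the abstract describes.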

Keywords: Membrane proteins; PseAAC; SAAC; SVM.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Databases, Protein
  • Membrane Proteins / genetics
  • Membrane Proteins / metabolism*
  • Neural Networks, Computer*
  • Reproducibility of Results
  • Sequence Analysis, Protein / methods
  • Support Vector Machine*

Substances

  • Membrane Proteins