MoRFPred-plus: Computational Identification of MoRFs in Protein Sequences using Physicochemical Properties and HMM profiles

J Theor Biol. 2018 Jan 21:437:9-16. doi: 10.1016/j.jtbi.2017.10.015. Epub 2017 Oct 16.

Abstract

Motivation: Intrinsically Disordered Proteins (IDPs) lack a stable tertiary structure, yet they actively participate in various biological functions. IDPs expose short binding regions called Molecular Recognition Features (MoRFs) that permit interaction with structured protein regions. Upon interaction, these regions undergo a disorder-to-order transition, from which their functionality arises. Predicting MoRFs in disordered protein sequences is a challenging task.

Method: In this study, we present MoRFpred-plus, an improved version of our previously proposed predictor for identifying MoRFs in disordered protein sequences. Two independent propensity scores are computed by incorporating physicochemical properties and HMM profiles, and these scores are combined to predict the final MoRF propensity score for a given residue. The first score reflects the propensity of a query residue to be part of a MoRF region based on the composition and similarity of the assumed MoRF and flank regions. The second score reflects the propensity of a query residue to be part of a MoRF region based on the properties of the flanks surrounding that residue in the query protein sequence. The propensity scores are processed and common averaging is applied to generate the final prediction score of MoRFpred-plus.
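
As an illustration of the final combination step only, the sketch below averages two per-residue propensity tracks. It is not the MoRFpred-plus implementation: the abstract does not say how the scores are "processed", so the min-max rescaling used here is an assumption, and the function names `min_max_scale` and `combine_propensities` are hypothetical.

```python
from typing import List, Sequence


def min_max_scale(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] (assumed stand-in for the 'processing' step).

    A constant track carries no ranking information, so it maps to 0.5.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def combine_propensities(
    score_a: Sequence[float], score_b: Sequence[float]
) -> List[float]:
    """Average two per-residue propensity tracks of equal length."""
    if len(score_a) != len(score_b):
        raise ValueError("both tracks must have one score per residue")
    a = min_max_scale(score_a)
    b = min_max_scale(score_b)
    # Plain arithmetic mean of the two rescaled scores at each residue.
    return [(x + y) / 2.0 for x, y in zip(a, b)]
```

Rescaling before averaging keeps one track from dominating simply because it has a wider numeric range; any other normalization could be substituted without changing the averaging step.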

Results: The performance of the proposed predictor is compared with that of the available MoRF predictors MoRFchibi, MoRFpred, and ANCHOR. On the previously collected training and test sets used to evaluate these predictors, the proposed predictor outperforms them and generates a lower false positive rate. In addition, MoRFpred-plus is a downloadable predictor, so its output can be used as input to other computational tools.
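
For reference, the false positive rate is the fraction of non-MoRF residues that a predictor labels as MoRF, FP / (FP + TN). The sketch below computes it for per-residue scores at a chosen cutoff. The function name, the 0/1 label encoding, and the "score at or above threshold means MoRF" rule are assumptions made for illustration; the abstract does not specify the evaluation protocol.

```python
from typing import Sequence


def false_positive_rate(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """Return FP / (FP + TN) over residues whose true label is 0 (non-MoRF).

    A residue is predicted as MoRF when its score is >= threshold (assumed rule).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("no negative (non-MoRF) residues to evaluate")
    false_positives = sum(1 for s in negatives if s >= threshold)
    return false_positives / len(negatives)
```

Comparing predictors at a matched true positive rate, rather than at an arbitrary shared threshold, is one way to make such false positive rates comparable across tools with different score scales.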

Availability: https://github.com/roneshsharma/MoRFpred-plus/wiki/MoRFpred-plus:-Download.

Keywords: Hidden Markov model; Intrinsically disordered proteins; Molecular recognition feature; Support vector machine.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Binding Sites / genetics
  • Computational Biology / methods*
  • Intrinsically Disordered Proteins / chemistry*
  • Intrinsically Disordered Proteins / genetics
  • Intrinsically Disordered Proteins / metabolism
  • Models, Theoretical
  • Protein Binding
  • Protein Domains*
  • Support Vector Machine*

Substances

  • Intrinsically Disordered Proteins