IRC-Fuse: improved and robust prediction of redox-sensitive cysteine by fusing of multiple feature representations

J Comput Aided Mol Des. 2021 Mar;35(3):315-323. doi: 10.1007/s10822-020-00368-0. Epub 2021 Jan 4.

Abstract

Redox-sensitive cysteine (RSC) thiols contribute to many biological processes. Identifying RSCs is important for clarifying the mechanisms of redox-sensitive factors; however, experimental investigation of RSCs is expensive and time-consuming. Computational approaches that quickly and accurately identify candidate RSCs from sequence information are therefore urgently needed. Here, an improved and robust computational predictor named IRC-Fuse was developed to identify RSCs by fusing multiple feature representations. To enhance the performance of the model, we integrated the probability scores produced by random forest models trained on different encoding schemes. In cross-validation, IRC-Fuse achieved an accuracy of 0.741 and an AUC of 0.807. On independent test data, IRC-Fuse outperformed existing methods, with improvements of 10% in accuracy and 13% in MCC. Comparative analysis suggested that IRC-Fuse is more effective and promising than the existing predictors. For the convenience of experimental scientists, the IRC-Fuse web server was implemented and is publicly accessible at http://kurata14.bio.kyutech.ac.jp/IRC-Fuse/ .
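
The score-integration idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the per-encoding probability scores are combined, so the weighted average used here is an assumption for illustration only and not necessarily the fusion rule used by IRC-Fuse. The function names (`fuse_scores`, `classify`), the encoding labels, the 0.5 threshold, and the example scores are all hypothetical.

```python
from typing import Dict, List, Optional


def fuse_scores(
    model_scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Combine per-encoding probability scores into one score per site.

    model_scores maps an encoding name to the probabilities that one
    classifier (e.g. a random forest trained on that encoding) assigned
    to each candidate cysteine. The weighted average is an ASSUMED
    fusion rule chosen for illustration.
    """
    names = list(model_scores)
    if not names:
        raise ValueError("at least one model is required")
    n_sites = len(model_scores[names[0]])
    if any(len(model_scores[name]) != n_sites for name in names):
        raise ValueError("all models must score the same number of sites")
    if weights is None:
        # Equal weighting when no weights are supplied.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(weights[name] * model_scores[name][i] for name in names) / total
        for i in range(n_sites)
    ]


def classify(fused: List[float], threshold: float = 0.5) -> List[int]:
    """Label a site 1 (redox-sensitive) when its fused score reaches the threshold."""
    return [1 if score >= threshold else 0 for score in fused]


# Hypothetical scores for three candidate cysteines from two encodings.
example_scores = {
    "pseaac": [0.9, 0.3, 0.6],
    "sequence_profile": [0.7, 0.1, 0.2],
}
fused = fuse_scores(example_scores)
labels = classify(fused)
```

With equal weights, each fused score is simply the mean of the per-encoding scores; passing a `weights` dictionary lets a more reliable encoding contribute more to the final decision.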

Keywords: Feature selection; Machine learning; PseAAC; Redox-sensitive cysteine; Sequence profile information.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Benchmarking / methods*
  • Computational Biology
  • Cysteine / chemistry*
  • Databases, Factual
  • Machine Learning
  • Models, Molecular
  • Oxidation-Reduction
  • Proteins / chemistry*
  • Sulfhydryl Compounds / chemistry

Substances

  • Proteins
  • Sulfhydryl Compounds
  • Cysteine