HEMEsPred: Structure-Based Ligand-Specific Heme Binding Residues Prediction by Using Fast-Adaptive Ensemble Learning Scheme

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):147-156. doi: 10.1109/TCBB.2016.2615010. Epub 2016 Oct 4.

Abstract

Heme is an essential biomolecule found in a wide range of extant organisms. Accurately identifying heme binding residues (HEMEs) is of great importance for understanding disease progression and for drug development. In this study, a novel predictor named HEMEsPred was proposed for predicting HEMEs. First, several sequence- and structure-based features, including amino acid composition, motifs, surface preferences, and secondary structure, were collected to construct feature matrices. Second, a novel fast-adaptive ensemble learning scheme was designed to overcome the serious class-imbalance problem and to enhance prediction performance. Third, ligand-specific models were developed, considering that different heme ligands vary significantly in their roles, sizes, and distributions. Statistical tests confirmed the effectiveness of the ligand-specific models. Experimental results on benchmark datasets demonstrated the good robustness of the proposed method. Furthermore, the method showed good generalization capability and outperformed many state-of-the-art predictors on two independent testing datasets. The HEMEsPred web server was available at http://www.inforstation.com/HEMEsPred/ for free academic use.
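The abstract does not spell out how the fast-adaptive ensemble scheme works, so the following is only a generic illustration of the underlying idea: binding residues are rare compared with non-binding residues, and one common remedy is to train several base learners, each on all positives plus a random, equally sized subset of negatives, and then average their votes. Everything in this sketch is an assumption made for illustration: the function names, the nearest-centroid base learner, the two-dimensional toy features, and the ligand labels `ligand_A` and `ligand_B`. It is not the HEMEsPred implementation.

```python
import random
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_balanced_ensemble(X, y, n_models=10, seed=0):
    """Train n_models nearest-centroid learners on balanced subsets.

    Each learner sees every positive (binding) residue and a random
    sample of negatives of the same size, which counters class imbalance.
    """
    rng = random.Random(seed)
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    models = []
    for _ in range(n_models):
        neg_sample = rng.sample(neg, min(len(pos), len(neg)))
        models.append((centroid(pos), centroid(neg_sample)))
    return models


def predict_score(models, x):
    """Fraction of base learners that vote 'binding' for feature vector x."""
    votes = sum(1 for pc, nc in models if sq_dist(x, pc) < sq_dist(x, nc))
    return votes / len(models)


def train_ligand_specific(data_by_ligand, **kwargs):
    """One independent ensemble per ligand type (labels are hypothetical)."""
    return {
        ligand: train_balanced_ensemble(X, y, **kwargs)
        for ligand, (X, y) in data_by_ligand.items()
    }


def make_toy_data(n_pos=20, n_neg=400, shift=2.0, seed=1):
    """Synthetic imbalanced data: positives near (shift, shift), negatives near (0, 0)."""
    rng = random.Random(seed)
    X = [[rng.gauss(shift, 0.5), rng.gauss(shift, 0.5)] for _ in range(n_pos)]
    X += [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(n_neg)]
    y = [1] * n_pos + [0] * n_neg
    return X, y


if __name__ == "__main__":
    data = {
        "ligand_A": make_toy_data(seed=1),
        "ligand_B": make_toy_data(shift=-2.0, seed=2),
    }
    ensembles = train_ligand_specific(data, n_models=10)
    print(predict_score(ensembles["ligand_A"], [2.0, 2.0]))
    print(predict_score(ensembles["ligand_B"], [2.0, 2.0]))
```

Training a separate ensemble per ligand type mirrors the abstract's observation that different heme ligands differ in role, size, and distribution, so a residue scored as binding by one ligand's model may be scored as non-binding by another's.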

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Binding Sites*
  • Computational Biology / methods*
  • Databases, Protein
  • Heme / chemistry*
  • Heme / genetics
  • Heme / metabolism*
  • Machine Learning*
  • Models, Molecular
  • Protein Binding / genetics*
  • Protein Structure, Secondary
  • Software

Substances

  • Heme