Using the SMOTE technique and hybrid features to predict the types of ion channel-targeted conotoxins

J Theor Biol. 2016 Aug 21:403:75-84. doi: 10.1016/j.jtbi.2016.04.034. Epub 2016 Apr 30.

Abstract

Conotoxins that target different ion channels have distinct physiological functions and therapeutic potential. Accurately identifying the type of ion channel a conotoxin targets provides significant clues to its physiological mechanism and pharmacological potential. In this study, we propose ICTCPred, a random forest-based predictor of ion channel-targeted conotoxin types that uses hybrid features incorporating CTD (Composition, Transition, and Distribution), g-Gap DC (g-Gap Dipeptide Composition), PP (Physicochemical Properties), and SSI (Secondary Structure Information). To deal with the imbalanced benchmark dataset, SMOTE (Synthetic Minority Over-sampling Technique) is applied. On the individual feature spaces, the average accuracy of ICTCPred ranges from 0.729 to 0.886, indicating that each feature type has discriminative power. ICTCPred yields its highest average accuracy, 0.895, with the hybrid feature space of CTD, g-Gap DC, PP, and SSI. The Relief-IFS (Incremental Feature Selection) method is adopted to further improve the prediction performance of ICTCPred. On the training dataset, ICTCPred achieves an average accuracy of 0.910. To evaluate prediction performance objectively, ICTCPred is compared with previous studies on the same independent testing dataset. Encouragingly, the proposed method outperforms previous studies in identifying the types of ion channel-targeted conotoxins, with the highest sensitivity of 0.919 for Na(+)-targeted conotoxins, the highest sensitivity of 1 for K(+)-targeted conotoxins, and the highest sensitivity of 1 for Ca(2+)-targeted conotoxins. ICTCPred is therefore anticipated to be a promising candidate tool for ion channel-targeted conotoxin prediction.
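Two of the steps named in the abstract, g-Gap dipeptide composition and SMOTE oversampling, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `g_gap_dipeptide_composition` and `smote_oversample`, the default of k = 5 neighbours, the uniform random choice of base samples, and the squared Euclidean distance are all assumptions made for illustration. The abstract does not state the values of g, k, or the oversampling ratio used in ICTCPred.

```python
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def g_gap_dipeptide_composition(sequence, g=0):
    """Frequencies of residue pairs separated by g intervening positions.

    g=0 gives the ordinary (adjacent) dipeptide composition. Pairs that
    contain a non-standard residue are skipped (an assumption of this sketch).
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - g - 1):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in DIPEPTIDES]


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical toy peptides, not taken from the benchmark dataset.
    toy_minority = ["GCCSDPRCAWRC", "CCNCSSKWCRDH", "RDCCTPPKKCKD"]
    features = [g_gap_dipeptide_composition(s, g=1) for s in toy_minority]
    extra = smote_oversample(features, n_new=4, k=2)
    print(len(features[0]), len(extra))
```

In a pipeline like the one described, oversampling would normally be applied to the training data only, so that synthetic samples do not leak into the independent test set; a maintained implementation such as the one in the imbalanced-learn package would be the usual choice over a hand-written version.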

Keywords: Hybrid Features; Ion Channel-targeted Conotoxins; Relief; SMOTE.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Amino Acids / chemistry
  • Cluster Analysis
  • Computational Biology / methods*
  • Conotoxins / chemistry
  • Conotoxins / pharmacology*
  • Databases, Protein
  • Ion Channels / metabolism*
  • Peptides / chemistry

Substances

  • Amino Acids
  • Conotoxins
  • Ion Channels
  • Peptides