Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy

PLoS One. 2016 Sep 23;11(9):e0163274. doi: 10.1371/journal.pone.0163274. eCollection 2016.

Abstract

Antioxidant proteins play significant roles in maintaining the oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could help reveal the physiological processes underlying oxidation/antioxidation balance and support the development of novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction of the ensemble predictor is the average of the prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48, with an accuracy of 0.925. Relief combined with IFS (Incremental Feature Selection) is adopted to select optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance, with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than those of previous studies. In addition, our method achieves more balanced performance, with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method is a promising candidate for antioxidant protein prediction. For public access, we developed a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
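
As a rough illustration of the averaging step and the reported metrics, the following minimal Python sketch averages per-sample scores from four base classifiers and computes sensitivity, specificity, accuracy, and MCC from the resulting labels. It is not the authors' implementation: the score values, the 0.5 decision threshold, the assumption that each base classifier outputs a positive-class probability, and all function names are hypothetical and chosen only for illustration.

```python
from math import sqrt

def average_ensemble(score_lists):
    """Average per-sample positive-class scores across base classifiers."""
    n_models = len(score_lists)
    return [sum(column) / n_models for column in zip(*score_lists)]

def to_labels(avg_scores, threshold=0.5):
    """Assumed decision rule: label 1 (antioxidant) if the mean score >= threshold."""
    return [1 if s >= threshold else 0 for s in avg_scores]

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and MCC from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical scores for six proteins from four base classifiers
# (stand-ins for RF, SMO, NNA, and J48; the values are made up).
rf_scores  = [0.9, 0.8, 0.4, 0.2, 0.1, 0.3]
smo_scores = [1.0, 1.0, 0.0, 0.0, 0.0, 1.0]
nna_scores = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
j48_scores = [0.7, 0.6, 0.8, 0.3, 0.2, 0.4]

y_true = [1, 1, 0, 0, 0, 1]  # hypothetical ground-truth labels
avg_scores = average_ensemble([rf_scores, smo_scores, nna_scores, j48_scores])
labels = to_labels(avg_scores)
metrics = binary_metrics(y_true, labels)
print(labels, metrics)
```

Averaging scores, rather than taking a majority vote over hard labels, lets a confident base classifier outweigh several uncertain ones; whether the published method averages probabilities or labels is not stated in the abstract.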

Grants and funding

This work was funded by the National Natural Science Foundation of China (http://www.nsfc.gov.cn; Nos. 61174044, 61473335, and 61174218), the Natural Science Foundation of Shandong Province of China (http://www.sdnsf.gov.cn/portal/; No. ZR2015PG004), and the Doctoral Foundation of University of Jinan (http://www.ujn.edu.cn/; No. XBS1334). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.