Machine learning-based antioxidant protein identification model: Progress and evaluation

J Cell Biochem. 2023 Nov;124(11):1825-1834. doi: 10.1002/jcb.30491. Epub 2023 Oct 25.

Abstract

Efficient and accurate identification of antioxidant proteins is of great significance. Many models for identifying antioxidant proteins have been proposed in recent years, but low sensitivity and high feature dimensionality remain common problems, and the generalization ability of these models still needs to be improved. Researchers have tried different feature extraction and feature selection algorithms to obtain the most effective feature combination, and have chosen more appropriate classification algorithms and tools to improve model performance. In this article, we systematically review the most frequently used antioxidant protein data sets and the methods chosen at each step of model construction, and we discuss the characteristics of each method. Based on a detailed analysis of recent research, we believe that reducing model dimensionality can improve the practicality and efficiency of these models in application. Because the key to improving antioxidant protein identification models in the future may lie in feature selection, this paper also focuses on how the feature extraction and feature selection steps are combined during model building.
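As a rough illustration of how a feature extraction step and a feature selection step can be chained to reduce dimensionality, the sketch below computes amino acid composition (a common sequence descriptor) and then ranks features with a simple Fisher-style filter score. The descriptor, the scoring formula, the function names (`aac`, `fisher_scores`, `select_top_k`), and the toy sequences are all assumptions chosen for illustration; they are not taken from the models reviewed in the article.

```python
from statistics import mean, pvariance

# The 20 standard amino acids, in a fixed order that defines the feature index.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Feature extraction: fraction of each standard residue in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(residue) / len(seq) for residue in AMINO_ACIDS]


def fisher_scores(pos, neg):
    """Score each feature by squared class-mean difference over summed variance.

    This is one simple filter criterion (an assumption of this sketch),
    not the selection method of any specific published model.
    """
    scores = []
    for j in range(len(pos[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        between = (mean(p) - mean(n)) ** 2
        within = pvariance(p) + pvariance(n) + 1e-12  # avoid division by zero
        scores.append(between / within)
    return scores


def select_top_k(scores, k):
    """Feature selection: indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Hypothetical toy sequences, not real antioxidant proteins.
    positives = [aac("MCCHHKCW"), aac("CCHWMKHC")]
    negatives = [aac("ALLGGVAL"), aac("GVLAALGG")]
    kept = select_top_k(fisher_scores(positives, negatives), k=5)
    print([AMINO_ACIDS[j] for j in kept])
```

Keeping only the top-ranked columns shrinks the 20-dimensional vector before it reaches a classifier, which is the kind of dimension reduction the abstract refers to; in practice the value of `k` and the scoring criterion would need to be validated on held-out data.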

Keywords: antioxidant protein identification; feature extraction; feature selection; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Antioxidants*
  • Machine Learning
  • Proteins

Substances

  • Antioxidants
  • Proteins