Phage_UniR_LGBM: Phage Virion Proteins Classification with UniRep Features and LightGBM Model

Comput Math Methods Med. 2022 Apr 15;2022:9470683. doi: 10.1155/2022/9470683. eCollection 2022.

Abstract

Phages, the most prevalent biological entities on the planet, serve a variety of critical roles. Their primary role is to facilitate gene-to-gene communication. Phage proteins can be divided into virion proteins and nonvirion proteins. Experimental identification of virion proteins is difficult and requires a significant amount of laboratory time and expense. It is therefore critical to design practical computational techniques and develop well-performing tools. In this work, Phage_UniR_LGBM is proposed to classify virion proteins. Specifically, the model uses UniRep as the feature representation and the LightGBM algorithm as the classifier. The model is trained on the training data and tested on the testing data with cross-validation. Phage_UniR_LGBM was compared with several state-of-the-art features and classification algorithms. It achieves a specificity (Sp) of 88.51%, a sensitivity (Sn) of 89.89%, an accuracy (Acc) of 89.18%, a Matthews correlation coefficient (MCC) of 0.7873, and an F1 score of 0.8925.
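
The five reported measures are standard binary-classification metrics, and a short sketch may help readers see how they relate to one another. The Python below computes them from confusion-matrix counts using the textbook definitions. It is an illustration, not the authors' evaluation code: the function name binary_metrics and the counts in the example are hypothetical, and treating virion proteins as the positive class is an assumption not stated in the abstract.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute Sn, Sp, Acc, MCC and F1 from binary confusion-matrix counts.

    Assumption: the positive class is "virion protein" and the negative
    class is "nonvirion protein". Assumes each class has at least one
    sample and at least one positive prediction is made.
    """
    sn = tp / (tp + fn)                    # sensitivity (recall)
    sp = tn / (tn + fp)                    # specificity
    acc = (tp + tn) / (tp + fn + tn + fp)  # accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * sn / (precision + sn)
    # MCC is conventionally set to 0 when its denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc, "F1": f1}


# Hypothetical counts, chosen only to exercise the formulas;
# they are not taken from the paper.
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside Sn and Sp when the two classes may be unevenly sized.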

MeSH terms

  • Algorithms
  • Bacteriophages* / metabolism
  • Computational Biology / methods
  • Humans
  • Proteins / metabolism
  • Virion / metabolism

Substances

  • Proteins