Stack-VTP: prediction of vesicle transport proteins based on stacked ensemble classifier and evolutionary information

BMC Bioinformatics. 2023 Apr 7;24(1):137. doi: 10.1186/s12859-023-05257-5.

Abstract

Vesicle transport proteins play an important role in the transmembrane transport of molecules and are also relevant to biomedicine, so identifying them is particularly important. We propose a method based on ensemble learning and evolutionary information to identify vesicle transport proteins. First, we preprocess the imbalanced dataset by random undersampling. Second, we extract the position-specific scoring matrix (PSSM) from each protein sequence, derive AADP-PSSM and RPSSM features from the PSSM, and use the Max-Relevance-Max-Distance (MRMD) algorithm to select the optimal feature subset. Finally, the optimal feature subset is fed into the stacked classifier to identify vesicle transport proteins. The experimental results show that the accuracy (ACC), sensitivity (SN) and specificity (SP) of our method on the independent testing set are 82.53%, 0.774 and 0.836, respectively. The SN, SP and ACC of our proposed method are 0.013, 0.007 and 0.76% higher, respectively, than those of the current state-of-the-art methods.
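
The undersampling step and the three reported metrics can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation. The function names `random_undersample` and `evaluate` are hypothetical, the 1:1 class ratio after undersampling is an assumption (the abstract does not state the ratio), and ACC, SN and SP follow their usual confusion-matrix definitions, with label 1 assumed to mark a vesicle transport protein.

```python
import random


def random_undersample(samples, labels, seed=0):
    """Randomly drop majority-class samples until both classes have equal size.

    Assumption: binary labels (1 = positive, 0 = negative) and a 1:1 target ratio.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority sample plus an equally sized random draw from the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]


def evaluate(y_true, y_pred):
    """Return ACC, SN (sensitivity) and SP (specificity) from binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / len(pairs) if pairs else 0.0,
        "SN": tp / (tp + fn) if (tp + fn) else 0.0,
        "SP": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Toy data only: 2 positives and 8 negatives, not values from the paper.
    samples = list(range(10))
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    balanced_samples, balanced_labels = random_undersample(samples, labels)
    print(len(balanced_samples), sum(balanced_labels))
    print(evaluate([1, 1, 0, 0], [1, 0, 0, 0]))
```

On imbalanced data, SN and SP are reported alongside ACC because accuracy alone can look high when a classifier simply favours the majority class.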

Keywords: Ensemble learning; Protein prediction; Stacked model; Vesicle transport proteins.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Carrier Proteins
  • Position-Specific Scoring Matrices*
  • Support Vector Machine
  • Vesicular Transport Proteins* / genetics
  • Vesicular Transport Proteins* / isolation & purification

Substances

  • Carrier Proteins
  • Vesicular Transport Proteins