TSE-ARF: An adaptive prediction method of effectors across secretion system types

Anal Biochem. 2024 Mar:686:115407. doi: 10.1016/j.ab.2023.115407. Epub 2023 Nov 28.

Abstract

Bacterial effector proteins are secreted by a variety of protein secretion systems and play an important role in the interaction between the host and pathogenic bacteria. A fast and inexpensive method for discovering bacterial effectors is therefore needed. In this study, we propose a multi-type secretion effector adaptive random forest (TSE-ARF) that adaptively identifies secretion effectors across T1SE-T4SE and T6SE based only on protein sequences. First, we propose two new feature descriptors that capture characteristic protein information and fuse them with several universal features to form a versatile 290-dimensional feature vector. Then, the TSE-ARF model makes classification predictions by adapting its parameters to each type of secretion effector, integrating the Shuffled Frog Leaping Algorithm with a random forest. The strong performance of TSE-ARF under different data sets and settings indicates considerable generalization ability, and the model was used to screen additional candidate effectors at the whole-genome level. Source code is available at https://github.com/AIMOVE/TSE-ARF.
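To illustrate the kind of parameter adaptation the abstract describes, the following is a minimal sketch of a generic Shuffled Frog Leaping Algorithm in pure Python. It is not the authors' implementation (see the repository linked above for that). The function name sfla_minimize, its parameters, and the toy objective toy_loss are hypothetical and chosen for illustration only. In a setting like TSE-ARF, the fitness function would presumably be a validation error of a random forest evaluated at a candidate set of hyperparameters; here a simple quadratic stands in so the example stays self-contained.

```python
import random


def sfla_minimize(fitness, bounds, n_frogs=20, n_memeplexes=4,
                  local_steps=5, n_shuffles=30, seed=0):
    """Generic SFLA sketch: minimize `fitness` over the box `bounds`.

    All names and default values are illustrative assumptions,
    not settings taken from the TSE-ARF paper.
    """
    rng = random.Random(seed)

    def random_frog():
        # A frog is one candidate solution (e.g. a hyperparameter vector).
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        # Keep every coordinate inside its allowed range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def leap(worst, target):
        # Move the worst frog a random fraction of the way to the target.
        return clip([w + rng.random() * (t - w)
                     for w, t in zip(worst, target)])

    frogs = [random_frog() for _ in range(n_frogs)]
    for _ in range(n_shuffles):
        # Rank all frogs, then deal them into memeplexes round-robin.
        frogs.sort(key=fitness)
        global_best = frogs[0]
        memeplexes = [frogs[i::n_memeplexes] for i in range(n_memeplexes)]
        for mem in memeplexes:
            for _ in range(local_steps):
                mem.sort(key=fitness)
                best, worst = mem[0], mem[-1]
                # 1) Try leaping toward the memeplex best.
                cand = leap(worst, best)
                # 2) If that does not improve, try the global best.
                if fitness(cand) >= fitness(worst):
                    cand = leap(worst, global_best)
                # 3) If still no improvement, replace with a random frog.
                if fitness(cand) >= fitness(worst):
                    cand = random_frog()
                mem[-1] = cand
        # Shuffle step: merge memeplexes back into one population.
        frogs = [f for mem in memeplexes for f in mem]
    return min(frogs, key=fitness)


def toy_loss(x):
    # Stand-in objective with its minimum at (3, -1).
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


best = sfla_minimize(toy_loss, [(-10.0, 10.0), (-10.0, 10.0)], seed=0)
print(best, toy_loss(best))
```

To adapt this sketch to classifier tuning, toy_loss would be replaced by a function that trains a model with the candidate parameters and returns its validation error; integer-valued parameters such as the number of trees would additionally need rounding inside that function.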

Keywords: Attention mechanism; Enhanced feature extraction; Multi-layer fusion; Multi-scale information; Secretion system.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bacteria / metabolism
  • Bacterial Proteins / metabolism
  • Random Forest*
  • Software

Substances

  • Bacterial Proteins