iBLP: An XGBoost-Based Predictor for Identifying Bioluminescent Proteins

Comput Math Methods Med. 2021 Jan 7;2021:6664362. doi: 10.1155/2021/6664362. eCollection 2021.

Abstract

Bioluminescent proteins (BLPs) are a class of proteins that are widely distributed in many living organisms and that emit light through various mechanisms, including bioluminescence and chemiluminescence from luminous organisms. Bioluminescence has been commonly used in analytical research methods for studying cellular processes, such as gene expression analysis, drug discovery, cellular imaging, and toxicity determination. However, the identification of bioluminescent proteins is challenging because they share poor sequence similarity with one another. In this paper, we briefly reviewed the development of the computational identification of BLPs and subsequently proposed a novel prediction framework for identifying BLPs based on the eXtreme gradient boosting algorithm (XGBoost) and sequence-derived features. To train the models, we collected BLP data from bacteria, eukaryotes, and archaea. Then, to obtain more effective prediction models, we examined the performance of different feature extraction methods and their combinations, as well as of different classification algorithms. Finally, based on the optimal model, a novel predictor named iBLP was constructed to identify BLPs. The robustness of iBLP was demonstrated by experiments on training and independent datasets. Comparison with previously published work further demonstrated that the proposed method is powerful and could provide good performance for BLP identification. The webserver and software package for BLP identification are freely available at http://lin-group.cn/server/iBLP.
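
To illustrate what "sequence-derived features" can look like in practice, the following minimal Python sketch turns a protein sequence into a fixed-length numeric vector. The abstract does not state which feature extraction methods iBLP uses, so the two encodings shown here (amino acid composition and dipeptide composition) are assumptions chosen because they are common in this kind of work, and the function names are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is aligned.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return 20 values: the fraction of each standard residue in the sequence.

    Assumed encoding for illustration; not confirmed as an iBLP feature.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return 400 values: the fraction of each adjacent residue pair.

    Pairs containing a non-standard character are skipped rather than
    bridged, so residues that are not neighbours are never paired.
    """
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    index = [a + b for a in AMINO_ACIDS for b in AMINO_ACIDS]
    if not valid:
        return [0.0] * len(index)
    return [counts[dp] / len(valid) for dp in index]


def featurize(sequence):
    """Concatenate both encodings into one 420-dimensional feature vector."""
    return amino_acid_composition(sequence) + dipeptide_composition(sequence)


if __name__ == "__main__":
    vector = featurize("ACAC")  # toy sequence, not a real BLP
    print(len(vector))
```

Vectors of this kind, computed for labeled BLP and non-BLP sequences, could then be passed to a gradient-boosting classifier such as `XGBClassifier` from the `xgboost` package. The actual features, hyperparameters, and training data behind iBLP are described in the full paper, not in this abstract.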

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Chemical Phenomena
  • Computational Biology
  • Databases, Protein
  • Drug Discovery
  • Luminescence
  • Luminescent Proteins* / chemistry
  • Luminescent Proteins* / genetics
  • Luminescent Proteins* / metabolism
  • Machine Learning
  • Software

Substances

  • Luminescent Proteins