SLPred: a multi-view subcellular localization prediction tool for multi-location human proteins

Bioinformatics. 2022 Sep 2;38(17):4226-4229. doi: 10.1093/bioinformatics/btac458.

Abstract

Summary: Accurate prediction of the subcellular locations (SLs) of proteins is a critical topic in protein science. In this study, we present SLPred, an ensemble-based multi-view and multi-label protein subcellular localization prediction tool. For a query protein sequence, SLPred provides predictions for nine main SLs using independent machine-learning models trained for each location. We used UniProtKB/Swiss-Prot human protein entries and their curated SL annotations as our source data. We connected all disjoint terms in the UniProt SL hierarchy based on the corresponding term relationships in the cellular component category of Gene Ontology and, using the re-organized hierarchy, constructed a training dataset that is both reliable and large-scale. We tested SLPred on multiple benchmarking datasets, including our in-house sets, and compared its performance against six state-of-the-art methods. Results indicated that SLPred outperforms the other tools in the majority of cases.
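The multi-label, per-location design described above can be illustrated with a minimal sketch. It is not SLPred's implementation: the averaging of view scores, the 0.5 threshold, the toy composition-based scorers, and the two location names are all assumptions made for illustration, since the abstract does not specify how views are combined or list the nine locations.

```python
from typing import Callable, Dict, List, Set

# A "view" model maps a protein sequence to a probability-like score in [0, 1].
ViewModel = Callable[[str], float]


def predict_locations(
    sequence: str,
    models: Dict[str, List[ViewModel]],
    threshold: float = 0.5,
) -> Set[str]:
    """Return every location whose averaged view score reaches the threshold."""
    predicted = set()
    for location, views in models.items():
        # Each location is scored independently, so a protein can receive
        # zero, one, or several labels (multi-label prediction).
        # Averaging the views is an assumed ensemble rule.
        score = sum(view(sequence) for view in views) / len(views)
        if score >= threshold:
            predicted.add(location)
    return predicted


# Hypothetical stand-ins for trained models: simple composition-based scorers.
def lysine_fraction(seq: str) -> float:
    return seq.count("K") / len(seq)


def leucine_fraction(seq: str) -> float:
    return seq.count("L") / len(seq)


# Location names are illustrative placeholders, not SLPred's label set.
toy_models: Dict[str, List[ViewModel]] = {
    "Nucleus": [lysine_fraction, lambda s: 1.0 if "KKKK" in s else 0.0],
    "Membrane": [leucine_fraction],
}

result = predict_locations("MKKKKLLAAK", toy_models)
print(sorted(result))
```

Because each location has its own decision, adding or retraining the model for one location leaves the others untouched, which is the practical appeal of training independent models per location.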

Availability and implementation: SLPred is available as both an open-access, user-friendly web server (https://slpred.kansil.org) and a stand-alone tool (https://github.com/kansil/SLPred). All datasets used in this study are also available at https://slpred.kansil.org.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Amino Acid Sequence
  • Computational Biology* / methods
  • Databases, Protein
  • Gene Ontology
  • Humans
  • Protein Transport
  • Proteins* / genetics

Substances

  • Proteins