PL-search: a profile-link-based search method for protein remote homology detection

Brief Bioinform. 2021 May 20;22(3):bbaa051. doi: 10.1093/bib/bbaa051.

Abstract

Protein remote homology detection is a fundamental task in protein structure and function analysis. Several search methods have been proposed to improve both the detection of remote homologues and the accuracy of the resulting ranking lists. Position-specific scoring matrix (PSSM) profiles and hidden Markov model (HMM) profiles contribute to the performance of state-of-the-art search methods. In this paper, we improve the profile-link (PL) information used to construct PSSM or HMM profiles and propose a PL-based search method (PL-search). In PL-search, more robust PLs are constructed through double-link and iterative extending strategies, and an accurate similarity score for each sequence pair is calculated from a two-level Jaccard distance for identifying remote homologues. We tested our method on two widely used benchmark datasets. The results show that, whether HHblits, JackHMMER or position-specific iterated BLAST (PSI-BLAST) is used, PL-search clearly improves search performance in terms of both ranking quality and the number of detected remote homologues. For ease of use, PL-search is available as both a stand-alone tool and a web server at http://bliulab.net/PL-search/.
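
The abstract scores sequence pairs with a Jaccard distance over profile links but does not define the two-level variant. As a minimal sketch of the underlying idea only, the plain (single-level) Jaccard distance between two sets of linked sequences can be computed as below. The `links` dictionary, the sequence identifiers and the function name are hypothetical illustrations, not part of PL-search.

```python
def jaccard_distance(a, b):
    """Plain Jaccard distance between two sets: 1 - |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical profile links: each sequence maps to the set of database
# sequences that its profile search hit (assumed data, for illustration).
links = {
    "query": {"s1", "s2", "s3"},
    "target": {"s2", "s3", "s4"},
    "unrelated": {"s5"},
}

# Two sequences sharing many linked hits get a small distance.
d_close = jaccard_distance(links["query"], links["target"])
# Sequences with no shared hits get the maximum distance of 1.0.
d_far = jaccard_distance(links["query"], links["unrelated"])
```

Here a smaller distance means two sequences share more linked hits, which is the intuition behind using link overlap as evidence of remote homology. How PL-search combines two levels of such distances is described in the full paper, not in this abstract.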

Keywords: HHblits; Jaccard distance; profile-link-based search method (PL-search); protein remote homology detection.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Databases, Protein
  • Datasets as Topic
  • Position-Specific Scoring Matrices
  • Protein Conformation
  • Proteins / chemistry
  • Proteins / metabolism*
  • Sequence Analysis, Protein / methods

Substances

  • Proteins