StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

BMC Bioinformatics. 2011 Jun 2;12:226. doi: 10.1186/1471-2105-12-226.

Abstract

Background: Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences when the level of sequence identity between the compared proteins is low. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures.

Results: Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses.
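The residue-frequency idea described above can be illustrated with a minimal Python sketch. This is not the StralSV implementation: the input format (a list of `(reference_position, aligned_residue)` pairs), the function names `residue_frequencies` and `conserved_positions`, the 0.8 conservation threshold, and the example positions and residues are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def residue_frequencies(aligned_pairs):
    """Return {reference_position: {residue: fraction}}.

    aligned_pairs is a hypothetical input: an iterable of
    (reference_position, aligned_residue) tuples, one per
    residue-residue pair taken from a local structure alignment.
    """
    counts = defaultdict(Counter)
    for position, residue in aligned_pairs:
        counts[position][residue] += 1

    freqs = {}
    for position, counter in counts.items():
        total = sum(counter.values())
        freqs[position] = {res: n / total for res, n in counter.items()}
    return freqs


def conserved_positions(freqs, threshold=0.8):
    """Return {position: residue} where one residue reaches the threshold.

    The threshold is an arbitrary illustrative cutoff, not a value
    taken from the paper.
    """
    conserved = {}
    for position, dist in freqs.items():
        residue, fraction = max(dist.items(), key=lambda kv: kv[1])
        if fraction >= threshold:
            conserved[position] = residue
    return conserved


# Made-up example data: position 10 is dominated by one residue,
# position 11 is variable.
pairs = [
    (10, "D"), (10, "D"), (10, "D"), (10, "D"), (10, "E"),
    (11, "G"), (11, "A"), (11, "S"), (11, "G"),
]
freqs = residue_frequencies(pairs)
conserved = conserved_positions(freqs)
```

In this toy input, position 10 would be reported as conserved and position 11 would not. A residue with a low fraction at an otherwise conserved position (here, `E` at position 10) is the kind of deviation from consensus that the abstract suggests may merit closer inspection.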

Conclusions: StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Amino Acid Motifs
  • Amino Acid Sequence
  • DNA Primers / genetics
  • Models, Molecular
  • Molecular Sequence Data
  • Poliovirus / enzymology*
  • Poliovirus / metabolism
  • RNA-Dependent RNA Polymerase / chemistry*
  • RNA-Dependent RNA Polymerase / metabolism
  • Structural Homology, Protein*

Substances

  • DNA Primers
  • RNA-Dependent RNA Polymerase