Vorescore--fold recognition improved by rescoring of protein structure models

Bioinformatics. 2010 Sep 15;26(18):i474-81. doi: 10.1093/bioinformatics/btq369.

Abstract

Summary: Identifying good protein structure models and ranking them appropriately is a crucial problem in structure prediction and fold recognition. For many alignment methods, rescoring alignment-induced models with structural information can separate useful from less useful models better than the alignment score does. We introduce Vorescore, a template-based system for rescoring protein structure models. The method uses Vorolign to score the model structure against the template used for the modeling. It works on models from different alignment methods and incorporates knowledge from both the prediction method and the rescoring.
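As a rough illustration of combining the prediction method's score with a structural rescoring score, the following minimal sketch re-ranks candidate models by a weighted sum of standardized scores. The abstract does not specify how Vorescore combines the two sources, so the z-score normalization, the `weight` parameter, and the function names `zscores` and `rerank` are assumptions made for this example only.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize scores to zero mean and unit variance (zeros if all equal)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rerank(models, weight=0.5):
    """Rank models by a weighted sum of two standardized scores.

    models: list of (name, alignment_score, rescoring_score); higher is better.
    weight: hypothetical mixing parameter (1.0 = alignment score only,
            0.0 = rescoring score only). Not taken from the paper.
    """
    align_z = zscores([m[1] for m in models])
    rescore_z = zscores([m[2] for m in models])
    combined = [
        (name, weight * a + (1 - weight) * r)
        for (name, _, _), a, r in zip(models, align_z, rescore_z)
    ]
    # Best combined score first.
    return sorted(combined, key=lambda item: item[1], reverse=True)


# Made-up scores for three candidate models of one target.
models = [
    ("model_a", 120.0, 0.30),
    ("model_b", 95.0, 0.80),
    ("model_c", 60.0, 0.40),
]
ranking = rerank(models, weight=0.5)
```

With these made-up numbers, the alignment score alone would rank `model_a` first, while the combined score moves `model_b` to the top, which is the kind of reordering a rescoring step is meant to produce.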

Results: The performance of Vorescore is evaluated in a large-scale and difficult protein structure prediction context. We use different threading methods to create models for 410 targets in three scenarios: (i) family members are contained in the template set; (ii) superfamily members (but no family members) are contained; and (iii) only fold members (but no family or superfamily members) are contained. In all cases Vorescore significantly improves model quality (e.g. by 40% for both Gotoh and HHalign at the fold level) and clearly outperforms the state-of-the-art physics-based model scoring system Rosetta. Moreover, Vorescore improves on other successful rescoring approaches such as Pcons and ProQ. In an additional experiment we add high-quality models based on structural alignments to the set, which allows Vorescore to improve the fold recognition rate by another 50%.
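The three template-set scenarios can be sketched as a filter over a template library. The abstract does not say which classification is used, so this example assumes SCOP-style identifiers of the form `class.fold.superfamily.family` (e.g. `a.1.1.2`). The names `relation`, `allowed_templates`, and `EXCLUDED` are hypothetical, and handling of the target's own structure is left out.

```python
def relation(target_id, template_id):
    """Return the deepest classification level two identifiers share.

    Identifiers are assumed to look like 'class.fold.superfamily.family'.
    """
    t = target_id.split(".")
    s = template_id.split(".")
    if t == s:
        return "family"
    if t[:3] == s[:3]:
        return "superfamily"
    if t[:2] == s[:2]:
        return "fold"
    return "unrelated"


# Relations that must be removed from the template set in each scenario.
EXCLUDED = {
    "family": set(),                        # (i) family members allowed
    "superfamily": {"family"},              # (ii) no family members
    "fold": {"family", "superfamily"},      # (iii) no family or superfamily members
}


def allowed_templates(target_id, templates, scenario):
    """Names of templates that may be used for a target in a given scenario."""
    return [
        name
        for name, template_id in templates.items()
        if relation(target_id, template_id) not in EXCLUDED[scenario]
    ]


# Made-up template library: name -> classification identifier.
templates = {"t1": "a.1.1.2", "t2": "a.1.1.3", "t3": "a.1.2.1", "t4": "b.1.1.1"}
fold_level = allowed_templates("a.1.1.2", templates, "fold")
```

In the fold scenario the closest remaining template for target `a.1.1.2` is `t3`, which shares only the fold; this is the hardest of the three settings.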

Availability: All models of the test set (about 2 million, 44 GB gzipped) are available upon request.

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Models, Molecular
  • Models, Structural
  • Protein Conformation*
  • Protein Folding*
  • Proteins / chemistry*
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Software*

Substances

  • Proteins