StructureDistiller: Structural relevance scoring identifies the most informative entries of a contact map

Sci Rep. 2019 Dec 6;9(1):18517. doi: 10.1038/s41598-019-55047-4.

Abstract

Protein folding and structure prediction are two sides of the same coin. Contact maps and the related techniques of constraint-based structure reconstruction can be considered unifying aspects of both processes. We present the Structural Relevance (SR) score, which quantifies the information content of individual contacts and residues in the context of the whole native structure. The physical process of protein folding is commonly characterized with spatial and temporal resolution: some residues are Early Folding while others are Highly Stable with respect to unfolding events. We employ the proposed SR score to demonstrate that folding initiation and structure stabilization are subprocesses realized by distinct sets of residues. The example of cytochrome c is used to demonstrate how StructureDistiller identifies the most important contacts needed for correct protein folding. This shows that entries of a contact map are not equally relevant for structural integrity. The proposed StructureDistiller algorithm identifies contacts with the highest information content; these entries convey unique constraints not captured by other contacts. Identifying the most informative contacts effectively doubles resilience toward false-positive contacts, that is, contacts not observed in the native contact map. Furthermore, this knowledge significantly increases reconstruction fidelity on sparse contact maps, by 0.4 Å.
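
The idea that some contact map entries carry unique constraints while others are redundant can be illustrated with a minimal sketch. It is not the StructureDistiller implementation: the 8 Å distance cutoff, the leave-one-out scheme, the function names, and the toy error function are all assumptions made for illustration. In a realistic setting, the error function would run a constraint-based reconstruction and compare the result with the native structure.

```python
import math
from itertools import combinations


def contact_map(coords, cutoff=8.0):
    """Return residue index pairs (i, j), i < j, whose distance is below cutoff.

    coords: one (x, y, z) tuple per residue (e.g. C-alpha positions).
    cutoff: distance threshold in Angstrom (8.0 is an assumed, common choice).
    """
    contacts = set()
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) < cutoff:
            contacts.add((i, j))
    return contacts


def leave_one_out_scores(contacts, error_fn):
    """Score each contact by the rise in error_fn when that contact is removed.

    error_fn is a hypothetical stand-in for "reconstruct from these contacts
    and measure the deviation from the native structure".
    """
    baseline = error_fn(contacts)
    return {c: error_fn(contacts - {c}) - baseline for c in contacts}


# Toy example: four residues on a line, 5 Angstrom apart.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
contacts = contact_map(coords)


def uncovered_residues(contact_set):
    """Toy error: number of residues not constrained by any contact."""
    covered = {residue for pair in contact_set for residue in pair}
    return len(coords) - len(covered)


scores = leave_one_out_scores(contacts, uncovered_residues)
```

In this toy case, removing either terminal contact leaves a residue unconstrained, so those contacts score higher than the middle contact, whose constraint is already implied by its neighbors.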

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology / methods*
  • Cytochromes c / chemistry
  • Databases, Protein*
  • Horses
  • Hydrogen / chemistry
  • Hydrogen Bonding
  • Mutation
  • Myocardium / metabolism
  • Protein Conformation*
  • Protein Folding
  • Proteins / chemistry
  • Software

Substances

  • Proteins
  • Hydrogen
  • Cytochromes c