Determining protein structures by combining semireliable data with atomistic physical models by Bayesian inference

Proc Natl Acad Sci U S A. 2015 Jun 2;112(22):6985-90. doi: 10.1073/pnas.1506788112. Epub 2015 May 18.

Abstract

More than 100,000 protein structures are now known at atomic detail. However, far more are not yet known, particularly among large or complex proteins. Often, experimental information is only semireliable because it is uncertain, limited, or confusing in important ways. Some experiments give sparse information, some give ambiguous or nonspecific information, and others give uncertain information, where some is right, some is wrong, but we don't know which. We describe a method called Modeling Employing Limited Data (MELD) that can harness such problematic information in a physics-based, Bayesian framework for improved structure determination. We apply MELD to eight proteins of known structure for which such problematic structural data are available, including a sparse NMR dataset, two ambiguous EPR datasets, and four uncertain datasets taken from sequence evolution data. MELD gives excellent structures, indicating its promise for experimental biomolecule structure determination where only semireliable data are available.
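
As a rough illustration of the general idea of combining a physical model with data that are only partly correct, the toy sketch below scores a few candidate conformations with Bayes' theorem. It is not the MELD implementation. The Boltzmann-style prior, the Gaussian restraint likelihood, the rule of trusting only the best-satisfied fraction of restraints, and every name (`posterior_weights`, `keep_fraction`, `kT`, `sigma`) are assumptions made here for illustration.

```python
import math


def posterior_weights(energies, restraint_errors, keep_fraction=0.5, kT=1.0, sigma=1.0):
    """Toy posterior over candidate conformations (hypothetical, illustrative only).

    energies[i]            physical-model energy of candidate i (acts as the prior)
    restraint_errors[i][j] violation of restraint j by candidate i
    keep_fraction          assumed fraction of restraints believed to be correct
    """
    log_post = []
    for energy, errors in zip(energies, restraint_errors):
        # Assumption: only the best-satisfied restraints count as "data".
        n_keep = max(1, round(keep_fraction * len(errors)))
        kept = sorted(abs(e) for e in errors)[:n_keep]
        log_prior = -energy / kT                                   # Boltzmann-like prior
        log_like = -sum(e * e for e in kept) / (2.0 * sigma ** 2)  # Gaussian likelihood
        log_post.append(log_prior + log_like)

    # Normalize in a numerically stable way.
    shift = max(log_post)
    unnorm = [math.exp(lp - shift) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Two candidates with equal energy. Candidate 0 satisfies two restraints exactly
# and badly violates two others; candidate 1 mildly violates all four.
energies = [0.0, 0.0]
errors = [[0.0, 0.0, 5.0, 5.0], [2.0, 2.0, 2.0, 2.0]]

half_trusted = posterior_weights(energies, errors, keep_fraction=0.5)
all_trusted = posterior_weights(energies, errors, keep_fraction=1.0)
print(half_trusted, all_trusted)
```

In this toy setup, trusting only half of the restraints favors candidate 0, while forcing all restraints to count favors the compromise candidate 1. That contrast is the kind of behavior one would want to control when some data are wrong but it is not known which.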

Keywords: Bayesian inference; integrative structural biology; molecular modeling; protein structure.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Models, Molecular*
  • Molecular Biology / methods*
  • Protein Conformation
  • Proteins / chemistry*

Substances

  • Proteins