Feature space resampling for protein conformational search

Ben Blum; Michael I Jordan; David Baker

doi:10.1002/prot.22677

Proteins. 2010 May 1;78(6):1583-93. doi: 10.1002/prot.22677.

Authors

Ben Blum¹, Michael I Jordan, David Baker

Affiliation

¹ Department of Electrical Engineering and Computer Science, University of California, Berkeley, 94720, USA. benblum@gmail.com

Abstract

De novo protein structure prediction requires location of the lowest energy state of the polypeptide chain among a vast set of possible conformations. Powerful approaches include conformational space annealing, in which search progressively focuses on the most promising regions of conformational space, and genetic algorithms, in which features of the best conformations thus far identified are recombined. We describe a new approach that combines the strengths of these two approaches. Protein conformations are projected onto a discrete feature space which includes backbone torsion angles, secondary structure, and beta pairings. For each of these features there is one "native" value: the one found in the native structure. We begin with a large number of conformations generated in independent Monte Carlo structure prediction trajectories from Rosetta. Native values for each feature are predicted from the frequencies of feature value occurrences and the energy distribution in conformations containing them. A second round of structure prediction trajectories is then guided by the predicted native feature distributions. We show that native features can be predicted at much higher than background rates, and that using the predicted feature distributions improves structure prediction in a benchmark of 28 proteins. The advantages of our approach are that features from many different input structures can be combined simultaneously without producing atomic clashes or otherwise physically inviable models, and that the features being recombined have a relatively high chance of being correct.

© 2009 Wiley-Liss, Inc.
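
The resampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' predictor and not Rosetta code: the Boltzmann-style weighting, the `temperature` parameter, the function names, the feature names, and the toy decoy data are all assumptions made for illustration. The sketch only shows one plausible way to turn feature value frequencies and energies from a first round of decoys into per-feature distributions, and then to draw feature values that could guide a second round.

```python
import math
import random
from collections import defaultdict


def predict_feature_distributions(decoys, temperature=1.0):
    """Estimate a probability distribution over values for each discrete feature.

    decoys: list of (features, energy) pairs, where features maps a feature
    name to its discrete value in that decoy.
    Assumption: each decoy votes for its feature values with a weight
    exp(-(E - E_min) / temperature), so frequent values found in low-energy
    decoys receive the most probability mass.
    """
    min_energy = min(energy for _, energy in decoys)
    weights = defaultdict(lambda: defaultdict(float))
    for features, energy in decoys:
        w = math.exp(-(energy - min_energy) / temperature)
        for name, value in features.items():
            weights[name][value] += w

    distributions = {}
    for name, value_weights in weights.items():
        total = sum(value_weights.values())
        distributions[name] = {v: w / total for v, w in value_weights.items()}
    return distributions


def sample_feature_constraints(distributions, rng):
    """Draw one value per feature from the predicted distributions."""
    return {
        name: rng.choices(list(dist), weights=list(dist.values()))[0]
        for name, dist in distributions.items()
    }


# Toy first-round decoys (made-up values, for illustration only).
decoys = [
    ({"res12_torsion": "A", "pairing_3_9": "paired"}, -120.0),
    ({"res12_torsion": "A", "pairing_3_9": "unpaired"}, -118.0),
    ({"res12_torsion": "B", "pairing_3_9": "unpaired"}, -95.0),
]

distributions = predict_feature_distributions(decoys, temperature=5.0)
constraints = sample_feature_constraints(distributions, random.Random(0))
```

Here each call to `sample_feature_constraints` would supply the feature values for one hypothetical second-round trajectory. Because features are sampled independently and then handed to a structure prediction run, rather than spliced directly between models, the recombination itself cannot introduce atomic clashes, which is the advantage the abstract highlights.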

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Databases, Protein
Protein Structure, Secondary*
Proteins / chemistry*

Substances

Proteins
