General Protein Data Bank-Based Collective Variables for Protein Folding

J Chem Theory Comput. 2016 Jan 12;12(1):29-35. doi: 10.1021/acs.jctc.5b00714. Epub 2015 Dec 21.

Abstract

New, automated forms of data analysis are required to understand the high-dimensional trajectories that are obtained from molecular dynamics simulations of proteins. Dimensionality reduction algorithms are particularly appealing in this regard, as they allow one to construct unbiased, low-dimensional representations of the trajectory using only the information encoded in the trajectory. The downside of this approach is that a different set of coordinates is required for each chemical system under study, precisely because the coordinates are constructed using information from the trajectory. In this paper, we show how one can resolve this problem by using the sketch-map algorithm that we recently proposed to construct a low-dimensional representation of the structures contained in the Protein Data Bank. We show that the resulting coordinates are as useful for analyzing trajectory data as coordinates constructed using landmark configurations taken from the trajectory, and that these coordinates can thus be used for understanding protein folding across a range of systems.
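The general idea of building a map from a fixed reference set and then projecting new frames onto it can be illustrated with a minimal sketch. The code below is not the sketch-map algorithm and is not taken from the paper: it swaps in landmark multidimensional scaling (classical MDS with a closed-form out-of-sample projection) as a simpler stand-in. The function names `fit_landmark_map` and `project`, the use of plain Euclidean distances, and the random arrays standing in for structural descriptors are all assumptions made for illustration.

```python
import numpy as np


def fit_landmark_map(landmarks, n_dims=2):
    """Fit a classical-MDS map to a reference (landmark) set.

    landmarks: array of shape (n_landmarks, n_features). In a protein
    setting these would be structural descriptors of reference structures;
    here they are arbitrary feature vectors (an assumption of this sketch).
    """
    # Squared Euclidean distances between every pair of landmarks.
    diff = landmarks[:, None, :] - landmarks[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # Double-centre the squared distances to obtain a Gram matrix.
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering

    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    evals, evecs = np.linalg.eigh(gram)
    order = np.argsort(evals)[::-1][:n_dims]
    evals, evecs = evals[order], evecs[:, order]
    if np.any(evals <= 0):
        raise ValueError("need n_dims positive eigenvalues")

    return {
        "landmarks": landmarks,
        "coords": evecs * np.sqrt(evals),   # low-dimensional landmark positions
        "pinv_t": evecs / np.sqrt(evals),   # transposed pseudo-inverse of coords
        "mean_d2": d2.mean(axis=0),         # mean squared distance per landmark
    }


def project(model, frames):
    """Project new frames onto a map that was fitted on the landmarks only."""
    diff = frames[:, None, :] - model["landmarks"][None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)         # shape (n_frames, n_landmarks)
    return -0.5 * (d2 - model["mean_d2"]) @ model["pinv_t"]


# Hypothetical usage: the same fitted map is reused for two unrelated
# "trajectories" (random stand-in data, not real simulation output).
rng = np.random.default_rng(1)
reference_set = rng.normal(size=(50, 8))
model = fit_landmark_map(reference_set, n_dims=2)
trajectory_a = project(model, rng.normal(size=(100, 8)))
trajectory_b = project(model, rng.normal(size=(100, 8)))
```

Because the map depends only on the reference set, frames from different systems land in the same low-dimensional coordinate system, which is the property the abstract attributes to coordinates built from Protein Data Bank structures rather than from each trajectory.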

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / chemistry
  • Bacterial Proteins / metabolism
  • Databases, Protein
  • Protein Folding
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / metabolism

Substances

  • Bacterial Proteins
  • IgG Fc-binding protein, Streptococcus
  • Proteins