Mapping Conformational Space of All 8000 Tripeptides by Quantum Chemical Methods: What Strain Is Affordable within Folded Protein Chains?

J Phys Chem B. 2021 Jan 14;125(1):58-69. doi: 10.1021/acs.jpcb.0c09251. Epub 2021 Jan 4.

Abstract

To gain more insight into the physicochemical aspects of protein structure from first principles, the conformational space of all 8000 "capped" tripeptides (i.e., N-Ac-X1X2X3-NH-CH3, where Xi is one of the 20 natural amino acids) was investigated computationally. An enormous dataset (denoted P-CONF_1.6M and containing close to 1 600 000 conformers in total) has been obtained by employing a composite protocol combining density functional theory, semiempirical quantum mechanics (SQM), and state-of-the-art solvation methods, with 1000 K molecular dynamics (MD) used to generate initial structures (200 snapshots for each tripeptide). This allowed us to present the first rigorous QM-based glimpse at the vast conformational space spanned by small protein fragments. The same computational procedure was repeated for tripeptide fragments taken from the SCOPe database of three-dimensional protein folds, by restraining them to their geometry in a protein. Such complementary data allowed us to compare the distribution of conformational strain energies of unrestrained tripeptidic fragments "in solvent" with those in existing protein chains. Besides providing a rigorous (ab initio) proof of a few well-known concepts and hypotheses concerning protein structures, such as the distribution of (φ, ψ) angles in Ramachandran plots, we have made several observations that came as something of a surprise: (1) the distribution of conformational energies does not significantly differ between the "unbiased/unrestrained" conformers obtained from MD sampling in solvent and the biased conformers, i.e., those of a given tripeptide obtained from protein structures; (2) a conformational (strain) energy window of up to ∼20 to 25 kcal·mol⁻¹ is readily available to tripeptide fragments within the context of a protein chain; (3) overpopulation in certain regions of the Ramachandran plot was observed for the unbiased conformers.
Last but not least, the massive dataset of accurate (DFT-D3//COSMO-RS) conformational (free) energies of ∼1.6 M peptide conformers, P-CONF_1.6M, obtained throughout this work may serve as an excellent dataset for calibrating and benchmarking popular force fields.
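
The dataset size quoted above follows from simple counting: 20 choices at each of three positions give 20³ = 8000 capped tripeptides, and 200 MD snapshots per tripeptide give 1 600 000 initial structures, consistent with the "close to 1 600 000" conformers reported. A minimal Python sketch of that enumeration is given below. The one-letter residue codes, the label format, and all variable names are illustrative assumptions and are not taken from the paper's actual workflow.

```python
from itertools import product

# One-letter codes of the 20 natural amino acids (illustrative labels only).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Number of MD snapshots per tripeptide, as stated in the abstract.
SNAPSHOTS_PER_TRIPEPTIDE = 200


def capped_tripeptides():
    """Yield a label for every capped tripeptide N-Ac-X1X2X3-NH-CH3."""
    for x1, x2, x3 in product(AMINO_ACIDS, repeat=3):
        yield f"N-Ac-{x1}{x2}{x3}-NH-CH3"


tripeptides = list(capped_tripeptides())

# Number of initial structures generated before any filtering or merging.
n_initial_structures = len(tripeptides) * SNAPSHOTS_PER_TRIPEPTIDE

print(len(tripeptides), n_initial_structures)
```

The product of snapshots and tripeptides is the number of starting geometries only; the abstract gives the final conformer count as approximate and does not say how it relates to that number.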

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids
  • Molecular Conformation
  • Molecular Dynamics Simulation
  • Peptides*
  • Proteins*
  • Quantum Theory

Substances

  • Amino Acids
  • Peptides
  • Proteins