Accurate Sampling of Macromolecular Conformations Using Adaptive Deep Learning and Coarse-Grained Representation

J Chem Inf Model. 2022 Apr 11;62(7):1602-1617. doi: 10.1021/acs.jcim.1c01438. Epub 2022 Mar 30.

Abstract

Conformational sampling of protein structures is essential for understanding biochemical functions and for predicting thermodynamic properties such as free energies. Where previous approaches rely on sequential sampling procedures, recent developments in generative deep neural networks have made possible the parallel, statistically independent sampling of molecular configurations. To accurately generate samples of large molecular systems from a high-dimensional multimodal equilibrium distribution function, we developed a hierarchical approach based on expressive normalizing flows with rational quadratic neural splines and a coarse-grained representation. Furthermore, system-specific priors and adaptive and property-based controlled learning were designed to reduce the likelihood of generating high-energy structures during sampling. Finally, backmapping from a coarse-grained to a fully atomistic representation is performed through an equivariant transformer model. We demonstrate the applicability of the method on the one-shot configurational sampling of a protein system with more than a hundred amino acids. The results show enhanced expressivity that diminishes the invertibility constraints inherent in the normalizing flow framework. Moreover, the capacity of the hierarchical normalizing flow model was tested on a challenging case study of the folding/unfolding dynamics of the peptide chignolin.
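
The following is a minimal sketch, not the authors' implementation, of the principle behind one-shot sampling with a normalizing flow: independent draws from a simple prior are pushed through an invertible map, and the change-of-variables formula gives the exact density of each sample. As a simplifying assumption, a one-dimensional affine map stands in for the rational quadratic spline layers named in the abstract; all function and parameter names (`flow_forward`, `scale`, `shift`, and so on) are hypothetical.

```python
import math
import random


def prior_log_prob(z):
    """Log-density of the standard normal prior."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def flow_forward(z, scale, shift):
    """Invertible affine map z -> x (stand-in for a spline layer).

    Returns x and log|dx/dz|.
    """
    return scale * z + shift, math.log(abs(scale))


def flow_inverse(x, scale, shift):
    """Inverse map x -> z. Returns z and log|dz/dx|."""
    return (x - shift) / scale, -math.log(abs(scale))


def sample(n, scale, shift, seed=0):
    """Draw n statistically independent samples with their log-densities."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x, log_det = flow_forward(z, scale, shift)
        # Change of variables: log q(x) = log p(z) - log|dx/dz|
        out.append((x, prior_log_prob(z) - log_det))
    return out


def log_prob(x, scale, shift):
    """Exact log-density of x under the flow, evaluated via the inverse map."""
    z, log_det_inv = flow_inverse(x, scale, shift)
    return prior_log_prob(z) + log_det_inv
```

Because each sample is drawn independently of the others, the loop in `sample` could run in parallel, in contrast to sequential procedures in which each configuration depends on the previous one. The invertibility requirement visible in `flow_inverse` is the constraint that more expressive layers aim to relax.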

MeSH terms

  • Deep Learning*
  • Macromolecular Substances
  • Molecular Conformation
  • Molecular Dynamics Simulation*
  • Proteins / chemistry
  • Thermodynamics

Substances

  • Macromolecular Substances
  • Proteins