Top-Down Machine Learning of Coarse-Grained Protein Force Fields

Carles Navarro; Maciej Majewski; Gianni De Fabritiis

doi:10.1021/acs.jctc.3c00638

J Chem Theory Comput. 2023 Nov 14;19(21):7518-7526. doi: 10.1021/acs.jctc.3c00638. Epub 2023 Oct 24.

Authors

Carles Navarro¹, Maciej Majewski¹, Gianni De Fabritiis² ³ ⁴

Affiliations

¹ Acellera Labs, Doctor Trueta 183, 08005 Barcelona, Spain.
² Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003 Barcelona, Spain.
³ Acellera Ltd., Devonshire House 582, Middlesex HA7 1JS, United Kingdom.
⁴ Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain.

Abstract

Developing accurate and efficient coarse-grained representations of proteins is crucial for understanding their folding, function, and interactions over extended time scales. Our methodology involves simulating proteins with molecular dynamics and utilizing the resulting trajectories to train a neural network potential through differentiable trajectory reweighting. Remarkably, this method requires only the native conformation of proteins, eliminating the need for labeled data derived from extensive simulations or memory-intensive end-to-end differentiable simulations. Once trained, the model can be employed to run parallel molecular dynamics simulations and sample folding events for proteins both within and beyond the training distribution, showcasing its extrapolation capabilities. By applying Markov state models, native-like conformations of the simulated proteins can be predicted from the coarse-grained simulations. Owing to its theoretical transferability and ability to use solely experimental static structures as training data, we anticipate that this approach will prove advantageous for developing new protein force fields and further advancing the study of protein dynamics, folding, and interactions.
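
The reweighting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the generic importance-reweighting identity, in which frames sampled under a reference potential are reweighted to estimate an ensemble average under a modified potential. The function names (`reweight`, `reweighted_average`, `effective_sample_size`), the toy one-dimensional harmonic potentials, and the choice of beta = 1 are all assumptions made for illustration.

```python
import numpy as np

def reweight(u_new, u_ref, beta=1.0):
    """Normalized weights w_i proportional to exp(-beta * (u_new_i - u_ref_i))."""
    log_w = -beta * (np.asarray(u_new, dtype=float) - np.asarray(u_ref, dtype=float))
    log_w -= log_w.max()  # shift for numerical stability; cancels after normalization
    w = np.exp(log_w)
    return w / w.sum()

def reweighted_average(observable, u_new, u_ref, beta=1.0):
    """Estimate <observable> under u_new from frames sampled under u_ref."""
    w = reweight(u_new, u_ref, beta)
    return float(np.sum(w * np.asarray(observable, dtype=float)))

def effective_sample_size(w):
    """Kish effective sample size; small values mean the estimate is unreliable."""
    return float(1.0 / np.sum(np.asarray(w) ** 2))

# Toy check (hypothetical system): frames drawn from U_ref(x) = 0.5 * x**2 at beta = 1,
# i.e. a standard normal, reweighted to the stiffer U_new(x) = 0.5 * k * x**2 with k = 2.
rng = np.random.default_rng(0)
x = rng.normal(size=200_000)
k = 2.0
u_ref = 0.5 * x**2
u_new = 0.5 * k * x**2

estimate = reweighted_average(x**2, u_new, u_ref)  # analytic value is 1 / k
weights = reweight(u_new, u_ref)
ess = effective_sample_size(weights)
```

In a differentiable setting, the potential of the modified ensemble would be a neural network evaluated inside an automatic-differentiation framework, so that gradients of the reweighted average with respect to the network parameters can be obtained without differentiating through the simulation itself. The effective sample size gives a rough signal of when the reference trajectories no longer overlap enough with the updated potential and fresh sampling is needed.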

MeSH terms

Machine Learning
Molecular Dynamics Simulation*
Protein Conformation
Protein Folding
Proteins* / chemistry

Substances

Proteins