An all-atom protein generative model

bioRxiv [Preprint]. 2023 May 25:2023.05.24.542194. doi: 10.1101/2023.05.24.542194.

Abstract

Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which instantiates a "superposition" over the possible sidechain states, and collapses it to conduct reverse diffusion for sample generation. When combined with sequence design methods, our model is able to co-design all-atom protein structure and sequence. Generated proteins are of good quality under the typical quality, diversity, and novelty metrics, and sidechains reproduce the chemical features and behavior of natural proteins. Finally, we explore the potential of our model conduct all-atom protein design and scaffold functional motifs in a backbone- and rotamer-free way.

Publication types

  • Preprint