ZetaDesign: an end-to-end deep learning method for protein sequence design and side-chain packing

Junyu Yan; Shuai Li; Ying Zhang; Aimin Hao; Qinping Zhao

doi:10.1093/bib/bbad257

Brief Bioinform. 2023 Jul 20;24(4):bbad257. doi: 10.1093/bib/bbad257.

Authors

Junyu Yan¹, Shuai Li¹, Ying Zhang², Aimin Hao¹, Qinping Zhao¹

Affiliations

¹ State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China.
² The Key Laboratory of Cell Proliferation and Regulation Biology, Ministry of Education, College of Life Sciences, Beijing Normal University, Beijing, China.

PMID: 37429578

Abstract

Computational protein design has proven in recent years to be the most powerful tool for protein sequence design and side-chain repacking. In practice, these two tasks are strongly related but are often treated separately. Moreover, state-of-the-art deep-learning-based methods offer no interpretability from an energy perspective, which affects the accuracy of the design. Here we propose a new systematic approach, comprising a posterior probability part and a joint probability part, to solve these two essential problems once and for all. This approach takes the physicochemical properties of amino acids into consideration and uses the joint probability model to ensure convergence between structure and amino acid type. Our results demonstrate that this method can generate feasible, high-confidence sequences with low-energy side-chain conformations. The designed sequences can fold into the target structures with high confidence and maintain relatively stable biochemical properties. The side-chain conformations have significantly lower energy, without relying on a rotamer library or performing expensive conformational searches. Overall, we propose an end-to-end method that combines the advantages of both deep learning and energy-based methods. The design results of this model demonstrate high efficiency and precision, as well as a low energy state and good interpretability.

Keywords: Deep learning; Fixed-backbone design; High structural accuracy; Protein sequence design.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Amino Acid Sequence
Amino Acids / chemistry
Deep Learning*
Models, Molecular
Protein Conformation
Proteins / chemistry

Substances

Proteins
Amino Acids