Data-driven strategies for the computational design of enzyme thermal stability: trends, perspectives, and prospects

Acta Biochim Biophys Sin (Shanghai). 2023 Mar 25;55(3):343-355. doi: 10.3724/abbs.2023033.

Abstract

Thermal stability is one of the most important properties of enzymes: it sustains life and determines the potential of biocatalysts for industrial application. Although traditional methods such as directed evolution and classical rational design have contributed greatly to this field, the enormous sequence space of proteins makes the required experiments costly and arduous. Driven by breakthroughs in high-throughput DNA sequencing and machine learning models, enzyme engineering increasingly focuses on automated and efficient strategies. In this review, we propose a data-driven architecture for enzyme thermostability engineering and summarize widely adopted datasets as well as machine learning-driven approaches for designing the thermal stability of enzymes. We also present existing challenges in applying machine learning to enzyme thermostability design, such as the data dilemma, model training, and use of the proposed models, and discuss a few promising directions for enhancing model performance. We anticipate that the efficient incorporation of machine learning can provide more insights and solutions for the design of enzyme thermostability in the coming years.

Keywords: data-driven; enzyme design; machine learning; thermal stability.

Publication types

  • Review

MeSH terms

  • Enzyme Stability
  • Protein Engineering*

Grants and funding

This work was supported by grants from the National Natural Science Foundation of China (No. 32100022) and the Key Research and Development Program of Shandong Province (No. 2020CXGC010601).