Machine Learning-Guided Optimization of p-Coumaric Acid Production in Yeast

ACS Synth Biol. 2024 Apr 19;13(4):1312-1322. doi: 10.1021/acssynbio.4c00035. Epub 2024 Mar 28.

Abstract

Industrial biotechnology uses Design-Build-Test-Learn (DBTL) cycles to accelerate the development of microbial cell factories, which are required for the transition to a biobased economy. To use these cycles effectively, appropriate connections between their phases are crucial. Using p-coumaric acid (pCA) production in Saccharomyces cerevisiae as a case study, we propose one-pot library generation, random screening, targeted sequencing, and machine learning (ML) as links between the phases of DBTL cycles. We show that the robustness and flexibility of the ML models enable pathway optimization, and we propose feature importance and Shapley additive explanation (SHAP) values as a guide for expanding the design space of the original libraries. This approach yielded a 68% increase in pCA production within two DBTL cycles, reaching a titer of 0.52 g/L and a yield of 0.03 g/g on glucose.
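
To make the idea of Shapley-based attribution concrete, the following is a minimal sketch and not the authors' pipeline. It computes exact Shapley values for a toy titer predictor by enumerating feature subsets, attributing the predicted gain of one design over a single baseline design to each gene. The gene names (`geneA`, `geneB`, `geneC`), the `toy_titer` function, and all coefficients are hypothetical and chosen only for illustration. Practical SHAP tooling typically approximates these values against a background dataset and a trained model rather than a single baseline.

```python
from itertools import combinations
from math import factorial

# Hypothetical gene names; these do not come from the paper.
GENES = ["geneA", "geneB", "geneC"]


def toy_titer(design):
    """Hypothetical predictor mapping promoter strengths (0..1) to a titer in g/L."""
    a, b, c = (design[g] for g in GENES)
    # Additive terms plus one interaction between geneA and geneB (made-up values).
    return 0.10 + 0.20 * a + 0.05 * b + 0.10 * a * b + 0.02 * c


def shapley_values(model, design, baseline):
    """Exact Shapley values of `design` relative to `baseline` by subset enumeration."""
    n = len(GENES)
    values = {}
    for gene in GENES:
        others = [g for g in GENES if g != gene]
        total = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight for a coalition of size k out of n features.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Genes in the coalition take the design's value; the rest stay at baseline.
                without = {g: (design[g] if g in subset else baseline[g]) for g in GENES}
                with_gene = dict(without, **{gene: design[gene]})
                total += weight * (model(with_gene) - model(without))
        values[gene] = total
    return values


baseline = {"geneA": 0.0, "geneB": 0.0, "geneC": 0.0}
design = {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0}
phi = shapley_values(toy_titer, design, baseline)

for gene in GENES:
    print(f"{gene}: {phi[gene]:+.3f} g/L")
```

In this toy setting the attributions sum to the difference between the predicted titer of the design and that of the baseline, and the interaction term is split equally between the two genes involved. A gene with a large attribution would be a natural candidate for a wider range of expression levels in a subsequent library, which is the kind of guidance the abstract describes.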

Keywords: DBTL; Saccharomyces cerevisiae; machine learning; one-pot library.

MeSH terms

  • Coumaric Acids* / metabolism
  • Machine Learning
  • Metabolic Engineering
  • Saccharomyces cerevisiae* / genetics

Substances

  • p-coumaric acid
  • Coumaric Acids