Building a machine-learning model to predict optimal mevalonate pathway gene expression levels for efficient production of a carotenoid in yeast

Biotechnol J. 2024 Jan;19(1):e2300285. doi: 10.1002/biot.202300285. Epub 2023 Nov 27.

Abstract

Simultaneous modification of the expression levels of many metabolic enzyme genes produces diverse expression ratios among these genes; however, the relationship between gene expression levels and chemical productivity remains unclear. Clarifying this relationship is expected to improve the productivity of useful chemicals, and supervised machine learning is considered an effective means of doing so. In this study, to improve carotenoid productivity in the yeast Saccharomyces cerevisiae, we aimed to build a machine-learning model that can predict the optimal gene expression levels for carotenoid production. First, we obtained data on the expression levels of mevalonate pathway enzyme genes and on carotenoid production. Then, from these data, we built a machine-learning model that predicts carotenoid productivity from gene expression levels. The model achieved a prediction accuracy (coefficient of determination) of 0.6292 on the test data. The maximum predicted carotenoid productivity was 4.3 times higher in the engineered strain than in the parental strain, and the model suggests that the expression levels of the mevalonate pathway enzyme genes tHMG1 and ERG8 have a particularly large impact on carotenoid productivity. This study could be an important step toward addressing the uncertainty of genotype-phenotype correlations, one of the challenges facing metabolic engineering strategies.
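
The accuracy metric quoted in the abstract, the coefficient of determination, can be illustrated with a minimal sketch using its standard definition, 1 - SS_res / SS_tot, evaluated on held-out data. The abstract does not state which model or software the authors used, so the function name `r_squared` and the example values below are hypothetical and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of the measurements from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted productivities (arbitrary units),
# made up for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(r_squared(measured, predicted))
```

A value of 1 means the predictions match the measurements exactly, and a value of 0 means the model does no better than always predicting the mean of the measurements.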

Keywords: carotenoid; machine-learning; metabolic engineering; Saccharomyces cerevisiae; yeast.

MeSH terms

  • Carotenoids / metabolism
  • Gene Expression
  • Machine Learning
  • Metabolic Engineering / methods
  • Mevalonic Acid* / metabolism
  • Saccharomyces cerevisiae* / metabolism

Substances

  • Mevalonic Acid
  • Carotenoids