Predicting plant Rubisco kinetics from RbcL sequence data using machine learning

J Exp Bot. 2023 Jan 11;74(2):638-650. doi: 10.1093/jxb/erac368.

Abstract

Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) is responsible for the conversion of atmospheric CO2 to organic carbon during photosynthesis, and often acts as a rate-limiting step in the latter process. Screening the natural diversity of Rubisco kinetics is the main strategy used to find better Rubisco enzymes for crop engineering efforts. Here, we demonstrate the use of Gaussian processes (GPs), a family of Bayesian models, coupled with protein encoding schemes, for predicting Rubisco kinetics from Rubisco large subunit (RbcL) sequence data. GPs trained on published, experimentally obtained Rubisco kinetic datasets were applied to over 9000 sequences encoding RbcL to predict Rubisco kinetic parameters. Notably, our predicted kinetic values were in agreement with known trends, e.g. higher carboxylation turnover rates (Kcat) for Rubisco enzymes from C4 or crassulacean acid metabolism (CAM) species, compared with those found in C3 species. This is the first study demonstrating machine learning approaches as a tool for screening and predicting Rubisco kinetics, which could be applied to other enzymes.
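The general idea of pairing a protein encoding with a GP can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the one-hot encoding, the RBF kernel, the length scale and noise values, the five-residue toy sequences, and the invented turnover values. A real analysis would use full-length aligned RbcL sequences, measured kinetics, and a GP library with fitted hyperparameters.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seq):
    """Encode a sequence as a flat 0/1 vector, 20 entries per residue."""
    vec = []
    for aa in seq:
        vec.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return vec

def rbf(x, y, length_scale=2.0):
    """RBF kernel; the length scale is an assumed, unfitted value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def cholesky(A):
    """Lower-triangular factor L with A = L L^T (A must be positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_lower_transposed(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(train_seqs, train_y, test_seq, noise=1e-2):
    """Return the GP posterior mean and variance for one test sequence."""
    X = [one_hot(s) for s in train_seqs]
    x_star = one_hot(test_seq)
    mean_y = sum(train_y) / len(train_y)  # constant prior mean
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    centred = [y - mean_y for y in train_y]
    alpha = solve_lower_transposed(L, solve_lower(L, centred))
    k_star = [rbf(x_star, x) for x in X]
    mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_star, x_star) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

# Hypothetical toy data: invented sequences and invented turnover values.
TRAIN_SEQS = ["MSPQT", "MSPKT", "MAPQT"]
TRAIN_KCAT = [3.0, 3.4, 2.8]

for query in ("MSPQT", "WWWWW"):
    mean, var = gp_predict(TRAIN_SEQS, TRAIN_KCAT, query)
    print(query, round(mean, 3), round(var, 3))
```

The posterior variance is what makes a GP attractive for screening: a query close to the training sequences gets a tight prediction, while a distant one is flagged as uncertain rather than silently extrapolated.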

Keywords: Enzyme; Gaussian process; Rubisco; kinetics; machine learning; photosynthesis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Carbon / metabolism
  • Carbon Dioxide / metabolism
  • Kinetics
  • Photosynthesis
  • Plants* / metabolism
  • Ribulose-Bisphosphate Carboxylase* / genetics
  • Ribulose-Bisphosphate Carboxylase* / metabolism

Substances

  • Ribulose-Bisphosphate Carboxylase
  • Carbon
  • Carbon Dioxide