Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling

Biotechnol Bioeng. 2017 Feb;114(2):321-334. doi: 10.1002/bit.26073. Epub 2016 Aug 30.

Abstract

The quality of biopharmaceuticals and patients' safety are of the highest priority, and tremendous efforts are being made to replace empirical production process designs with knowledge-based approaches. The main challenge in this context is that real-time access to process variables related to product quality and quantity is severely limited. To date, comprehensive online and offline monitoring platforms are used to generate process data sets that allow for the development of mechanistic and/or data-driven models for real-time prediction of these important quantities. The ultimate goal is to implement model-based feedback control loops that facilitate online control of product quality. In this contribution, we explore structured additive regression (STAR) models in combination with boosting as a variable selection tool for modeling the cell dry mass, product concentration, and optical density on the basis of process variables available online and two-dimensional fluorescence spectroscopic data. STAR models are powerful extensions of linear models that allow for the inclusion of smooth effects or interactions between predictors. Boosting constructs the final model in a stepwise manner and provides a variable importance measure via predictor selection frequencies. Our results show that the cell dry mass can be modeled with a relative error of about ±3%, the optical density with ±6%, the soluble protein with ±16%, and the insoluble product with an accuracy of ±12%. Biotechnol. Bioeng. 2017;114: 321-334. © 2016 Wiley Periodicals, Inc.
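The stepwise construction and the selection-frequency importance measure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes component-wise L2 boosting with one simple linear base-learner per predictor (no smooth or interaction terms, which a full STAR model would include), and it runs on synthetic data rather than fermentation measurements. The function name `componentwise_l2_boost`, the step count, and the step length `nu` are hypothetical choices made for illustration.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per predictor.

    Illustrative assumption: linear base-learners only. Returns the intercept,
    the coefficient vector, and the selection frequency of each predictor.
    """
    n, p = X.shape
    # Center the predictors so each base-learner needs only a slope.
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    offset = y.mean()
    coef = np.zeros(p)
    selected = np.zeros(p, dtype=int)
    resid = y - offset
    col_ss = (Xc ** 2).sum(axis=0)
    for _ in range(n_steps):
        # Least-squares slope of the current residuals on each predictor alone.
        beta = Xc.T @ resid / col_ss
        # Reduction in residual sum of squares offered by each candidate.
        gain = beta ** 2 * col_ss
        j = int(np.argmax(gain))
        # Update only the best predictor, shrunk by the step length nu.
        coef[j] += nu * beta[j]
        resid = resid - nu * beta[j] * Xc[:, j]
        selected[j] += 1
    intercept = offset - x_mean @ coef
    return intercept, coef, selected / n_steps


# Synthetic example: only predictors 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.1, size=200)

intercept, coef, freq = componentwise_l2_boost(X, y)
print("selection frequencies:", np.round(freq, 2))
print("coefficients:", np.round(coef, 2))
```

In this sketch, predictors that are chosen in many boosting steps receive high selection frequencies, while predictors that are never chosen keep a coefficient of exactly zero, which is how boosting doubles as a variable selection tool.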

Keywords: Escherichia coli; boosting; machine learning; modeling; recombinant protein production; structured additive regression model.

MeSH terms

  • Algorithms
  • Batch Cell Culture Techniques / methods*
  • Bioreactors / microbiology
  • Escherichia coli / genetics
  • Escherichia coli / metabolism*
  • Fermentation
  • Machine Learning
  • Models, Biological*
  • Recombinant Proteins / chemistry*
  • Recombinant Proteins / genetics
  • Recombinant Proteins / metabolism*
  • Regression Analysis
  • Solubility

Substances

  • Recombinant Proteins