An integrated approach to optimization of Escherichia coli fermentations using historical data

Biotechnol Bioeng. 2003 Nov 5;84(3):274-85. doi: 10.1002/bit.10719.

Abstract

Using a fermentation database for Escherichia coli producing green fluorescent protein (GFP), we have implemented a novel three-step optimization method that identifies the process input variables most important in modeling the fermentation, as well as the values of those critical inputs that increase the desired output. In the first step of this algorithm, we use either decision-tree analysis (DTA) or information theoretic subset selection (ITSS) as a database mining technique to identify which process input variables best classify each of the process outputs (maximum cell concentration, maximum product concentration, and productivity) monitored in the experimental fermentations. In the second step, an artificial neural network (ANN) model of the process input-output data is trained using the critical inputs identified in the first step. Finally, a hybrid genetic algorithm (hybrid GA), which includes both gradient and stochastic search methods, is used to identify the maximum output modeled by the ANN and the input conditions that produce that maximum. The two database mining techniques are compared in terms of both the inputs selected and the subsequent ANN performance. For the E. coli process used in this study, we identified 6 inputs from the original 13 that resulted in an ANN that best modeled the GFP fluorescence outputs of an independent test set. Values of the six inputs that resulted in a modeled maximum fluorescence were identified by applying a hybrid GA to the ANN model developed. When these conditions were tested in laboratory fermentors, an actual maximum fluorescence of 2.16E6 AU was obtained. The highest fluorescence previously observed was 1.51E6 AU. Thus, the input conditions suggested by applying the proposed optimization scheme to the available historical database increased the maximum fluorescence by 55%.
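The third step, a hybrid GA that combines stochastic search with gradient search, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function `surrogate` is a hypothetical stand-in for the trained ANN, with an arbitrary peak at (0.3, 0.7). The two-input problem, the population size, the mutation scale, and the step size are all assumptions chosen for illustration; the study itself optimized six inputs.

```python
import random


def surrogate(x):
    """Hypothetical stand-in for the trained ANN: one smooth peak at (0.3, 0.7)."""
    return 1.0 - (x[0] - 0.3) ** 2 - (x[1] - 0.7) ** 2


def clip(x, lo=0.0, hi=1.0):
    """Keep every input inside its assumed (normalized) operating range."""
    return [min(hi, max(lo, v)) for v in x]


def numeric_gradient(f, x, h=1e-5):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def gradient_ascent(f, x, steps=20, rate=0.1):
    """Gradient part of the hybrid: local refinement of one candidate."""
    for _ in range(steps):
        g = numeric_gradient(f, x)
        x = clip([xi + rate * gi for xi, gi in zip(x, g)])
    return x


def hybrid_ga(f, dim, pop_size=20, generations=30, seed=0):
    """Stochastic part (selection, crossover, mutation) plus gradient refinement."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [v + rng.gauss(0, 0.05) for v in child]   # Gaussian mutation
            children.append(clip(child))
        parents[0] = gradient_ascent(f, parents[0])  # refine the current best
        pop = parents + children
    return max(pop, key=f)


best = hybrid_ga(surrogate, dim=2)
print("best inputs:", best, "modeled output:", surrogate(best))
```

Refining only the best individual in each generation keeps the number of model evaluations small while the rest of the population continues to explore. Other hybrid schemes refine every individual or apply the gradient search only at the end.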

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Bioreactors / microbiology*
  • Cell Culture Techniques / methods
  • Databases, Factual*
  • Escherichia coli / growth & development*
  • Escherichia coli / metabolism*
  • Expert Systems*
  • Feedback / physiology
  • Fermentation / physiology
  • Green Fluorescent Proteins
  • Information Storage and Retrieval / methods
  • Luminescent Proteins / biosynthesis*
  • Models, Biological*
  • Neural Networks, Computer
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Systems Integration

Substances

  • Luminescent Proteins
  • Green Fluorescent Proteins