History-Driven Genetic Modification Design Technique Using a Domain-Specific Lexical Model for the Acceleration of DBTL Cycles for Microbial Cell Factories

ACS Synth Biol. 2021 Sep 17;10(9):2308-2317. doi: 10.1021/acssynbio.1c00234. Epub 2021 Aug 5.

Abstract

The development of microbes for bioprocessing via synthetic biology involves design-build-test-learn (DBTL) cycles. To aid the design step, we developed a computational technique that suggests the next genetic modifications on the basis of their relatedness to the user's design history, that is, the genetic modifications accumulated through the user's earlier DBTL cycles. This technique, which comprehensively retrieves well-known designs related to the history, searches the text of previous literature and then mines genes that frequently co-occur in that literature with the modified genes. We further developed a domain-specific lexical model that gives greater weight to literature more closely related to metabolic engineering, thereby emphasizing genes modified for bioprocessing. We applied the technique to the history of creating a shikimic acid-producing Corynebacterium glutamicum strain that had 18 genetic modifications. Inspired by the resulting suggestion, biologists considered eight genes for further modification, and modifying four of these genes proved experimentally effective in increasing the production of shikimic acid. These results indicate that the proposed technique successfully used the earlier cycles to suggest relevant designs that biologists considered worth testing. Comprehensive retrieval of well-tested designs will help less-experienced researchers overcome the entry barrier and inspire experienced researchers to formulate design concepts that have been overlooked or suspended. By feeding histories back into the next genetic design, this technique will aid DBTL cycles and complement the design step.
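The retrieval idea in the abstract (find literature that mentions genes from the design history, weight each document by how closely it relates to metabolic engineering, then rank the other genes that co-occur in it) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the lexical model, so a simple fraction of domain terms stands in for it, and the names `DOMAIN_TERMS`, `domain_weight`, `suggest_genes`, the placeholder gene names, and the toy documents are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical domain vocabulary. The real lexical model is not described
# in the abstract; this set is only a stand-in.
DOMAIN_TERMS = {"production", "titer", "yield", "overexpression",
                "deletion", "fermentation", "strain"}


def domain_weight(text):
    """Return the fraction of tokens in `text` that are domain terms."""
    tokens = [t.strip(".,;()") for t in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(t in DOMAIN_TERMS for t in tokens) / len(tokens)


def suggest_genes(history, documents, top_n=5):
    """Rank genes that co-occur with history genes, weighted by domain relevance.

    `documents` is a list of dicts with a "text" string and a "genes" list
    (gene mentions are assumed to be extracted already).
    """
    history = set(history)
    scores = defaultdict(float)
    for doc in documents:
        genes = set(doc["genes"])
        hits = genes & history
        if not hits:
            continue  # document is unrelated to the design history
        weight = domain_weight(doc["text"])
        for gene in genes - history:
            # One weighted co-occurrence per history gene found in the document.
            scores[gene] += weight * len(hits)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy data with placeholder gene names (invented, not from the paper).
toy_history = ["geneA", "geneB"]
toy_docs = [
    {"text": "Overexpression of geneA and geneC improved production yield",
     "genes": ["geneA", "geneC"]},
    {"text": "geneA and geneD were observed in a clinical isolate",
     "genes": ["geneA", "geneD"]},
    {"text": "Deletion of geneB with geneC raised titer in the strain",
     "genes": ["geneB", "geneC"]},
]

print(suggest_genes(toy_history, toy_docs))
```

In this toy run, the placeholder gene that co-occurs with history genes in bioprocessing-flavored text ranks above the one that appears only in an unrelated context, which is the effect the domain weighting is meant to achieve. A real system would need gene-name recognition and a literature search backend, neither of which is sketched here.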

Keywords: DBTL cycle; bioinformatics; genetic modification design; microbial cell factories; similarity search; text mining.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Corynebacterium glutamicum / genetics*
  • Corynebacterium glutamicum / metabolism
  • Glucose / metabolism
  • Metabolic Engineering / methods
  • Metabolic Networks and Pathways / genetics
  • Multigene Family
  • Research Design
  • Shikimic Acid / metabolism
  • Synthetic Biology / methods*

Substances

  • Shikimic Acid
  • Glucose