Multiomics Data Collection, Visualization, and Utilization for Guiding Metabolic Engineering

Somtirtha Roy; Tijana Radivojevic; Mark Forrer; Jose Manuel Marti; Vamshi Jonnalagadda; Tyler Backman; William Morrell; Hector Plahar; Joonhoon Kim; Nathan Hillson; Hector Garcia Martin

doi:10.3389/fbioe.2021.612893

Front Bioeng Biotechnol. 2021 Feb 9:9:612893. doi: 10.3389/fbioe.2021.612893. eCollection 2021.

Authors

Somtirtha Roy^{1,2}, Tijana Radivojevic^{1,2,3}, Mark Forrer^{2,3,4}, Jose Manuel Marti^{1,2,3}, Vamshi Jonnalagadda^{1,2}, Tyler Backman^{1,3}, William Morrell^{2,3,4}, Hector Plahar^{1,2}, Joonhoon Kim^{3,5}, Nathan Hillson^{1,2,3}, Hector Garcia Martin^{1,2,3,6}

Affiliations

¹ Lawrence Berkeley National Laboratory, Biological Systems and Engineering Division, Berkeley, CA, United States.
² Department of Energy, Agile BioFoundry, Emeryville, CA, United States.
³ Joint BioEnergy Institute, Emeryville, CA, United States.
⁴ Sandia National Laboratories, Biomaterials and Biomanufacturing, Livermore, CA, United States.
⁵ Chemical and Biological Processes Development Group, Pacific Northwest National Laboratory, Richland, WA, United States.
⁶ BCAM, Basque Center for Applied Mathematics, Bilbao, Spain.

Abstract

Biology has changed radically in the past two decades, growing from a purely descriptive science into one that is also a design science. The availability of tools that enable the precise modification of cells, together with the ability to collect large amounts of multimodal data, opens the possibility of sophisticated bioengineering to produce fuels, specialty and commodity chemicals, materials, and other renewable bioproducts. However, despite new tools and exponentially increasing data volumes, synthetic biology cannot yet fulfill its true potential due to our inability to predict the behavior of biological systems. Here, we showcase a set of computational tools that, combined, provide the ability to store, visualize, and leverage multiomics data to predict the outcome of bioengineering efforts. We show how to upload, visualize, and output multiomics data, as well as strain information, into online repositories for several isoprenol-producing strain designs. We then use these data to train machine learning algorithms that recommend new strain designs that are correctly predicted to improve isoprenol production by 23%. This demonstration uses synthetic data, provided by a novel library that can produce credible multiomics data for testing algorithms and computational tools. In short, this paper provides a step-by-step tutorial on leveraging these computational tools to improve production in bioengineered strains.

Keywords: biofuels; flux analysis; machine learning; metabolic engineering; multiomics analysis; synthetic biology.