A computational learning paradigm to targeted discovery of biocatalysts from metagenomic data: A case study of lipase identification

Biotechnol Bioeng. 2022 Apr;119(4):1115-1128. doi: 10.1002/bit.28037. Epub 2022 Feb 3.

Abstract

The growing adoption of enzymes as biocatalysts across industries has increased the demand for access to their great natural diversity. Meanwhile, advances in metagenomics and high-throughput sequencing technologies offer an unprecedented opportunity to explore this extensive resource. Lipases, enzymes responsible for the biological turnover of lipids, are among the most commercialized biocatalysts, with numerous applications in different domains, and are therefore of high industrial value. The relatively costly and time-consuming wet-lab experimental pipelines commonly used for novel enzyme discovery highlight the need for agile in silico approaches that keep pace with the exponential growth of available sequencing data. In the present study, an in-depth analysis of a tannery wastewater metagenome, including taxonomic and enzymatic profiling, was performed. The metagenomic data set was screened for lipolytic enzymes using sequence homology-based screening methods and supervised machine learning-based regression models that predict the pH and temperature optima of lipases, which led to the isolation of a novel alkaline and highly thermophilic lipase. Moreover, the MeTarEnz (metagenomic targeted enzyme miner) software was developed as part of this study and made freely accessible at https://cbb.ut.ac.ir/MeTarEnz. MeTarEnz offers several functions to automate targeted enzyme mining from high-throughput sequencing data. This study highlights the competence of computational approaches in exploring the vast biodiversity within environmental niches, while providing a set of practical in silico tools and a generalized methodology to facilitate the sequence-based mining of biocatalysts.
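
The idea of predicting an enzyme's temperature or pH optimum from its sequence alone can be illustrated with a minimal sketch. The abstract does not specify the features or regression models used, so everything below is an assumption made for illustration: amino-acid composition stands in for the feature set, a k-nearest-neighbor average stands in for the regression model, and the sequences and optimum values are invented placeholders rather than real lipase data. The function names (`composition`, `knn_predict`) are hypothetical and are not part of MeTarEnz.

```python
from collections import Counter
import math

# The 20 standard amino acids, used as a fixed feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino-acid frequency vector of a protein sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(train, query_seq, k=3):
    """Predict a numeric optimum as the mean label of the k nearest training sequences.

    `train` is a list of (sequence, optimum) pairs. This is a stand-in for a
    supervised regression model, not the model used in the study.
    """
    query = composition(query_seq)
    ranked = sorted(train, key=lambda item: euclidean(composition(item[0]), query))
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Placeholder sequences and temperature optima (degrees C), invented for illustration only.
toy_train = [
    ("MKKLLAAGGSSHG", 37.0),
    ("MRRELLKKEEVVG", 70.0),
    ("MAAAGGSSTTNNQ", 30.0),
    ("MKEELRRKKEEIV", 75.0),
]

predicted = knn_predict(toy_train, "MKRELLKKEEVIG", k=2)
print(f"predicted temperature optimum: {predicted:.1f}")
```

In a screening setting, a predictor of this kind would be applied to candidate sequences that pass a homology filter, and candidates would be kept only if the predicted optima fall in the target range (for example, alkaline pH and high temperature).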

Keywords: lipase; machine learning; metagenomics; sequence-based; targeted biocatalyst discovery.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • High-Throughput Nucleotide Sequencing / methods
  • Lipase / chemistry
  • Lipase / genetics
  • Metagenome*
  • Metagenomics* / methods
  • Temperature

Substances

  • Lipase