Controlling gene expression with deep generative design of regulatory DNA

Nat Commun. 2022 Aug 30;13(1):5099. doi: 10.1038/s41467-022-32818-8.

Abstract

Design of de novo synthetic regulatory DNA is a promising avenue to control gene expression in biotechnology and medicine. Mutagenesis-based design typically requires screening sizable random DNA libraries, which limits designs to a short section of the promoter and restricts their control of gene expression. Here, we prototype a deep learning strategy based on generative adversarial networks (GANs) that learns directly from genomic and transcriptomic data. Our ExpressionGAN can traverse the entire regulatory sequence-expression landscape in a gene-specific manner, generating regulatory DNA with prespecified target mRNA levels spanning the whole gene regulatory structure, including coding and adjacent non-coding regions. Despite high sequence divergence from natural DNA, in vivo measurements show that 57% of the highly expressed synthetic sequences surpass the expression levels of highly expressed natural controls. This demonstrates the applicability and relevance of deep generative design to expand our knowledge and control of gene expression regulation in any desired organism, condition or tissue.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA / genetics
  • Gene Expression
  • Gene Expression Regulation
  • Genome*
  • Genomics*

Substances

  • DNA