Design and deep learning of synthetic B-cell-specific promoters

Nucleic Acids Res. 2023 Nov 27;51(21):11967-11979. doi: 10.1093/nar/gkad930.

Abstract

Synthetic biology and deep learning together are transforming our ability to decode and recode DNA regulatory grammar. B-cell-specific transcriptional regulation is intricate, and unlocking the potential of B-cell-specific promoters as synthetic elements is important for B-cell engineering. Here, we designed and pool-synthesized 23 640 B-cell-specific promoters that cover a larger sequence space, show B-cell-specific expression, and enable diverse transcriptional patterns in B cells. Using massively parallel reporter assays (MPRA), we deciphered the sequence features that regulate promoter transcription, including motifs and motif syntax (their combination and distance). Finally, we built and trained a deep learning model capable of predicting the transcriptional strength of immunoglobulin V gene promoters directly from sequence. Predictions for thousands of promoter variants identified in the global human population show that polymorphisms in promoters influence the transcription of immunoglobulin V genes, which may contribute to individual differences in adaptive humoral immune responses. Our work helps to decipher the transcription mechanism in immunoglobulin genes and offers thousands of non-similar promoters for B-cell engineering.
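
As an illustration of what "motif syntax (their combination and distance)" could mean in practice, the following is a minimal sketch, not the authors' pipeline. It checks whether two motifs co-occur in a promoter sequence and reports the start-to-start distance between each pair of occurrences. The two motif strings, the toy promoter, and the function names are assumptions chosen for illustration only; none of them is taken from the study.

```python
def find_motif(seq, motif):
    """Return the 0-based start positions of every match, overlaps included."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def motif_syntax(seq, motif_a, motif_b):
    """Describe the combination and spacing of two motifs in one sequence.

    Returns (has_both, distances), where distances lists the start-to-start
    offset (motif_b start minus motif_a start) for every pair of occurrences.
    """
    pos_a = find_motif(seq, motif_a)
    pos_b = find_motif(seq, motif_b)
    distances = [b - a for a in pos_a for b in pos_b]
    return bool(pos_a and pos_b), distances


# Hypothetical motifs and a toy promoter, used only to exercise the functions.
MOTIF_A = "ATGCAAAT"
MOTIF_B = "TATAAA"
promoter = "GGC" + MOTIF_A + "CCGGCCGGCCGG" + MOTIF_B + "GGC"

has_both, distances = motif_syntax(promoter, MOTIF_A, MOTIF_B)
print(has_both, distances)
```

In a real analysis, exact string matching would likely be replaced by position weight matrix scanning, and features of this kind would be related to the measured reporter activity of each promoter.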

MeSH terms

  • Animals
  • DNA / genetics
  • Deep Learning*
  • Gene Expression Regulation
  • Humans
  • Immunoglobulin Variable Region / genetics
  • Mice
  • Promoter Regions, Genetic

Substances

  • DNA
  • Immunoglobulin Variable Region