The transcriptional trajectories of pluripotency and differentiation comprise genes with antithetical architecture and repetitive-element content

BMC Biol. 2021 Mar 25;19(1):60. doi: 10.1186/s12915-020-00928-8.

Abstract

Background: Extensive molecular differences exist between proliferative and differentiated cells. Here, we conduct a meta-analysis of publicly available transcriptomic datasets from preimplantation and differentiation stages examining the architectural properties and content of genes whose abundance changes significantly across developmental time points.

Results: Analysis of preimplantation embryos from human and mouse showed that short genes whose introns are enriched in Alu (human) and B (mouse) elements, respectively, have higher abundance in the blastocyst compared to the zygote. These highly expressed genes encode ribosomal proteins or metabolic enzymes. On the other hand, long genes whose introns are depleted in repetitive elements have lower abundance in the blastocyst and include genes from signaling pathways. Additionally, the sequences of the genes that are differentially expressed between the blastocyst and the zygote contain distinct collections of pyknon motifs that differ between up- and down-regulated genes. Further examination of the genes that participate in the stem cell-specific protein interaction network shows that their introns are short and enriched in Alu (human) and B (mouse) elements. As organogenesis progresses, in both human and mouse, we find that the primarily short and repeat-rich expressed genes make way for primarily longer, repeat-poor genes. With that in mind, we used a machine learning-based approach to identify gene signatures able to classify human adult tissues: we find that the most discriminatory genes comprising these signatures have long introns that are repeat-poor and include transcription factors and signaling-cascade genes. The introns of widely expressed genes across human tissues, on the other hand, are short and repeat-rich, and coincide with those with the highest expression at the blastocyst stage.

Conclusions: Protein-coding genes that are characteristic of each trajectory, i.e., proliferation/pluripotency or differentiation, exhibit antithetical biases in their intronic and exonic lengths and in their repetitive-element content. While the respective human and mouse gene signatures are functionally and evolutionarily conserved, their introns and exons are enriched or depleted in organism-specific repetitive elements. We posit that these organism-specific repetitive sequences found in exons and introns are used to effect the corresponding genes' regulation.

Keywords: Embryo development; Exon; Gene length; Genome architecture; Intron; Pyknons; Repetitive elements; Retrotransposons; Tissue specificity; Transcriptional regulation.

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Blastocyst / cytology
  • Blastocyst / metabolism
  • Cell Differentiation / genetics*
  • Gene Expression Regulation, Developmental
  • Humans
  • Mice
  • Pluripotent Stem Cells*
  • Repetitive Sequences, Nucleic Acid