Gene Fusions Derived by Transcriptional Readthrough are Driven by Segmental Duplication in Human

Genome Biol Evol. 2019 Sep 1;11(9):2678-2690. doi: 10.1093/gbe/evz163.

Abstract

Gene fusion occurs when two or more individual genes with independent open reading frames become juxtaposed under the same open reading frame, creating a new fused gene. A small number of gene fusions described in detail have been associated with novel functions, for example, the hominid-specific PIPSL gene, TNFSF12, and the TWE-PRIL gene family. We use Sequence Similarity Networks and species-level comparisons of great ape genomes to identify 45 new genes that have emerged by transcriptional readthrough, that is, transcription-derived gene fusion. For 35 of these putative gene fusions, we have been able to assess available RNAseq data to determine whether there are reads that map to each breakpoint. A total of 29 of the putative gene fusions had annotated transcripts (9/29 of which are human-specific). We carried out RT-qPCR in a range of human tissues (placenta, lung, liver, brain, and testes) and found that 23 of the putative gene fusion events were expressed in at least one tissue. Examining the available ribosome footprinting data, we find evidence for translation of three of the fused genes in human. Finally, we find enrichment for transcription-derived gene fusions in regions of known segmental duplication in human. Together, our results implicate chromosomal structural variation brought about by segmental duplication in the emergence of novel transcripts and translated protein products.
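
The way a sequence similarity network can flag candidate fusions may be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that similarity hits are already available as tuples of (gene, partner, start, end), where start and end give the aligned interval on the first gene. The function name, the tuple layout, and the toy gene names are all hypothetical. A gene is reported as a candidate composite when it matches two partners on non-overlapping regions and those two partners share no similarity with each other.

```python
from itertools import combinations


def find_composite_candidates(hits, max_overlap=0):
    """Return (composite, partner_1, partner_2) triples from similarity hits.

    hits: iterable of (gene, partner, start, end); start/end are 1-based,
    inclusive coordinates of the aligned region on `gene` (assumed layout).
    """
    hits = list(hits)
    # Unordered gene pairs that share any significant similarity.
    similar = {frozenset((g, p)) for g, p, _, _ in hits}

    # Group the aligned intervals by the gene they lie on.
    by_gene = {}
    for g, p, s, e in hits:
        by_gene.setdefault(g, []).append((p, s, e))

    candidates = set()
    for gene, partners in by_gene.items():
        for (a, a_s, a_e), (b, b_s, b_e) in combinations(partners, 2):
            if a == b:
                continue
            # Length of the shared region between the two aligned intervals.
            overlap = min(a_e, b_e) - max(a_s, b_s) + 1
            if overlap > max_overlap:
                continue
            # Component genes must not resemble each other directly.
            if frozenset((a, b)) in similar:
                continue
            candidates.add((gene, *sorted((a, b))))
    return sorted(candidates)


# Toy input: fusedAB matches geneA and geneB on separate regions.
toy_hits = [
    ("fusedAB", "geneA", 1, 300),
    ("geneA", "fusedAB", 1, 300),
    ("fusedAB", "geneB", 320, 600),
    ("geneB", "fusedAB", 1, 280),
    ("geneC", "geneD", 1, 200),
    ("geneD", "geneC", 1, 200),
]
print(find_composite_candidates(toy_hits))
```

Candidates found this way would still need the kinds of support described in the abstract, such as RNAseq reads spanning the breakpoint and expression evidence, before being treated as readthrough fusions.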

Keywords: great ape comparative genomics; mechanisms of protein-coding evolution; novel genes; segmental duplication; sequence similarity networks; transcriptional readthrough.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Evolution, Molecular*
  • Gene Fusion*
  • Humans
  • Mice
  • Nucleotide Motifs
  • Phylogeny
  • Primates / genetics
  • Protein Biosynthesis
  • RNA Splice Sites
  • Recombination, Genetic
  • Reverse Transcriptase Polymerase Chain Reaction
  • Segmental Duplications, Genomic*

Substances

  • RNA Splice Sites