Back to the roots of a new exon--the molecular archaeology of a SP100 splice variant

Genomics. 2000 Jan 1;63(1):117-22. doi: 10.1006/geno.1999.6008.

Abstract

Retropseudogenes are intronless DNA sequences that share a high degree of homology with the cDNA of their corresponding active genes. They are thought to originate by reverse transcription of messenger RNA and reintegration of the resulting cDNA into the genome. Usually considered a type of evolutionary waste, they either melt into the background of their surrounding DNA as they lose similarity to the active gene or disappear from the genome through the accumulation of deletions. In this paper, by contrast, we describe the evolutionary recycling of this genomic waste. Recently, a splice variant of the gene encoding the nuclear protein SP100 was identified in which the 3' part of the cDNA is replaced by an alternative exon apparently encoding an HMG1 DNA-binding domain. We show that this HMG box is contributed by a new exon arising from an HMG1 retropseudogene, which we have characterized in detail at the molecular level. In addition to human cells, corresponding fusion transcripts were detected in Pan troglodytes, Gorilla gorilla, and Hylobates lar, but not in Macaca mulatta. From genomic DNA of M. mulatta, we were able to amplify by PCR the 5' part but not the 3' part of the HMG1 retropseudogene. From our data we can thus date the underlying retrotransposition to more than 35 million years ago. Our findings offer a model for how new exons may arise during evolution. To our knowledge, this is the first example of a retropseudogene becoming part of an active gene in which both parental parts are well characterized and remain in-frame with their cDNA.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Amino Acid Sequence
  • Animals
  • Antigens, Nuclear*
  • Autoantigens / genetics*
  • Autoantigens / metabolism
  • Base Sequence
  • Blotting, Northern
  • Cells, Cultured
  • Evolution, Molecular
  • Exons*
  • Genetic Variation
  • HeLa Cells
  • High Mobility Group Proteins / genetics*
  • High Mobility Group Proteins / metabolism
  • Hominidae
  • Humans
  • Hylobates
  • Macaca mulatta
  • Molecular Sequence Data
  • Nuclear Proteins / genetics*
  • Nuclear Proteins / metabolism
  • Polymerase Chain Reaction
  • Primates / genetics*
  • Pseudogenes*
  • Retroelements*

Substances

  • Antigens, Nuclear
  • Autoantigens
  • High Mobility Group Proteins
  • Nuclear Proteins
  • Retroelements
  • SP100 protein, human

Associated data

  • GENBANK/AF076675
  • GENBANK/AF146342