Frequency of intron loss correlates with processed pseudogene abundance: a novel strategy to test the reverse transcriptase model of intron loss

BMC Biol. 2013 Mar 5;11:23. doi: 10.1186/1741-7007-11-23.

Abstract

Background: Although intron loss in evolution has been described, the mechanism involved is still unclear. Three models have been proposed: the reverse transcriptase (RT) model, the genomic deletion model, and the double-strand-break repair model. The RT model, also termed mRNA-mediated intron loss, suggests that cDNA molecules reverse transcribed from spliced mRNA recombine with genomic DNA, causing intron loss. Many studies have attempted to test this model based on its predictions, such as simultaneous loss of adjacent introns, 3'-side bias of intron loss, and germline expression of intron-lost genes. Evidence both supporting and opposing the model has been reported. The mechanism of intron loss proposed in the RT model shares the process of reverse transcription with the formation of processed pseudogenes. If the RT model is correct, genes that have produced more processed pseudogenes should be more likely to undergo intron loss.
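The 3'-side bias prediction can be illustrated with a small sketch. This is not the authors' method: the position formula, the function names, and the input layout are assumptions made for illustration only. The idea is to place each intron on a 0 (5' end) to 1 (3' end) scale and compare the mean position of lost introns with that of retained introns; under the RT model, lost introns would be expected to sit closer to 1.

```python
def relative_position(intron_index, intron_count):
    """Position of an intron along its gene: near 0 at the 5' end, near 1 at the 3' end.

    Assumption: introns are numbered 1..intron_count from the 5' end, and the
    midpoint formula (i - 0.5) / n is one arbitrary but symmetric choice.
    """
    if not 1 <= intron_index <= intron_count:
        raise ValueError("intron_index must be between 1 and intron_count")
    return (intron_index - 0.5) / intron_count


def mean_positions(genes):
    """Return (mean position of lost introns, mean position of retained introns).

    genes: iterable of (intron_count, set_of_lost_intron_indices) pairs
    (a hypothetical input layout). A mean is None when its group is empty.
    """
    lost, kept = [], []
    for intron_count, lost_indices in genes:
        for i in range(1, intron_count + 1):
            target = lost if i in lost_indices else kept
            target.append(relative_position(i, intron_count))

    def mean(xs):
        return sum(xs) / len(xs) if xs else None

    return mean(lost), mean(kept)
```

A mean lost-intron position well above the mean retained-intron position would be consistent with a 3'-side bias, although a real analysis would also need a significance test and a control for gene structure.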

Results: In the present study, by analyzing a new dataset of intron loss obtained in mice and rats, we observed that the frequency of intron loss is correlated with processed pseudogene abundance. Furthermore, we found that mRNA molecules of intron-lost genes are mostly translated on free cytoplasmic ribosomes, a feature shared by mRNA molecules of the parental genes of processed pseudogenes and long interspersed elements. This feature likely facilitates reverse transcription of the mRNA molecules of intron-lost genes. Analyses of adjacent intron loss, 3'-side bias of intron loss, and germline expression of intron-lost genes also support the RT model.
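The correlation described above could be explored along the following lines. This is a minimal sketch under stated assumptions, not the analysis used in the study: the abstract does not name a statistical test, so Spearman's rank correlation is swapped in as one plausible choice, and the function names and the `(pseudogene_count, introns_examined, introns_lost)` record layout are hypothetical. Genes are grouped by the number of processed pseudogenes they have produced, the intron loss frequency is computed per group, and the two series are then rank-correlated.

```python
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def loss_frequency_by_pseudogene_count(genes):
    """Group genes by processed pseudogene count and pool intron loss per group.

    genes: iterable of (pseudogene_count, introns_examined, introns_lost),
    a hypothetical record layout with introns_examined > 0.
    Returns (sorted pseudogene counts, lost/examined frequency per count).
    """
    examined = defaultdict(int)
    lost = defaultdict(int)
    for count, n_introns, n_lost in genes:
        examined[count] += n_introns
        lost[count] += n_lost
    counts = sorted(examined)
    return counts, [lost[c] / examined[c] for c in counts]
```

With real data, `spearman(counts, freqs)` close to 1 would indicate that intron loss frequency rises with processed pseudogene abundance; the significance of the coefficient would still need to be assessed separately, for example by permutation.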

Conclusions: Compared with previous evidence, the correlation between the abundance of processed pseudogenes and intron loss frequency more directly supports the RT model of intron loss. Exploring such a correlation is a new strategy to test the RT model in organisms with abundant processed pseudogenes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Conserved Sequence
  • Gene Expression Regulation
  • Genome / genetics
  • Introns / genetics*
  • Long Interspersed Nucleotide Elements / genetics
  • Mammals / genetics
  • Mice
  • Models, Genetic*
  • Phylogeny
  • Protein Sorting Signals / genetics
  • Protein Structure, Tertiary
  • Proteins / genetics
  • Proteins / metabolism
  • Pseudogenes / genetics*
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • RNA-Directed DNA Polymerase / metabolism*
  • Rats
  • Sequence Deletion
  • Solubility

Substances

  • Protein Sorting Signals
  • Proteins
  • RNA, Messenger
  • RNA-Directed DNA Polymerase