Massive colonization of protein-coding exons by selfish genetic elements in Paramecium germline genomes

PLoS Biol. 2021 Jul 29;19(7):e3001309. doi: 10.1371/journal.pbio.3001309. eCollection 2021 Jul.

Abstract

Ciliates are unicellular eukaryotes with both a germline genome and a somatic genome in the same cytoplasm. The somatic macronucleus (MAC), responsible for gene expression, is not sexually transmitted but develops from a copy of the germline micronucleus (MIC) at each sexual generation. In the MIC genome of Paramecium tetraurelia, genes are interrupted by tens of thousands of unique intervening sequences called internal eliminated sequences (IESs), which have to be precisely excised during the development of the new MAC to restore functional genes. To understand the evolutionary origin of this peculiar genomic architecture, we sequenced the MIC genomes of 9 Paramecium species (from approximately 100 Mb in Paramecium aurelia species to >1.5 Gb in Paramecium caudatum). We detected several waves of IES gains, both in ancestral and in more recent lineages. While the vast majority of IESs are single copy in present-day genomes, we identified several families of mobile IESs, including nonautonomous elements acquired via horizontal transfer, which generated tens to thousands of new copies. These observations provide the first direct evidence that transposable elements can account for the massive proliferation of IESs in Paramecium. The comparison of IESs of different evolutionary ages indicates that, over time, IESs shorten and diverge rapidly in sequence while they acquire features that allow them to be more efficiently excised. We nevertheless identified rare cases of IESs that are under strong purifying selection across the aurelia clade. The cases examined contain or overlap cellular genes that are inactivated by excision during development, suggesting conserved regulatory mechanisms. Similar to the evolution of introns in eukaryotes, the evolution of Paramecium IESs highlights the major role played by selfish genetic elements in shaping the complexity of genome architecture and gene expression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Transposable Elements
  • Evolution, Molecular
  • Exons*
  • Genome, Protozoan*
  • Germ Cells*
  • Paramecium tetraurelia / genetics*
  • Protozoan Proteins / genetics*

Substances

  • DNA Transposable Elements
  • Protozoan Proteins

Grants and funding

This work was supported by the Centre National de la Recherche Scientifique (https://cnrs.fr) and by the Agence Nationale de la Recherche (https://anr.fr) (ANR-18-CE12-0005 to EM, LD, SD; ANR-19-CE12-0015 to SD, OA). It received support under the program “Investissements d’Avenir” launched by the French Government and implemented by ANR with the references ANR-10-LABX-54 MEMOLIFE and ANR-10-IDEX-0001-02 PSL Research to EM. It was supported by the Fondation de la Recherche Medicale (https://don.frm.org) (Equipe FRM DEQ20160334868) to SD and by Labex Who Am I? (http://www.labex-whoami.org/fr) (ANR-11-LABX-0071) “Initiatives d’excellence” (Idex ANR-11-IDEX-0005-02) to SD. The sequencing effort was funded by France Génomique (https://www.france-genomique.org) through involvement of the technical facilities of Genoscope (ANR-10-INBS-09-08) to SD. We acknowledge the ImagoSeine facility, a member of the France BioImaging infrastructure supported by ANR-10-INSB-04. DS and FG received a salary from the Agence Nationale de la Recherche. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.