Subtelomere organization in the genome of the microsporidian Encephalitozoon cuniculi: patterns of repeated sequences and physicochemical signatures

BMC Genomics. 2016 Jan 7:17:34. doi: 10.1186/s12864-015-1920-7.

Abstract

Background: The microsporidian Encephalitozoon cuniculi is an obligate intracellular eukaryotic pathogen with a small nuclear genome (2.9 Mbp) consisting of 11 chromosomes. Although each chromosome end is known to contain a single rDNA unit, the subtelomeric regions were incompletely assembled when the genome was sequenced, and only 3 of the 22 expected rDNA units were identified. Chromosome end assembly remains difficult in most eukaryotic genomes, but it is particularly important for pathogens because these regions encode factors important for virulence and host evasion.

Results: Here we report the first complete assembly of E. cuniculi chromosome ends and describe a novel mosaic structure of segmental duplications (EXT repeats) in these regions. EXT repeats range in size from 3.5 to 23.8 kbp and contain four multigene families encoding membrane-associated proteins. Twenty-one recombination sites were identified in the sub-terminal region of E. cuniculi chromosomes. Our analysis suggests that these sites contribute to the diversity of chromosome-end organization through double-strand break repair mechanisms. The region containing EXT repeats at chromosome extremities can be differentiated based on gene composition, GC content, recombination site density and chromosome landscape.

Conclusion: Together, these results provide the complete structure of the chromosome ends of E. cuniculi GB-M1 and identify important factors that could play a major role in parasite diversity and host-parasite interactions. Comparison with other eukaryotic genomes suggests that terminal regions could be distinguished precisely based on gene content, genetic instability and base composition bias. The diversity of processes associated with chromosome extremities, and their biological consequences as presented in this study, emphasizes that greater effort will be needed to characterize these regions more carefully in future whole-genome sequencing projects.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition
  • DNA, Protozoan / genetics
  • Encephalitozoon cuniculi / genetics*
  • Genome
  • Host-Parasite Interactions / genetics*
  • Multigene Family / genetics
  • Repetitive Sequences, Nucleic Acid / genetics*
  • Telomere / genetics*

Substances

  • DNA, Protozoan