A sequence-based analysis of the pointer distribution of stichotrichous ciliates

Biosystems. 2010 Aug;101(2):109-16. doi: 10.1016/j.biosystems.2010.05.003. Epub 2010 May 27.

Abstract

Micronuclear genes in stichotrichous ciliates are broken into coding blocks separated by noncoding sequences; the blocks are sometimes in shuffled order, and some are even inverted. During reproduction, all blocks are assembled in the correct order and orientation. This process is possible because of the special structure of micronuclear genes: each coding block M ends with a short nucleotide sequence (called a pointer) that is repeated at the beginning of the coding block that should follow M in the assembled gene. Many pointers occur multiple times along both strands of the gene, which yields a very large number of possible pointer-induced divisions into coding and noncoding blocks. We investigate the distribution of pointers in all currently sequenced micronuclear ciliate genes, with the goal of identifying what distinguishes the real gene structure among all possible coding/noncoding divisions. We find a sharp criterion in the average a/t-content of the noncoding blocks: in most cases, the real division has the maximum such content among all possible combinations. Even for pointers as short as two nucleotides, the real division is one of very few in which the average a/t-content of the noncoding blocks exceeds 80%. The separation is clearest when the loci of pointers of up to four nucleotides (even three in the case of unscrambled genes) are fixed (e.g., through a template-based recombination mechanism).
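
The two quantities the abstract relies on, pointer occurrences on both strands and the average a/t-content of the noncoding blocks of a candidate division, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the half-open `(start, end)` representation of coding blocks, the choice to count only the gaps between consecutive coding blocks as noncoding, and the choice to average per-block fractions (rather than pooling nucleotides) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Illustrative sketch only: names and conventions below are assumptions,
# not taken from the paper.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def pointer_occurrences(seq: str, pointer: str) -> List[Tuple[int, str]]:
    """List (start, strand) for every occurrence of `pointer` in `seq`.

    Strand "+" means the pointer itself matches; strand "-" means its
    reverse complement matches at that position of the given strand.
    Overlapping matches are included. A pointer equal to its own reverse
    complement is reported once per strand at each position.
    """
    hits = []
    for strand, pattern in (("+", pointer), ("-", reverse_complement(pointer))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def at_content(block: str) -> float:
    """Fraction of a and t nucleotides in a non-empty block."""
    if not block:
        raise ValueError("block must be non-empty")
    lowered = block.lower()
    return (lowered.count("a") + lowered.count("t")) / len(lowered)


def noncoding_blocks(seq: str, coding: List[Tuple[int, int]]) -> List[str]:
    """Return the gaps between consecutive coding blocks.

    `coding` holds half-open (start, end) intervals (an assumed convention).
    Sequence flanking the first and last coding block is ignored here.
    """
    ordered = sorted(coding)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:
            gaps.append(seq[prev_end:next_start])
    return gaps


def mean_noncoding_at(seq: str, coding: List[Tuple[int, int]]) -> float:
    """Average the per-block a/t fractions over all noncoding gaps."""
    gaps = noncoding_blocks(seq, coding)
    if not gaps:
        raise ValueError("division has no noncoding blocks")
    return sum(at_content(gap) for gap in gaps) / len(gaps)
```

Under these assumptions, candidate divisions of the same gene could be compared with something like `max(candidates, key=lambda c: mean_noncoding_at(seq, c))`, where `candidates` is a hypothetical list of coding-block interval lists derived from the pointer occurrences.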

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Actins / genetics*
  • Algorithms
  • Amino Acyl-tRNA Synthetases / genetics*
  • Animals
  • Base Composition
  • Base Sequence
  • Ciliophora / genetics*
  • Computational Biology
  • DNA Polymerase I / genetics*
  • Micronucleus, Germline / genetics*
  • Models, Genetic*
  • Molecular Sequence Data
  • Species Specificity
  • Telomere-Binding Proteins / genetics*
  • Terminal Repeat Sequences / genetics

Substances

  • Actins
  • Telomere-Binding Proteins
  • DNA Polymerase I
  • Amino Acyl-tRNA Synthetases