Cross-species analysis of genic GC3 content and DNA methylation patterns

Genome Biol Evol. 2013;5(8):1443-56. doi: 10.1093/gbe/evt103.

Abstract

The GC content in the third codon position (GC(3)) exhibits a unimodal distribution in many plant and animal genomes. Interestingly, grasses and homeothermic vertebrates exhibit a unique bimodal distribution. High GC(3) was previously found to be associated with variable expression, a higher frequency of upstream TATA boxes, and an increase of GC(3) from 5' to 3'. Moreover, GC(3)-rich genes are predominant in certain gene classes and are enriched in CpG dinucleotides that are potential targets for methylation. Based on the bimodal GC(3) distribution, we hypothesize that GC(3) has a regulatory role involving methylation and gene expression. To test that hypothesis, we selected diverse taxa (rice, thale cress, bee, and human) that vary in the modality of their GC(3) distribution and tested the association between GC(3), DNA methylation, and gene expression. We examined the relationship between cytosine methylation levels and GC(3), gene expression, genome signature, gene length, and other gene compositional features. We found a strong negative correlation (Pearson's correlation coefficient r = -0.67, P value < 0.0001) between GC(3) and genic CpG methylation. The comparison between 5'-3' gradients of CG(3)-skew and genic methylation for the taxa in the study suggests an interplay between gene-body methylation and the transcription-coupled cytosine deamination effect. Compositional features are correlated with methylation levels of genes in rice, thale cress, human, bee, and fruit fly (which acts as an unmethylated control). These patterns allow us to generate evolutionary hypotheses about the relationships between GC(3) and methylation and how these affect expression patterns. Specifically, we propose that the opposite effects of methylation and compositional gradients along coding regions of GC(3)-poor and GC(3)-rich genes are the products of several competing processes.
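
As an illustration of the two quantities named above, the following minimal Python sketch computes GC(3) for an in-frame coding sequence and a Pearson correlation coefficient between per-gene GC(3) and methylation values. It is not the authors' pipeline: the function names `gc3` and `pearson_r` are hypothetical, the input is assumed to be a coding sequence that starts at codon position 1, and the sequences and methylation levels in the usage lines are made-up toy values, not data from the study.

```python
from math import sqrt


def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions.

    Assumes `cds` is an in-frame coding sequence (starts at codon
    position 1); a trailing partial codon is ignored.
    """
    # Indices 2, 5, 8, ... are the third positions of complete codons.
    third_positions = cds.upper()[2::3]
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


def pearson_r(xs, ys) -> float:
    """Pearson's correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy usage with made-up sequences and methylation levels (illustration only).
toy_genes = ["ATGGCCAAATAA", "ATGGCGTTC", "ATGAAATTTTAA"]
toy_methylation = [0.4, 0.1, 0.8]
toy_gc3 = [gc3(seq) for seq in toy_genes]
r = pearson_r(toy_gc3, toy_methylation)
```

In a real analysis, the per-gene methylation values would come from bisulfite sequencing or a comparable assay, and a significance test would accompany the coefficient.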

Keywords: Apis mellifera; Arabidopsis thaliana; DNA methylation; GC3; Homo sapiens; Oryza sativa; gene expression; grasses; homeotherms.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Arabidopsis / genetics
  • Base Composition / genetics*
  • Bees / genetics*
  • DNA Methylation*
  • Drosophila melanogaster / genetics*
  • Evolution, Molecular*
  • Gene Expression
  • Genes, Plant
  • Humans
  • Models, Genetic
  • Oryza / genetics
  • Plants / genetics*
  • Species Specificity