Bias at the third nucleotide of codon pairs in virus and host genomes

Sci Rep. 2022 Mar 16;12(1):4522. doi: 10.1038/s41598-022-08570-w.

Abstract

Genomes of different sizes and complexity can be compared using common features. Most genomes contain open reading frames, and most genomes use the same genetic code. Redundancy in the genetic code means that different biases in the third nucleotide position of a codon exist in different genomes. However, the nucleotide composition of viruses can be quite different from that of their hosts, making it difficult to assess the relevance of these biases. Here we show that grouping the codons of a codon pair according to the GC content of the first two nucleotide positions of each codon reveals patterns in nucleotide usage at the third position of the first codon. Differences between the observed and expected biases occur predominantly when the first two nucleotides of the second codon are both S (strong, G or C) or both W (weak, A or T), rather than a mixture of strong and weak. The data indicate that some codon pairs are preferred because of the strength of the interactions between the codon and anticodon, the adjacent tRNAs and the ribosome. Using base-pairing strength and third position bias facilitates the comparison of genomes of different sizes and nucleotide compositions and reveals patterns not previously described.
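
The grouping described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `sw_class` and `third_position_counts`, the decision to skip codons containing ambiguous bases, and the inclusion of the stop codon are all assumptions made for illustration. The sketch labels the first two nucleotides of each codon as S (G or C) or W (A or T), then tallies the third nucleotide of the first codon for every adjacent codon pair, keyed by the S/W classes of both codons. The calculation of expected biases is not specified in the abstract and is left out.

```python
from collections import Counter, defaultdict

STRONG = set("GC")  # S: G or C; any other unambiguous base (A or T) is W


def sw_class(dinucleotide):
    """Label each of the first two codon positions as S or W, e.g. 'GC' -> 'SS'."""
    return "".join("S" if base in STRONG else "W" for base in dinucleotide)


def third_position_counts(orf):
    """Tally the third nucleotide of the first codon in each adjacent codon pair.

    Keys are (class of first codon, class of second codon), where a class is
    one of 'SS', 'WW', 'SW' or 'WS'. Hypothetical helper, for illustration only.
    """
    orf = orf.upper().replace("U", "T")
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]

    counts = defaultdict(Counter)
    for first, second in zip(codons, codons[1:]):
        # Assumption: pairs with ambiguous bases (e.g. N) are skipped.
        if set(first + second) - set("ACGT"):
            continue
        key = (sw_class(first[:2]), sw_class(second[:2]))
        counts[key][first[2]] += 1
    return counts


if __name__ == "__main__":
    # Toy open reading frame, not real data: ATG GCC AAA TGA
    for key, tally in third_position_counts("ATGGCCAAATGA").items():
        print(key, dict(tally))
```

Summing such tallies over all open reading frames of a genome would give, for each class of second codon, the observed third-position usage of the preceding codon, which is the quantity the abstract compares against an expected bias.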

MeSH terms

  • Bias
  • Codon / genetics
  • DNA Viruses / genetics
  • Genetic Code*
  • Nucleotides* / genetics

Substances

  • Codon
  • Nucleotides