Evidence for Strong Mutation Bias toward, and Selection against, U Content in SARS-CoV-2: Implications for Vaccine Design

Mol Biol Evol. 2021 Jan 4;38(1):67-83. doi: 10.1093/molbev/msaa188.

Abstract

Large-scale re-engineering of synonymous sites is a promising strategy to generate vaccines either through synthesis of attenuated viruses or via codon-optimized genes in DNA vaccines. Attenuation typically relies on deoptimization of codon pairs and maximization of CpG dinucleotide frequencies. To formulate evolutionarily informed attenuation strategies that force nucleotide usage against the direction favored by selection, we examine available whole-genome sequences of SARS-CoV-2 to infer patterns of mutation and selection on synonymous sites. Analysis of mutational profiles indicates a strong mutation bias toward U. In turn, analysis of observed synonymous site composition implicates selection against U. Accounting for dinucleotide effects reinforces this conclusion: observed UU content is a quarter of that expected under neutrality. Possible mechanisms of selection against U mutations include selection for higher expression, for high mRNA stability, or for lower immunogenicity of viral genes. Consistent with gene-specific selection against CpG dinucleotides, we observe systematic differences in CpG content between SARS-CoV-2 genes. We propose an evolutionarily informed approach to attenuation that, unusually, seeks to increase usage of the already most common synonymous codons. Comparable analysis of H1N1 and Ebola finds that deviation of GC3 from neutral equilibrium is not a universal feature, cautioning against generalization of results.
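The logic of comparing observed composition with a mutational expectation can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe. It assumes a simple two-state model in which the neutral equilibrium U fraction is the rate of mutation to U divided by the sum of the rates to and from U, and it uses third codon positions as a crude proxy for synonymous sites. The function names (`u3_content`, `equilibrium_u`), the toy sequence, and the rates are hypothetical and chosen only for illustration.

```python
from collections import Counter


def third_position_counts(cds: str) -> Counter:
    """Count nucleotides at third codon positions of an in-frame coding sequence.

    Assumption: third positions stand in for synonymous sites. A real analysis
    would restrict to sites that are actually degenerate.
    """
    cds = cds.upper().replace("T", "U")   # accept DNA or RNA alphabets
    usable = len(cds) - len(cds) % 3      # ignore a trailing partial codon
    return Counter(cds[i] for i in range(2, usable, 3))


def u3_content(cds: str) -> float:
    """Observed fraction of U among unambiguous third-position nucleotides."""
    counts = third_position_counts(cds)
    total = sum(counts[base] for base in "ACGU")
    return counts["U"] / total if total else float("nan")


def equilibrium_u(rate_to_u: float, rate_from_u: float) -> float:
    """Two-state neutral equilibrium: the U fraction at which gains equal losses.

    Assumption: sites are either U or not-U, with constant per-site rates.
    """
    return rate_to_u / (rate_to_u + rate_from_u)


# Toy inputs (hypothetical): four codons AUG GCU UUU GGA, and a mutation
# process that creates U three times faster than it removes it.
observed = u3_content("ATGGCTTTTGGA")
expected = equilibrium_u(rate_to_u=3.0, rate_from_u=1.0)
print(f"observed U3 = {observed:.2f}, neutral expectation = {expected:.2f}")
```

Under these assumptions, an observed U fraction below the mutational expectation is the pattern the abstract interprets as selection against U; an observed fraction equal to the expectation would be consistent with neutrality.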

Keywords: SARS-CoV-2; mutation equilibrium; selection; synonymous mutations; vaccine design; viral attenuation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 / genetics*
  • COVID-19 / prevention & control
  • COVID-19 Vaccines / genetics*
  • Genome, Viral*
  • Humans
  • Mutation*
  • RNA Stability / genetics
  • RNA, Messenger / genetics
  • RNA, Viral / genetics
  • SARS-CoV-2 / genetics*
  • Selection, Genetic*
  • Uracil

Substances

  • COVID-19 Vaccines
  • RNA, Messenger
  • RNA, Viral
  • Uracil