Causes and Consequences of Purifying Selection on SARS-CoV-2

Genome Biol Evol. 2021 Oct 1;13(10):evab196. doi: 10.1093/gbe/evab196.

Abstract

Owing to a lag between a deleterious mutation's appearance and its selective removal, gold-standard methods for mutation rate estimation assume no meaningful loss of mutations between parents and offspring. Indeed, from analysis of closely related lineages, the Ka/Ks ratio in SARS-CoV-2 was previously estimated as 1.008, suggesting no within-host selection. By contrast, we find a higher number of observed SNPs at 4-fold degenerate sites than elsewhere and, allowing for the virus's complex mutational and compositional biases, estimate that the mutation rate is at least 49-67% higher than would be estimated based on the rate of appearance of variants in sampled genomes. Given the high Ka/Ks, one might assume that the majority of such intrahost selection is the purging of nonsense mutations. However, we estimate that selection against nonsense mutations accounts for only ∼10% of all the "missing" mutations. Instead, classical protein-level selective filters (against chemically disparate amino acids and those predicted to disrupt protein functionality) account for many missing mutations. It is less obvious why, for an intracellular parasite, amino acid cost parameters, notably amino acid decay rate, are also significant. Perhaps most surprisingly, we also find evidence for real-time selection against synonymous mutations that move codon usage away from that of humans. We conclude that there is common intrahost selection on SARS-CoV-2 that acts on nonsense, missense, and possibly synonymous mutations. This has implications for methods of mutation rate estimation, for determining times to common ancestry, and for the potential for intrahost evolution, including vaccine escape.
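The comparison above rests on 4-fold degenerate sites, that is, third codon positions at which any of the four nucleotides encodes the same amino acid. As an illustration only, and not the authors' pipeline, the following minimal sketch locates such sites in an in-frame coding sequence using the standard genetic code. The function names `fourfold_prefixes` and `fourfold_site_positions` are hypothetical, and the sketch assumes the input starts at codon position 1 and contains no frameshifts.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_prefixes(table=CODON_TABLE):
    """Return the two-base codon prefixes whose third base is 4-fold degenerate."""
    prefixes = []
    for b1, b2 in product(BASES, repeat=2):
        # A prefix qualifies if all four third bases give one amino acid.
        encoded = {table[b1 + b2 + b3] for b3 in BASES}
        if len(encoded) == 1:
            prefixes.append(b1 + b2)
    return prefixes


def fourfold_site_positions(cds):
    """Return 0-based positions of 4-fold degenerate sites in an in-frame CDS.

    Assumption (illustrative): `cds` begins at codon position 1. Codons with
    ambiguous bases simply never match a 4-fold prefix and are skipped.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
    ff = set(fourfold_prefixes())
    return [i + 2 for i in range(0, len(cds) - 2, 3) if cds[i:i + 2] in ff]


if __name__ == "__main__":
    print(fourfold_prefixes())
    print(fourfold_site_positions("ATGGCTTTTGGA"))
```

Under the standard code, eight codon families qualify (Ser TCN, Leu CTN, Pro CCN, Arg CGN, Thr ACN, Val GTN, Ala GCN, Gly GGN). Counting observed SNPs at the returned positions versus all other positions would give the raw contrast described above, before any correction for mutational and compositional biases.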

Keywords: SARS-CoV-2; codon usage; mutation rate; purifying selection.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 / virology*
  • Codon Usage
  • Codon, Nonsense
  • Evolution, Molecular
  • Humans
  • Models, Genetic
  • Mutation Rate
  • Mutation*
  • Mutation, Missense
  • Polymorphism, Single Nucleotide
  • SARS-CoV-2 / genetics*
  • Selection, Genetic
  • Silent Mutation

Substances

  • Codon, Nonsense