Mutation Patterns of Human SARS-CoV-2 and Bat RaTG13 Coronavirus Genomes Are Strongly Biased Towards C>U Transitions, Indicating Rapid Evolution in Their Hosts

Genes (Basel). 2020 Jul 7;11(7):761. doi: 10.3390/genes11070761.

Abstract

The pandemic caused by the spread of SARS-CoV-2 has led to considerable interest in its evolutionary origin and genome structure. Here, we analyzed mutation patterns in 34 human SARS-CoV-2 isolates and the closely related bat coronavirus RaTG13, isolated from Rhinolophus affinis (a horseshoe bat). We also evaluated the CpG dinucleotide content in SARS-CoV-2 and other human and animal coronavirus genomes. Out of 1136 single nucleotide variations (~4% divergence) between human SARS-CoV-2 and bat RaTG13, 682 (60%) can be attributed to C>U and U>C substitutions, far exceeding other types of substitutions. An accumulation of C>U mutations was also observed in SARS-CoV-2 variants that arose within the human population. Globally, the C>U substitutions increased the frequency of codons for hydrophobic amino acids in SARS-CoV-2 peptides, while U>C substitutions decreased it. In contrast to most other coronaviruses, both SARS-CoV-2 and RaTG13 exhibited CpG depletion in their genomes. The data suggest that C-to-U conversion mediated by C deamination played a significant role in the evolution of the SARS-CoV-2 coronavirus. We hypothesize that the high frequency of C>U transitions reflects virus adaptation processes in the hosts, and that SARS-CoV-2 could have been evolving for a relatively long period in humans following the transfer from animals before spreading worldwide.

Keywords: CpG depletion; SARS-CoV-2; coronavirus; cytosine deamination; evolution; mutation bias.
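
The two quantities the abstract relies on, counts of each substitution type between two genomes and the degree of CpG depletion, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `count_substitutions` and `cpg_obs_exp` are hypothetical, the input sequences are assumed to be already aligned to equal length, and CpG depletion is measured with the commonly used observed/expected ratio, which the abstract does not specify. Reading a difference as `ref>alt` also assumes that `ref` represents the ancestral state, which cannot be established from a pairwise comparison alone.

```python
from collections import Counter


def count_substitutions(ref, alt):
    """Count substitution types (e.g. 'C>U') between two aligned sequences.

    Assumption: `ref` and `alt` are already aligned and of equal length.
    Gaps and ambiguous bases are skipped. T is reported as U (RNA genome).
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to equal length")
    counts = Counter()
    for a, b in zip(ref.upper(), alt.upper()):
        a = a.replace("T", "U")
        b = b.replace("T", "U")
        # Only unambiguous, differing bases count as a substitution.
        if a in "ACGU" and b in "ACGU" and a != b:
            counts[f"{a}>{b}"] += 1
    return counts


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Values well below 1 indicate CpG depletion under this measure.
    """
    seq = seq.upper().replace("T", "U")
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


if __name__ == "__main__":
    # Toy sequences for illustration only; not real viral data.
    subs = count_substitutions("ACCGUCA", "AUCGCUA")
    total = sum(subs.values())
    for kind, n in sorted(subs.items()):
        print(f"{kind}: {n} ({100 * n / total:.0f}%)")
    print("CpG O/E:", cpg_obs_exp("ACGCGU"))
```

Applied to a whole-genome alignment, the share of a substitution class is its count divided by the total number of differences, the same arithmetic as the abstract's 682 of 1136 (60%).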

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Betacoronavirus / classification
  • Betacoronavirus / genetics*
  • Betacoronavirus / isolation & purification
  • Chiroptera / virology
  • CpG Islands
  • Cytosine / metabolism*
  • Evolution, Molecular*
  • Humans
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • SARS-CoV-2
  • Severe acute respiratory syndrome-related coronavirus / classification
  • Severe acute respiratory syndrome-related coronavirus / genetics*
  • Severe acute respiratory syndrome-related coronavirus / isolation & purification
  • Spike Glycoprotein, Coronavirus / genetics
  • Uracil / metabolism*

Substances

  • Spike Glycoprotein, Coronavirus
  • Uracil
  • Cytosine