Excessive G-U transversions in novel allele variants in SARS-CoV-2 genomes

PeerJ. 2020 Jul 28:8:e9648. doi: 10.7717/peerj.9648. eCollection 2020.

Abstract

Background: SARS-CoV-2 is a novel coronavirus that causes COVID-19; its closest known relative was found in bats. Hundreds of genomes of this virus have been sequenced. These data provide insights into SARS-CoV-2 adaptations, determinants of pathogenicity and mutation patterns. A comparison between the patterns of mutations that occurred before and after SARS-CoV-2 jumped to human hosts may reveal important evolutionary consequences of zoonotic transmission.

Methods: We used publicly available complete genomes of SARS-CoV-2 to calculate relative frequencies of single nucleotide variations. These frequencies were compared with relative substitution frequencies between SARS-CoV-2 and related animal coronaviruses. A similar analysis was performed for the human coronaviruses SARS-CoV and HKU1.
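As an illustration only, the relative frequency of each substitution type between two aligned sequences could be computed along the following lines. This is a minimal sketch, not the authors' pipeline: the function name `substitution_frequencies`, the directed `"G>U"` label convention, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter

def substitution_frequencies(ref, alt):
    """Relative frequency of each directed substitution type (e.g. "G>U")
    between two aligned sequences of equal length.

    Hypothetical helper for illustration; gaps and ambiguous bases are skipped.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to equal length")
    valid = set("ACGU")
    # Treat DNA-style T as RNA U so both notations are accepted.
    ref = ref.upper().replace("T", "U")
    alt = alt.upper().replace("T", "U")
    counts = Counter()
    for r, a in zip(ref, alt):
        if r in valid and a in valid and r != a:
            counts[f"{r}>{a}"] += 1
    total = sum(counts.values())
    # Normalize counts so the frequencies sum to 1 (empty dict if no differences).
    return {kind: n / total for kind, n in counts.items()} if total else {}

# Toy aligned sequences (invented, not real viral data).
print(substitution_frequencies("GGACGU", "UGACAU"))  # {'G>U': 0.5, 'G>A': 0.5}
```

In this sketch the first argument plays the role of the reference (or inferred ancestral) sequence, so the labels are directional: `"G>U"` and `"U>G"` are counted separately.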

Results: We found a 9-fold excess of G-U transversions among SARS-CoV-2 mutations relative to the substitution frequencies between SARS-CoV-2 and a closely related bat coronavirus (RaTG13). This suggests that the mutation patterns of SARS-CoV-2 changed after transmission to humans. The excess of G-U transversions was much smaller in a similar analysis for SARS-CoV and non-existent for HKU1. Remarkably, we did not find a similar excess of the complementary C-A mutations in SARS-CoV-2. We discuss possible explanations for these observations.
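A fold excess of this kind is a ratio of two relative frequencies: the frequency of a substitution type among within-species mutations divided by its frequency among between-species substitutions. The sketch below shows that arithmetic under stated assumptions: the function name `fold_excess`, the `"G>U"` key convention, and the numbers are invented for illustration and are not taken from the study.

```python
def fold_excess(mutation_freqs, substitution_freqs, kind):
    """Ratio of the relative frequency of `kind` among mutations to its
    relative frequency among baseline substitutions.

    Hypothetical helper for illustration only.
    """
    baseline = substitution_freqs.get(kind, 0.0)
    if baseline == 0:
        raise ValueError(f"no baseline substitutions of type {kind!r}")
    return mutation_freqs.get(kind, 0.0) / baseline

# Invented toy frequencies (not data from the paper).
within_human = {"G>U": 0.5, "C>U": 0.5}
versus_bat_relative = {"G>U": 0.125, "C>U": 0.875}
print(fold_excess(within_human, versus_bat_relative, "G>U"))  # 4.0
```

A value near 1 would indicate no change in the relative frequency of that substitution type, while values well above 1 indicate an excess among recent mutations.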

Keywords: Bioinformatics; COVID-19; Evolution; Mutagenesis; Mutations; SARS-CoV-2; Transversions.

Grants and funding

This work was supported by the Russian Foundation for Basic Research grant RFBR 18-29-13014 mk. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.