Long-term dynamic shifts in genomic base content and evolutionary trajectories of SARS-CoV-2 variants

J Med Virol. 2023 Oct;95(10):e29128. doi: 10.1002/jmv.29128.

Abstract

The rapid spread and extensive mutation of SARS-CoV-2 variants, particularly Omicron, necessitate an understanding of their evolutionary characteristics. In this study, we analyzed representative high-quality whole-genome sequences of 2,008 SARS-CoV-2 variants to explore long-term dynamic changes in genomic base (especially GC) content and variations during viral evolution. Our results demonstrated a strong negative correlation between GC content and variant emergence time (r = -0.765, p < 2.22e-16). Major gene partitions (S, N, ORF1ab) displayed similar trends. Omicron exhibited a significantly lower GC content than non-Omicron variants (p < 2.22e-16). Notably, we observed a robust negative correlation between C and T content (r = -0.778, p < 2.22e-16) and between G and A content (r = -0.773, p < 2.22e-16). Among all strains, Omicron showed the greatest base variation, with C->T mutations being the most frequent (median (interquartile range [IQR]): 29 (27, 31); 37.67%), followed by G->A mutations (11 (9, 13); 14.63%). Over a 3-year span, an annual decline rate of 0.12% in SARS-CoV-2 GC content was observed and could become more pronounced in future emerging variants. These findings provide insights into the evolutionary trajectory of SARS-CoV-2, underscoring the significance of continuous genomic surveillance for effective prediction of and response to future variants.
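The three quantities reported in the abstract (GC content, substitution counts such as C->T, and a correlation coefficient against emergence time) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `gc_content`, `substitution_counts`, and `pearson_r` are hypothetical, the sequences and values are made-up toy data, the inputs are assumed to be already aligned to equal length, and Pearson's r is assumed here because the abstract does not state which correlation method was used.

```python
from collections import Counter
from math import sqrt


def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


def substitution_counts(ref: str, var: str) -> Counter:
    """Count ref->var substitutions at aligned positions.

    Gaps and ambiguity codes (anything other than A, C, G, T) are skipped.
    """
    if len(ref) != len(var):
        raise ValueError("sequences must be aligned to equal length")
    pairs = Counter()
    for r, v in zip(ref.upper(), var.upper()):
        if r != v and r in "ACGT" and v in "ACGT":
            pairs[f"{r}->{v}"] += 1
    return pairs


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy data only (not from the study): a short "reference" and "variant".
ref = "ACGTCCGGAT"
var = "ATGTCTGAAT"
print(gc_content(ref), gc_content(var))
print(substitution_counts(ref, var))

# Toy emergence times (days) against toy GC percentages.
print(pearson_r([0, 365, 730], [50.0, 49.0, 48.0]))
```

In this toy pair, each C->T or G->A substitution removes one G or C from the variant, which is the mechanism behind the negative C/T and G/A correlations described above. Real genomes would first need a proper alignment to a reference, which this sketch does not perform.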

Keywords: COVID-19; Omicron subvariants; SARS-CoV-2; genomic base content; viral evolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / epidemiology
  • Genomics
  • Humans
  • Mutation
  • SARS-CoV-2* / genetics

Supplementary concepts

  • SARS-CoV-2 variants