Genomic Analysis of Non-B Nucleic Acids Structures in SARS-CoV-2: Potential Key Roles for These Structures in Mutability, Translation, and Replication?

Genes (Basel). 2023 Jan 6;14(1):157. doi: 10.3390/genes14010157.

Abstract

Non-B nucleic acid structures have emerged as key contributors to genetic variation in SARS-CoV-2. Herein, we investigated whether defining spike protein mutations fall within inverted repeats (IRs) in 18 SARS-CoV-2 variants, discussed the potential roles of G-quadruplexes (G4s) in SARS-CoV-2 biology, and identified potential pseudoknots within the SARS-CoV-2 genome. Surprisingly, the number of defining spike protein mutations arising within IRs varied widely between variants, and these mutations were more likely to occur in the stem region of the predicted hairpin stem-loop secondary structure. Notably, mutations implicated in ACE2 binding and propagation (e.g., ΔH69/V70, N501Y, and D614G) were likely to occur within IRs, whilst mutations involved in antibody neutralization and reduced vaccine efficacy (e.g., T19R, ΔE156, ΔF157, R158G, and G446S) were rarely found within IRs. We also predicted that RNA pseudoknots occur predominantly at, or next to, the sites of 29 mutations found in the SARS-CoV-2 spike protein. Finally, the Omicron variants BA.2, BA.4, BA.5, BA.2.12.1, and BA.2.75 appear to have lost two of the predicted G4-forming sequences found in other variants: one in nsp2 and one in the sequence complementary to the conserved stem-loop II-like motif (S2M) in the 3' untranslated region (UTR). Taken together, these findings suggest that non-B nucleic acid structures likely play an integral role in SARS-CoV-2 evolution and genetic diversity.
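
To illustrate what it means for a mutation to fall in the stem or the loop of an inverted repeat, the following Python sketch scans a sequence for perfect IRs and classifies a nucleotide position against each hit. It is a minimal illustration and not the tool or settings used in this study: the function names, the arm and loop size limits, and the toy sequence are all assumptions, only perfect Watson-Crick complementarity is considered, and overlapping or nested hits are not merged.

```python
from dataclasses import dataclass

# Watson-Crick pairs only; G-U wobble pairs are ignored in this sketch.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class InvertedRepeat:
    start: int  # 0-based index of the first base of the left arm
    arm: int    # length of each arm (the predicted stem)
    loop: int   # number of bases between the two arms (the predicted loop)

    @property
    def end(self) -> int:
        """Index one past the last base of the right arm."""
        return self.start + 2 * self.arm + self.loop


def find_inverted_repeats(seq, min_arm=6, max_arm=12, max_loop=10):
    """Brute-force search for perfect inverted repeats.

    The size limits are illustrative assumptions, not values from the study.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for start in range(len(rna)):
        for arm in range(min_arm, max_arm + 1):
            left = rna[start:start + arm]
            if len(left) < arm:
                break  # left arm would run off the end of the sequence
            if set(left) - set("ACGU"):
                continue  # skip arms containing ambiguous bases such as N
            target = reverse_complement(left)
            for loop in range(max_loop + 1):
                right_start = start + arm + loop
                if right_start + arm > len(rna):
                    break  # right arm would run off the end of the sequence
                if rna[right_start:right_start + arm] == target:
                    hits.append(InvertedRepeat(start, arm, loop))
    return hits


def classify_position(ir: InvertedRepeat, pos: int) -> str:
    """Classify a 0-based position as 'stem', 'loop', or 'outside' the IR."""
    if not ir.start <= pos < ir.end:
        return "outside"
    in_left_arm = pos < ir.start + ir.arm
    in_right_arm = pos >= ir.end - ir.arm
    return "stem" if in_left_arm or in_right_arm else "loop"


if __name__ == "__main__":
    toy = "AAGGGCUCAAAGAGCCCAA"  # made-up sequence, not from SARS-CoV-2
    for ir in find_inverted_repeats(toy, min_arm=6, max_arm=8, max_loop=5):
        print(ir, classify_position(ir, 4), classify_position(ir, 9))
```

In practice, the position passed to `classify_position` would be the genomic coordinate of a mutated nucleotide, converted to the same 0-based indexing as the scanned sequence.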

Keywords: G-quadruplex; SARS-CoV-2; adaptation; inverted repeats; mutation; pseudoknot; spike protein.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions
  • COVID-19* / genetics
  • Genomics
  • Humans
  • Nucleic Acids*
  • SARS-CoV-2 / genetics
  • Spike Glycoprotein, Coronavirus / genetics

Substances

  • spike protein, SARS-CoV-2
  • Spike Glycoprotein, Coronavirus
  • Nucleic Acids
  • 3' Untranslated Regions

Supplementary concepts

  • SARS-CoV-2 variants

Grants and funding

This research was funded by the Czech Science Foundation, grant number 22-21903S.