Spatio-temporal dynamics of intra-host variability in SARS-CoV-2 genomes

Nucleic Acids Res. 2022 Feb 22;50(3):1551-1561. doi: 10.1093/nar/gkab1297.

Abstract

During the course of the COVID-19 pandemic, large-scale genome sequencing of SARS-CoV-2 has been useful in tracking its spread and in identifying variants of concern (VOC). Viral and host factors could contribute to variability within a host that can be captured in next-generation sequencing reads as intra-host single nucleotide variations (iSNVs). Analysing 1347 samples collected up to June 2020, we recorded 16 410 iSNV sites throughout the SARS-CoV-2 genome. We found that ∼42% of the iSNV sites had been reported as SNVs in consensus sequences submitted to GISAID by 30 September 2020, a share that increased to ∼80% by 30 June 2021. Subsequent analysis of another set of 1774 samples sequenced in India between November 2020 and May 2021 revealed that the majority of the Delta (B.1.617.2) and Kappa (B.1.617.1) lineage-defining variations appeared as iSNVs before becoming fixed in the population. In addition, mutations in RdRp as well as RNA editing by APOBEC and ADAR deaminases seem to contribute to the differential prevalence of iSNVs in hosts. We also observe hyper-variability at functionally critical residues in the Spike protein that could alter antigenicity and may contribute to immune escape. Thus, tracking and functional annotation of iSNVs in ongoing genome surveillance programs could be important for early identification of potential variants of concern and for actionable interventions.
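As an illustration of the two quantities the abstract relies on (iSNV sites within samples, and the share of those sites later seen as SNVs in consensus sequences), the following is a minimal sketch and not the authors' pipeline. The function names, the input layout (per-position base counts), and the depth and frequency thresholds are all hypothetical assumptions chosen for the example; the abstract does not state the calling criteria used.

```python
from typing import Dict, Set

BASES = "ACGT"

def call_isnv_sites(
    counts: Dict[int, Dict[str, int]],
    min_depth: int = 100,    # assumed threshold, not taken from the paper
    min_freq: float = 0.05,  # assumed threshold, not taken from the paper
) -> Set[int]:
    """Return genome positions where a minor allele passes both thresholds.

    counts maps a genome position to the read count of each base at that site.
    """
    sites = set()
    for pos, base_counts in counts.items():
        per_base = [base_counts.get(b, 0) for b in BASES]
        depth = sum(per_base)
        if depth < min_depth:
            continue  # too few reads to trust a minor-allele frequency
        minor = sorted(per_base, reverse=True)[1]  # second most common base
        if minor / depth >= min_freq:
            sites.add(pos)
    return sites

def fraction_reported(isnv_sites: Set[int], consensus_snv_sites: Set[int]) -> float:
    """Share of iSNV sites that also appear as SNVs in consensus sequences."""
    if not isnv_sites:
        return 0.0
    return len(isnv_sites & consensus_snv_sites) / len(isnv_sites)

# Toy data (invented for illustration only).
toy_counts = {
    100: {"A": 950, "G": 50},   # minor allele at 5% -> called
    200: {"C": 1000},           # no minor allele -> not called
    300: {"T": 40, "C": 10},    # depth below threshold -> skipped
    400: {"G": 800, "A": 200},  # minor allele at 20% -> called
}
toy_isnvs = call_isnv_sites(toy_counts)
toy_fraction = fraction_reported(toy_isnvs, {400, 500})
```

Recomputing `fraction_reported` against consensus SNV sets drawn at different submission cut-off dates would give the kind of time-resolved comparison described above (∼42% by 30 September 2020, ∼80% by 30 June 2021).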

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • APOBEC-1 Deaminase / genetics
  • Adenosine Deaminase / genetics
  • Animals
  • COVID-19 / epidemiology
  • COVID-19 / prevention & control
  • COVID-19 / virology
  • Chlorocebus aethiops
  • Coronavirus RNA-Dependent RNA Polymerase / genetics
  • Databases, Genetic
  • Evolution, Molecular*
  • Genetic Variation / genetics*
  • Genome, Viral / genetics*
  • Host-Pathogen Interactions / genetics*
  • Immune Evasion / genetics
  • India / epidemiology
  • Phylogeny
  • RNA-Binding Proteins / genetics
  • SARS-CoV-2 / classification
  • SARS-CoV-2 / genetics*
  • SARS-CoV-2 / growth & development
  • Spike Glycoprotein, Coronavirus / genetics
  • Vero Cells

Substances

  • RNA-Binding Proteins
  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2
  • Coronavirus RNA-Dependent RNA Polymerase
  • APOBEC-1 Deaminase
  • ADAR protein, human
  • Adenosine Deaminase

Supplementary concepts

  • SARS-CoV-2 variants