Geometric analysis of SARS-CoV-2 variants

Gene. 2024 May 30:909:148291. doi: 10.1016/j.gene.2024.148291. Epub 2024 Feb 28.

Abstract

SARS-CoV-2, the cause of a severe respiratory disease, has been prevalent around the world since its discovery in 2019. As a single-stranded RNA virus, it has a high mutation rate, which has produced many variants, some of them highly pathogenic, such as Omicron, currently the most prevalent variant. The relationships among these SARS-CoV-2 variants, and especially the differences between them, are an active area of research. In this study, we constructed a geometric space to represent all SARS-CoV-2 sequences of the different variants. An alignment-free approach, the natural vector method, was used to establish this genome space. The genome space of SARS-CoV-2 was constructed from the 24-dimensional natural vector, and an appropriate metric was selected by performing phylogenetic analyses. Phylogenetic trees of different lineages constructed under the selected natural vector and metric agreed with the lineage naming standards: lineages with the same alphabetical prefix clustered together in the trees. Furthermore, the relationships among the GISAID clades depicted by the natural graph largely matched the description given in the GISAID clade naming. These phylogenetic analysis results demonstrate the validity of our geometric space. In summary, we constructed a geometric space for the genomes of the novel coronavirus SARS-CoV-2 that allows the different variants to be compared. This geometric space is valuable for addressing open questions about the virus.
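
To illustrate the natural vector idea mentioned in the abstract, here is a minimal sketch. It assumes the commonly published definition of a natural vector: for each nucleotide, its count, its mean position, and its normalized central moments. The abstract does not say which components make up the 24 dimensions or which metric was selected. The function names, the use of moments up to order 5 (which gives 4 × 6 = 24 components), and the Euclidean distance are therefore assumptions made for illustration, not the authors' confirmed setup.

```python
import math
from typing import List


def natural_vector(seq: str, max_order: int = 5, alphabet: str = "ACGT") -> List[float]:
    """Return counts, mean positions, and normalized central moments per base.

    With max_order=5 and a 4-letter alphabet this yields 4 * (2 + 4) = 24 values.
    Assumption: positions are 1-indexed and the order-j moment is normalized
    by n_k**(j-1) * n**(j-1), where n is the sequence length.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA input as DNA letters
    n = len(seq)
    counts: List[float] = []
    means: List[float] = []
    # moments[j - 2] holds the order-j moment for every base, in alphabet order
    moments: List[List[float]] = [[] for _ in range(2, max_order + 1)]

    for base in alphabet:
        positions = [i + 1 for i, c in enumerate(seq) if c == base]
        n_k = len(positions)
        mu_k = sum(positions) / n_k if n_k else 0.0
        counts.append(float(n_k))
        means.append(mu_k)
        for j in range(2, max_order + 1):
            if n_k:
                d = sum((p - mu_k) ** j for p in positions) / (n_k ** (j - 1) * n ** (j - 1))
            else:
                d = 0.0  # base absent from the sequence
            moments[j - 2].append(d)

    return counts + means + [d for order in moments for d in order]


def nv_distance(a: str, b: str) -> float:
    """Euclidean distance between natural vectors (an assumed metric)."""
    return math.dist(natural_vector(a), natural_vector(b))
```

Because each genome maps to a fixed-length vector regardless of its length, two sequences can be compared without aligning them. For example, `nv_distance(genome_a, genome_b)` gives a single number that could feed a distance-based tree or a nearest-neighbor classifier.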

Keywords: Convex Hull Classification; Genome Space; Natural vector; Phylogenetic Analysis; SARS-CoV-2; The Nearest Neighbor Classification.

MeSH terms

  • COVID-19*
  • Humans
  • Mutation Rate
  • Phylogeny
  • SARS-CoV-2* / genetics

Supplementary concepts

  • SARS-CoV-2 variants