The Evolving Faces of the SARS-CoV-2 Genome

Viruses. 2021 Sep 3;13(9):1764. doi: 10.3390/v13091764.

Abstract

Surveillance of the evolving SARS-CoV-2 genome, combined with epidemiological monitoring and the emerging vaccination campaigns, has become a paramount task in controlling a pandemic that changes rapidly in time and space. Genomic surveillance must combine the generation and sharing of sequence data with appropriate bioinformatics monitoring and analysis methods. We applied molecular portrayal using self-organizing map machine learning (SOM portrayal) to characterize the diversity of the virus genomes, their mutual relatedness, and their development since the beginning of the pandemic. The resulting genetic landscape visualizes the relevant mutations in a lineage-specific fashion and provides developmental paths in genetic state space from early lineages towards the variants of concern alpha, beta, gamma and delta. The different genes of the virus have specific footprints in the landscape reflecting their biological impact. SOM portrayal provides a novel option for 'bioinformatics surveillance' of the pandemic, with particular strengths in visualization, intuitive perception and 'personalization' of the mutational patterns of the virus genomes.
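
For readers unfamiliar with self-organizing maps, the following is a minimal sketch of the general technique, not the authors' pipeline. It assumes each genome is encoded as a binary vector marking the presence or absence of a single nucleotide variant. The toy data, the 4 x 4 grid size, the learning-rate schedule and the function names `train_som` and `best_matching_unit` are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every map unit, used by the neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            x = data[idx]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks
            # Best-matching unit: the map unit whose weight vector is closest to x.
            dist = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Gaussian neighbourhood around the BMU on the 2D grid.
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull every unit towards x, weighted by its grid distance to the BMU.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def best_matching_unit(weights, x):
    """Return the (row, col) grid position of the unit closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dist), dist.shape))

# Hypothetical toy data: two "lineages" with disjoint variant patterns.
lineage_a = np.array([1, 1, 1, 0, 0, 0], dtype=float)
lineage_b = np.array([0, 0, 0, 1, 1, 1], dtype=float)
genomes = np.vstack([lineage_a] * 10 + [lineage_b] * 10)

weights = train_som(genomes)
bmu_a = best_matching_unit(weights, lineage_a)
bmu_b = best_matching_unit(weights, lineage_b)
print("lineage A maps to unit", bmu_a)
print("lineage B maps to unit", bmu_b)
```

In this sketch, genomes with similar variant patterns land on the same or neighbouring map units, which is the property that lets a trained map be read as a landscape of lineages. A real analysis would use far more genomes and variant positions and a larger grid.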

Keywords: COVID-19; SARS-CoV-2 lineages genomic surveillance; machine learning; self-organizing maps portrayal; single nucleotide variants; virus sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 / epidemiology
  • COVID-19 / virology*
  • Computational Biology
  • Evolution, Molecular*
  • Genetic Variation*
  • Genome, Viral*
  • Genomics / methods
  • Humans
  • Incidence
  • Mutation
  • Pandemics
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • SARS-CoV-2 / classification
  • SARS-CoV-2 / genetics*