A database for retrieving information on SARS-CoV-2 S protein mutations based on correlation network analysis

BMC Genom Data. 2022 May 4;23(1):34. doi: 10.1186/s12863-022-01052-y.

Abstract

Background: More than a million SARS-CoV-2 genomes and associated mutational analyses are available in public databases, and together they reveal the phylogenetic tree of the virus. These data have enabled scientists to closely track the evolution and transmission dynamics of the virus at global and local scales. However, the Mu variant, recently identified in infections in South America, shows an unusual combination of mutations, and these atypical characteristics are difficult to visualize in public databases that are based on a phylogenetic tree.

Results: The Vcorn SARS-CoV-2 database was constructed to provide information on COVID-19 infections and mutations in the S protein of the virus based on correlation network analysis. A correlation network was constructed using the recall index of one mutation to another. The network includes several network modules, in which nodes represent mutations and are tightly connected to each other. Individual network modules contain the mutations of single variants, such as the alpha and delta variants. When the database was used to construct a network emphasizing the mutations of the Mu variant, those mutations were found to be located in multiple network modules, indicating that the mutations of the variant may have originated from multiple variants or may be located at a basal position with a high frequency of mutation.
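
The abstract does not define the recall index or describe how modules were detected, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the recall of mutation a to mutation b is the fraction of genomes carrying a that also carry b, that two mutations are linked when the recall in both directions reaches a threshold, and that modules can be approximated by connected components. The function names, the threshold of 0.75, and the toy genomes (with placeholder mutation labels such as "a1" and "d1") are all hypothetical and are not taken from Vcorn SARS-CoV-2.

```python
from itertools import combinations


def recall_index(genomes, a, b):
    """Assumed definition: fraction of genomes carrying `a` that also carry `b`."""
    with_a = [g for g in genomes if a in g]
    if not with_a:
        return 0.0
    return sum(1 for g in with_a if b in g) / len(with_a)


def build_network(genomes, threshold=0.75):
    """Link two mutations when recall in both directions is >= threshold (assumption)."""
    mutations = sorted(set().union(*genomes))
    edges = set()
    for a, b in combinations(mutations, 2):
        if (recall_index(genomes, a, b) >= threshold
                and recall_index(genomes, b, a) >= threshold):
            edges.add(frozenset((a, b)))
    return mutations, edges


def find_modules(mutations, edges):
    """Approximate network modules as connected components (a simplification)."""
    adjacency = {m: set() for m in mutations}
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, modules = set(), []
    for start in mutations:
        if start in seen:
            continue
        stack, module = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            module.add(node)
            stack.extend(adjacency[node] - seen)
        modules.append(module)
    return modules


def modules_spanned(genome, modules):
    """Return the modules that contain at least one mutation of `genome`."""
    return [m for m in modules if m & genome]


# Invented toy data: each genome is the set of S protein mutations it carries.
toy_genomes = (
    [{"a1", "a2"}] * 4      # genomes resembling one variant
    + [{"d1", "d2"}] * 4    # genomes resembling a second variant
    + [{"a1", "d1"}]        # one genome mixing mutations from both
)

if __name__ == "__main__":
    muts, edges = build_network(toy_genomes)
    mods = find_modules(muts, edges)
    print("modules:", [sorted(m) for m in mods])
    print("modules spanned by mixed genome:",
          [sorted(m) for m in modules_spanned({"a1", "d1"}, mods)])
```

In this toy setting, the mixed genome touches more than one module, which is the kind of pattern the Results describe for the Mu variant. A real analysis would likely use a dedicated community detection method rather than connected components.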

Conclusions: Vcorn SARS-CoV-2 provides information on COVID-19 and S protein mutations of SARS-CoV-2 via correlation network analysis. The network based on the analysis illustrates the unusual S protein mutations of the Mu variant. The database is freely available at http://www.plant.osakafu-u.ac.jp/~kagiana/vcorn/sarscov2/ .

Keywords: Correlation network; Database; Mu variant; Mutation; SARS-CoV-2.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19* / genetics
  • Humans
  • Mutation
  • Phylogeny
  • SARS-CoV-2* / genetics
  • Spike Glycoprotein, Coronavirus

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2

Supplementary concepts

  • SARS-CoV-2 variants