Exploring the genomic and proteomic variations of SARS-CoV-2 spike glycoprotein: A computational biology approach

Infect Genet Evol. 2020 Oct:84:104389. doi: 10.1016/j.meegid.2020.104389. Epub 2020 Jun 2.

Abstract

The newly identified SARS-CoV-2 has now been reported from around 185 countries, with more than a million confirmed human cases, including more than 120,000 deaths. The genomes of SARS-CoV-2 strains isolated from different parts of the world are now available, and the unique features of their constituent genes and proteins need to be explored to understand the biology of the virus. The spike glycoprotein is one of the major targets to be explored because of its role in the entry of coronaviruses into host cells. We analyzed 320 whole-genome sequences and 320 spike protein sequences of SARS-CoV-2 using multiple sequence alignment. In this study, 483 unique variations were identified among the genomes of SARS-CoV-2, including 25 nonsynonymous mutations and one deletion in the spike (S) protein. Among the 26 variations detected in S, 12 were located in the N-terminal domain (NTD) and 6 in the receptor-binding domain (RBD), which might alter the interaction of the S protein with the host receptor angiotensin-converting enzyme 2 (ACE2). In addition, 22 amino acid insertions were identified in the spike protein of SARS-CoV-2 in comparison with that of SARS-CoV. Phylogenetic analyses of the spike protein revealed that bat coronaviruses have a close evolutionary relationship with circulating SARS-CoV-2. The genetic variation data presented in this study can support a better understanding of SARS-CoV-2 pathogenesis. Based on the results reported herein, potential inhibitors against the S protein can be designed by considering these variations and their impact on protein structure.
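
The variation counts above come from comparing aligned sequences column by column. The sketch below illustrates that general idea only; it is not the authors' pipeline, whose tools and parameters the abstract does not describe. It assumes the sequences have already been aligned to equal length by an external multiple sequence alignment tool, that gaps are written as "-", and that one sequence is chosen as the reference. The function name `find_variations` and the toy sequences are hypothetical and are not real spike protein data.

```python
from collections import Counter

GAP = "-"  # assumed gap character in the pre-aligned input


def find_variations(reference, aligned_seqs):
    """Count differences between each aligned sequence and the reference.

    Returns a Counter keyed by (kind, position, ref_residue, alt_residue).
    Positions are 1-based and follow the reference residues, so gap
    columns in the reference do not advance the position counter.
    """
    variations = Counter()
    for seq in aligned_seqs:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the same length")
        ref_pos = 0
        for ref_res, res in zip(reference, seq):
            if ref_res != GAP:
                ref_pos += 1
            if res == ref_res:
                continue
            if ref_res == GAP:
                kind = "insertion"  # residue present where the reference has a gap
            elif res == GAP:
                kind = "deletion"  # gap where the reference has a residue
            else:
                kind = "substitution"  # amino acid replacement
            variations[(kind, ref_pos, ref_res, res)] += 1
    return variations


if __name__ == "__main__":
    # Toy, made-up protein fragments used only for illustration.
    reference = "MFVFLVLLPL"
    samples = ["MFVFLVLLPL", "MFVFFVLLPL", "MFVFFVLLPL", "MFVFLVL-PL"]
    for variation, count in find_variations(reference, samples).most_common():
        print(variation, count)
```

On the toy input, the function reports one substitution (L to F at position 5) carried by two sequences and one single-residue deletion at position 8. Applied to a protein alignment, every substitution it reports is by definition nonsynonymous; separating synonymous from nonsynonymous changes at the nucleotide level would additionally require codon translation, which this sketch does not attempt.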

Keywords: COVID-19; Genomic variants; SARS-CoV-2; Sequence analysis; Spike protein.

MeSH terms

  • Alphacoronavirus / classification
  • Alphacoronavirus / genetics*
  • Alphacoronavirus / metabolism
  • Angiotensin-Converting Enzyme 2
  • Animals
  • Base Sequence
  • Betacoronavirus / classification
  • Betacoronavirus / genetics*
  • Betacoronavirus / metabolism
  • Binding Sites
  • Chiroptera / virology
  • Gene Expression
  • Genome, Viral*
  • Humans
  • Models, Molecular
  • Mutation
  • Peptidyl-Dipeptidase A / chemistry*
  • Peptidyl-Dipeptidase A / genetics
  • Peptidyl-Dipeptidase A / metabolism
  • Protein Binding
  • Protein Conformation, alpha-Helical
  • Protein Conformation, beta-Strand
  • Protein Interaction Domains and Motifs
  • SARS-CoV-2
  • Sequence Alignment
  • Severe acute respiratory syndrome-related coronavirus / classification
  • Severe acute respiratory syndrome-related coronavirus / genetics*
  • Severe acute respiratory syndrome-related coronavirus / metabolism
  • Spike Glycoprotein, Coronavirus / chemistry*
  • Spike Glycoprotein, Coronavirus / genetics
  • Spike Glycoprotein, Coronavirus / metabolism
  • Structural Homology, Protein
  • Virus Attachment

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2
  • Peptidyl-Dipeptidase A
  • ACE2 protein, human
  • Angiotensin-Converting Enzyme 2