Co-Mutations and Possible Variation Tendency of the Spike RBD and Membrane Protein in SARS-CoV-2 by Machine Learning

Int J Mol Sci. 2024 Apr 25;25(9):4662. doi: 10.3390/ijms25094662.

Abstract

Since the onset of the coronavirus disease 2019 (COVID-19) pandemic, SARS-CoV-2 variants capable of breakthrough infections have attracted global attention. These variants carry significant mutations in the receptor-binding domain (RBD) of the spike protein and in the membrane (M) protein, which may imply an enhanced ability to evade immune responses. This study examined co-mutations within the spike RBD and their potential correlation with mutations in the M protein. The EVmutation method was used to analyze the distribution of mutations and to elucidate the relationship between mutations in the spike RBD and alterations in the M protein. Additionally, the Sequence-to-Sequence Transformer Model (S2STM) was employed to establish a mapping between the amino acid sequences of the spike RBD and the M protein, offering a novel and efficient approach for streamlined sequence analysis and for exploring their interrelationship. Certain co-mutations in the spike RBD, G339D-S373P-S375F and Q493R-Q498R-Y505, are associated with a heightened propensity for mutations at specific sites within the M protein, especially sites 3 and 19/63. These results shed light on the concept of mutational synergy between the spike RBD and M proteins and point to a potential mechanism that could be driving the evolution of SARS-CoV-2.
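
To make the idea of an association between an RBD co-mutation group and M-protein sites concrete, the following is a minimal sketch of a plain co-occurrence tally. It is not the EVmutation method or the S2STM used in the study. The function name `co_mutation_rates`, the record layout, the site labels `M:3`, `M:19`, `M:63`, and the toy records are all assumptions made up for illustration and are not data from the paper.

```python
from collections import Counter

def co_mutation_rates(records, rbd_group):
    """Among genomes carrying every mutation in rbd_group, return the
    fraction that is also mutated at each M-protein site.

    records: iterable of (rbd_mutations, m_sites) pairs of frozensets
             (hypothetical layout, chosen for this sketch).
    """
    # Keep only genomes whose RBD mutation set contains the whole group.
    carriers = [m_sites for rbd_muts, m_sites in records if rbd_group <= rbd_muts]
    if not carriers:
        return {}
    # Count how many carrier genomes are mutated at each M site.
    counts = Counter(site for m_sites in carriers for site in m_sites)
    return {site: n / len(carriers) for site, n in counts.items()}

# Toy records, invented for illustration only (not real surveillance data).
group = frozenset({"G339D", "S373P", "S375F"})
records = [
    (frozenset({"G339D", "S373P", "S375F"}), frozenset({"M:3", "M:19", "M:63"})),
    (frozenset({"G339D", "S373P", "S375F"}), frozenset({"M:19", "M:63"})),
    (frozenset({"N501Y"}), frozenset()),
    (frozenset({"G339D", "S373P", "S375F", "N501Y"}), frozenset({"M:3"})),
]

rates = co_mutation_rates(records, group)
for site in sorted(rates):
    print(site, round(rates[site], 3))
```

A raw conditional frequency like this ignores phylogenetic relatedness and sampling bias, which is one reason a statistical coupling model or a learned sequence mapping may be preferred in practice.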

Keywords: SARS-CoV-2; co-mutations; mutational synergy; sequence analysis; sequence-to-sequence transformer model.

MeSH terms

  • Amino Acid Sequence
  • COVID-19* / genetics
  • COVID-19* / virology
  • Coronavirus M Proteins / genetics
  • Humans
  • Machine Learning*
  • Mutation*
  • Protein Binding
  • Protein Domains / genetics
  • SARS-CoV-2* / genetics
  • SARS-CoV-2* / metabolism
  • Spike Glycoprotein, Coronavirus* / chemistry
  • Spike Glycoprotein, Coronavirus* / genetics
  • Viral Matrix Proteins / chemistry
  • Viral Matrix Proteins / genetics

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2
  • Viral Matrix Proteins
  • Coronavirus M Proteins
  • membrane protein, SARS-CoV-2

Supplementary concepts

  • SARS-CoV-2 variants