The interplay of SARS-CoV-2 evolution and constraints imposed by the structure and functionality of its proteins

PLoS Comput Biol. 2021 Jul 8;17(7):e1009147. doi: 10.1371/journal.pcbi.1009147. eCollection 2021 Jul.

Abstract

The unprecedented pace of SARS-CoV-2 genome sequencing provides unique information about the genetic changes in a single pathogen during an ongoing pandemic. By analyzing close to 200,000 genomes, we show that the patterns of SARS-CoV-2 mutations along its genome are closely correlated with the structural and functional features of the encoded proteins. The requirement that proteins' 3D structures remain foldable and the conservation of their key functional regions, such as protein-protein interaction interfaces, are the dominant factors driving evolutionary selection in protein-coding genes. At the same time, avoidance of host immunity leads to an abundance of mutations in other regions, resulting in high variability of the missense mutation rate along the genome. "Unexplained" peaks and valleys in the mutation rate provide hints about the function of as yet uncharacterized genomic regions and the specific protein structural and functional features they code for. Some of these observations have immediate practical implications for selecting target regions for PCR-based COVID-19 tests and for evaluating the risk of mutations in epitopes targeted by specific antibodies and vaccine design strategies.
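As an illustration of what "peaks and valleys in the mutation rate" along a genome could mean in practice, the following is a minimal sketch and not the authors' method. It assumes a list of 0-based genome positions at which missense mutations were observed, bins them into fixed-size windows, and flags windows whose rate deviates from the mean by more than a chosen z-score. The function names, the window size, and the threshold are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def windowed_rate(positions, genome_length, window=100):
    """Return the fraction of mutated positions in each fixed-size window.

    positions: 0-based genome coordinates of observed missense mutations
               (hypothetical input format, assumed for this sketch).
    """
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} outside genome")
        counts[pos // window] += 1
    rates = []
    for i, count in enumerate(counts):
        # The last window may be shorter than the others.
        size = min(window, genome_length - i * window)
        rates.append(count / size)
    return rates


def flag_outliers(rates, z=1.5):
    """Return (peaks, valleys): indices of windows more than z population
    standard deviations above or below the mean rate."""
    mu = mean(rates)
    sd = pstdev(rates)
    if sd == 0:
        return [], []
    peaks = [i for i, r in enumerate(rates) if (r - mu) / sd > z]
    valleys = [i for i, r in enumerate(rates) if (r - mu) / sd < -z]
    return peaks, valleys


if __name__ == "__main__":
    # Toy data: mutations clustered in the first 100 positions.
    rates = windowed_rate([0, 1, 2, 3, 4, 250], genome_length=300)
    print(rates)
    print(flag_outliers(rates, z=1.0))
```

A real analysis would also need to normalize for sequencing depth, codon context, and the number of genomes sampled, none of which this sketch attempts.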

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biological Evolution*
  • Genes, Viral
  • Mutation
  • SARS-CoV-2 / genetics
  • SARS-CoV-2 / physiology*
  • Viral Proteins / physiology

Substances

  • Viral Proteins