Global analysis of more than 50,000 SARS-CoV-2 genomes reveals epistasis between eight viral genes

Proc Natl Acad Sci U S A. 2020 Dec 8;117(49):31519-31526. doi: 10.1073/pnas.2012331117. Epub 2020 Nov 17.

Abstract

Genome-wide epistasis analysis is a powerful tool to infer gene interactions, which can guide drug and vaccine development and lead to a deeper understanding of microbial pathogenesis. We have considered all complete severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes deposited in the Global Initiative on Sharing All Influenza Data (GISAID) repository up to four different cutoff dates, and used direct coupling analysis together with an assumption of quasi-linkage equilibrium to infer epistatic contributions to fitness from polymorphic loci. We find eight interactions, of which three are between pairs where one locus lies in gene ORF3a, both loci holding nonsynonymous mutations. We also find interactions between two loci in gene nsp13, both holding nonsynonymous mutations, and four interactions involving one locus holding a synonymous mutation. Altogether, we infer interactions between loci in viral genes ORF3a and nsp2, nsp12, and nsp6, between ORF8 and nsp4, and between loci in genes nsp2, nsp13, and nsp14. This paper opens the prospect of using prominent epistatically linked pairs as a starting point to search for combinatorial weaknesses of recombinant viral pathogens.
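As a rough illustration of the kind of input statistics such an analysis starts from, the sketch below computes pairwise allele covariances between loci in a toy alignment and ranks the pairs by magnitude. It is not the authors' pipeline: the abstract does not specify implementation details, and direct coupling analysis goes further by fitting a global statistical model to separate direct from indirect correlations. The binary encoding (1 = minor allele, 0 = major allele), the toy data, and the function names `allele_frequencies` and `pairwise_covariances` are all assumptions made for this example.

```python
from itertools import combinations

def allele_frequencies(alignment):
    """Fraction of sequences carrying the minor allele (coded 1) at each locus."""
    n_seqs = len(alignment)
    n_loci = len(alignment[0])
    return [sum(seq[i] for seq in alignment) / n_seqs for i in range(n_loci)]

def pairwise_covariances(alignment):
    """Covariance f_ij - f_i * f_j for every pair of loci (i < j).

    This is only the raw correlation between loci, not an inferred
    epistatic coupling.
    """
    n_seqs = len(alignment)
    freqs = allele_frequencies(alignment)
    cov = {}
    for i, j in combinations(range(len(freqs)), 2):
        # Joint frequency of the minor allele at both loci.
        f_ij = sum(seq[i] * seq[j] for seq in alignment) / n_seqs
        cov[(i, j)] = f_ij - freqs[i] * freqs[j]
    return cov

# Hypothetical toy alignment: 4 genomes, 3 polymorphic loci.
toy_alignment = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]

cov = pairwise_covariances(toy_alignment)
# Rank locus pairs by absolute covariance, strongest first.
ranked = sorted(cov.items(), key=lambda kv: abs(kv[1]), reverse=True)
for pair, value in ranked:
    print(pair, value)
```

In this toy data, loci 0 and 1 always co-vary while locus 2 is independent of both, so the pair (0, 1) ranks first. On real genomes, raw covariances of this kind are confounded by shared ancestry and indirect effects, which is why the study relies on direct coupling analysis rather than on correlations alone.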

Keywords: SARS-CoV-2; direct coupling analysis; epistasis; recombination.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 / pathology
  • Coronavirus Nucleocapsid Proteins / genetics
  • Coronavirus RNA-Dependent RNA Polymerase / genetics
  • Epistasis, Genetic / genetics*
  • Exoribonucleases / genetics
  • Genes, Viral / genetics*
  • Genome, Viral / genetics
  • Humans
  • Methyltransferases / genetics
  • RNA Helicases / genetics
  • SARS-CoV-2 / genetics*
  • Selection, Genetic / genetics
  • Viral Nonstructural Proteins / genetics
  • Viral Proteins / genetics
  • Viroporin Proteins / genetics

Substances

  • Coronavirus Nucleocapsid Proteins
  • NSP4 protein, SARS-CoV-2
  • NSP6 protein, SARS-CoV-2
  • ORF3a protein, SARS-CoV-2
  • ORF8 protein, SARS-CoV-2
  • Viral Nonstructural Proteins
  • Viral Proteins
  • Viroporin Proteins
  • nsp2 protein, SARS-CoV-2
  • Methyltransferases
  • Nsp13 protein, SARS-CoV
  • nsp14 protein, SARS coronavirus
  • Coronavirus RNA-Dependent RNA Polymerase
  • NSP12 protein, SARS-CoV-2
  • Exoribonucleases
  • RNA Helicases