Heptad stereotypy, S/Q layering, and remote origin of the SARS-CoV-2 fusion core

Virus Evol. 2021 Dec 15;7(2):veab097. doi: 10.1093/ve/veab097. eCollection 2022 Jan.

Abstract

The fusion of the SARS-CoV-2 virus with cells, a key event in the pathogenesis of Covid-19, depends on the assembly of a six-helix fusion core (FC) formed by portions of the spike protein heptad repeats (HRs) 1 and 2. Despite the FC's critical role in regulating infectivity, its distinctive features, origin, and evolution are poorly understood. We therefore undertook a structure-guided positional and compositional analysis of the SARS-CoV-2 FC, in comparison with the FCs of related viruses, and traced its origin and ongoing evolution. We found that clustered amino acid substitutions within HR1, which distinguish SARS-CoV-2 from SARS-CoV-1, enhance local heptad stereotypy and sharply increase the FC serine-to-glutamine (S/Q) ratio, producing a neat alternating layering of S-rich and Q-rich subdomains along the post-fusion structure. Strikingly, SARS-CoV-2 ranks among the viruses with the highest FC S/Q ratio, together with highly syncytiogenic respiratory pathogens (RSV, NDV), whereas MERS-CoV, HIV, and Ebola viruses display low ratios; this feature is reflected in S/Q segregation and H-bonding patterns. Our evolutionary analyses revealed that the SARS-CoV-2 FC occurs in other SARS-CoV-1-like Sarbecoviruses identified since 2005 in Hong Kong and adjacent regions, tracing its origin to >50 years ago with a recombination-driven spread. Finally, current mutational trends show that the FC is varying, especially in the FC1 evolutionary hotspot. These findings establish a novel analytical framework illuminating the sequence/structure evolution of the SARS-CoV-2 FC, tracing its long history within Sarbecoviruses, and may help rationalize the evolution of the fusion machinery in emerging pathogens and the design of novel therapeutic fusion inhibitors.

Keywords: Covid-19; SARS-CoV-2 origin; SARS-CoV-2 spike protein; coiled-coil 6-helix bundle fusion core; heptad stereotypy; serine-rich.
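
Illustrative note: the S/Q ratio and the S-rich/Q-rich layering described in the abstract can be sketched in a few lines of Python. This is a minimal sketch and not the authors' pipeline. It assumes the ratio is a plain count of serine (S) over glutamine (Q) residues in a one-letter amino acid string, and it approximates "layering" by counting S and Q in consecutive, non-overlapping 7-residue windows. The function names `sq_ratio` and `heptad_window_counts` and the sequence `TOY_SEGMENT` are hypothetical; the toy string is invented for illustration and is not a real spike sequence.

```python
from collections import Counter
from typing import List, Tuple


def sq_ratio(segment: str) -> float:
    """Return (number of S) / (number of Q) for a one-letter protein string.

    Assumption: the ratio is a simple residue-count ratio.
    Returns inf if there are S but no Q, and nan if there are neither.
    """
    counts = Counter(segment.upper())
    s, q = counts["S"], counts["Q"]
    if q == 0:
        return float("inf") if s else float("nan")
    return s / q


def heptad_window_counts(segment: str, size: int = 7) -> List[Tuple[int, int]]:
    """Count (S, Q) in consecutive non-overlapping windows of `size` residues.

    A trailing window shorter than `size` is still reported.
    """
    seq = segment.upper()
    windows = [seq[i:i + size] for i in range(0, len(seq), size)]
    return [(w.count("S"), w.count("Q")) for w in windows]


# Hypothetical toy segment: one S-rich heptad followed by one Q-rich heptad.
TOY_SEGMENT = "SSASSLSQAQQLQQ"

if __name__ == "__main__":
    print(sq_ratio(TOY_SEGMENT))              # overall S/Q ratio
    print(heptad_window_counts(TOY_SEGMENT))  # per-heptad (S, Q) counts
```

On the toy segment, the overall ratio is balanced while the per-window counts separate cleanly into an S-rich and a Q-rich block, which is the kind of segregation the abstract refers to. A structure-guided analysis would instead assign residues to heptad positions and layers from the solved post-fusion structure.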