SARS-CoV-2 host prediction based on virus-host genetic features

Sci Rep. 2022 Mar 17;12(1):4576. doi: 10.1038/s41598-022-08350-6.

Abstract

The genetic diversity of the Coronaviruses gives them different biological abilities, such as infect different cells and/or organisms, a wide spectrum of clinical manifestations, their different routes of dispersion, and viral transmission in a specific host. In recent decades, different Coronaviruses have emerged that are highly adapted for humans and causing serious diseases, leaving their host of unknown origin. The viral genome information is particularly important to enable the recognition of patterns linked to their biological characteristics, such as the specificity in the host-parasite relationship. Here, based on a previously computational tool, the Seq2Hosts, we developed a novel approach which uses new variables obtained from the frequency of spike-Coronaviruses codons, the Relative Synonymous Codon Usage (RSCU) to shed new light on the molecular mechanisms involved in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) host specificity. By using the RSCU obtained from nucleotide sequences before the SARS-CoV-2 pandemic, we assessed the possibility of know the hosts capable to be infected by these new emerging species, which was first identified infecting humans during 2019 in Wuhan, China. According to the model trained and validated using sequences available before the pandemic, bats are the most likely the natural host to the SARS-CoV-2 infection, as previously suggested in other studies that searched for the host viral origin.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • COVID-19* / genetics
  • Chiroptera*
  • DNA Viruses
  • Genome, Viral
  • Humans
  • SARS-CoV-2 / genetics