Identifying New COVID-19 Variants from Spike Proteins Using Novelty Detection

Stud Health Technol Inform. 2022 Jun 6;290:694-698. doi: 10.3233/SHTI220167.

Abstract

The COVID-19 pandemic has caused millions of infections and deaths worldwide. Over time, several variants of the virus have emerged. Machine learning methods have been useful in understanding the virus and its implications. In this paper, we study a set of novelty detection algorithms and apply them to the problem of detecting COVID-19 variants. Our results show accuracies of 79.64% and 82.43% on the B.1.1.7 and B.1.351 variants, respectively, using a One-Class SVM with fine-tuned parameters on ProtVec embeddings of unaligned COVID-19 spike protein sequences. We believe that a system for automated and timely detection of variants will help countries formulate mitigation measures and study remedies, such as medicines and vaccines, that can protect against new variants.
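The novelty detection setup described above can be illustrated with a small, standard-library-only sketch. It is not the authors' pipeline: ProtVec embeddings are replaced here by normalized 2-mer frequency vectors, the One-Class SVM is replaced by a much simpler centroid-distance detector, and the sequences are synthetic toy strings rather than real spike proteins. The class name `CentroidNoveltyDetector`, the function `kmer_vector`, and the `quantile` parameter are all hypothetical names chosen for illustration. What the sketch shares with the paper's framing is the overall shape: fit on known sequences only, then flag anything that falls outside the learned region as a candidate new variant.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = 2  # toy k-mer length (assumption); stands in for ProtVec features
KMERS = ["".join(p) for p in product(AMINO_ACIDS, repeat=K)]


def kmer_vector(seq):
    """Map an unaligned sequence to a normalized k-mer frequency vector."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = max(sum(counts.values()), 1)
    return [counts[kmer] / total for kmer in KMERS]


class CentroidNoveltyDetector:
    """Simple stand-in for a One-Class SVM: threshold on distance to centroid."""

    def __init__(self, quantile=0.95):
        self.quantile = quantile

    def fit(self, vectors):
        n = len(vectors)
        # Centroid of the known (training) sequences.
        self.centroid = [sum(col) / n for col in zip(*vectors)]
        dists = sorted(math.dist(v, self.centroid) for v in vectors)
        # Distance threshold taken at the chosen quantile of training distances.
        self.threshold = dists[min(n - 1, int(self.quantile * n))]
        return self

    def predict(self, vectors):
        # +1 = resembles the training data, -1 = novel.
        return [
            1 if math.dist(v, self.centroid) <= self.threshold else -1
            for v in vectors
        ]


# Synthetic toy sequences (assumption): NOT real spike protein data.
known_sequences = ["ACDEACDEACDE", "ACDEACDEACDF", "ACDEACDGACDE", "ACDEACDEACDE"]
candidate_sequence = "WYWYWYWYWYWY"

known_vectors = [kmer_vector(s) for s in known_sequences]
detector = CentroidNoveltyDetector().fit(known_vectors)

print(detector.predict(known_vectors))
print(detector.predict([kmer_vector(candidate_sequence)]))
```

The +1/-1 labeling mirrors the convention used by scikit-learn's `OneClassSVM.predict`, so the detector class could be swapped for that estimator with real embeddings. The accuracies reported in the abstract should not be expected from this toy setup.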

Keywords: Coronavirus [B04.820.578.500.540.150]; Machine Learning [L01.224.050.375.530]; Proteins [D12.776].

MeSH terms

  • COVID-19*
  • Humans
  • Pandemics / prevention & control
  • SARS-CoV-2*
  • Spike Glycoprotein, Coronavirus / metabolism

Substances

  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2

Supplementary concepts

  • SARS-CoV-2 variants