Mutation profile of SARS-CoV-2 genome in a sample from the first year of the pandemic in Colombia

Infect Genet Evol. 2022 Jan:97:105192. doi: 10.1016/j.meegid.2021.105192. Epub 2021 Dec 18.

Abstract

Severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) is the etiopathogenic agent of COVID-19, a condition that the World Health Organization (WHO) formally recognized as a pandemic in March 2020. The SARS-CoV-2 genome comprises 29,903 nucleotides that code for four structural proteins (N, M, S, and E) and more than 20 non-structural proteins. Mutations in any of these regions, especially in those that encode the structural proteins, have allowed the identification of diverse lineages around the world, some of them designated Variants of Concern (VOC) or Variants of Interest (VOI) by the WHO and the CDC. In this study, using Next Generation Sequencing (NGS) technology, we sequenced the SARS-CoV-2 genome in 422 samples from Colombian residents, all collected between April 2020 and January 2021. We obtained genetic information from 386 samples, which led to the identification of 14 new lineages circulating in Colombia, 13 of which were identified for the first time in South America. GH was the predominant GISAID clade in our sample. Most mutations were either missense (53.6%) or synonymous (37.4%), and most genetic changes were located in the ORF1ab gene (63.9%), followed by the S gene (12.9%). In the latter, we identified the mutations E484K, L18F, and D614G. Recent evidence suggests that these mutations confer important properties on the virus, compromising host immunity, diagnostic test performance, and the effectiveness of some vaccines. Important lineages carrying these mutations include Alpha, Beta, and Gamma (WHO labels). Further genomic surveillance is important for understanding emerging genomic variants and their correlation with disease severity.
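The missense/synonymous distinction and the substitution labels used above (for example E484K: reference amino acid, position, new amino acid) can be illustrated with a minimal sketch. It is not the pipeline used in the study; the function names `classify_codon_change` and `parse_substitution` are hypothetical, the codons in the usage lines are illustrative only, and the standard genetic code is assumed.

```python
import re

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a single-codon change by its effect on the encoded amino acid."""
    # Accept RNA or DNA spelling; the table is keyed on DNA letters.
    ref_codon = ref_codon.upper().replace("U", "T")
    alt_codon = alt_codon.upper().replace("U", "T")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, protein unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop codon introduced
    if ref_aa == "*":
        return "stop_lost"    # stop codon replaced by an amino acid
    return "missense"         # one amino acid replaced by another


def parse_substitution(label):
    """Split a label such as 'E484K' into (reference, position, alternate)."""
    match = re.fullmatch(r"([A-Z*])(\d+)([A-Z*])", label.strip())
    if match is None:
        raise ValueError(f"not a single amino-acid substitution: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


if __name__ == "__main__":
    print(parse_substitution("E484K"))
    print(classify_codon_change("GAA", "AAA"))  # Glu -> Lys
    print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
```

In this sketch a change counts as missense only when both codons encode different amino acids and neither is a stop codon; changes that create or remove a stop codon are reported separately.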

Keywords: Covid-19; Genetic variation; High-throughput nucleotide sequencing; SARS-CoV-2; SARS-CoV-2 variants; Whole genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19 / epidemiology*
  • COVID-19 / transmission
  • COVID-19 / virology
  • Colombia / epidemiology
  • Epidemiological Monitoring
  • Evolution, Molecular
  • Gene Expression
  • Genome, Viral*
  • Humans
  • Mutation*
  • Phylogeny
  • Polyproteins / genetics
  • Polyproteins / metabolism
  • SARS-CoV-2 / classification
  • SARS-CoV-2 / genetics*
  • SARS-CoV-2 / pathogenicity
  • Spike Glycoprotein, Coronavirus / genetics*
  • Spike Glycoprotein, Coronavirus / metabolism
  • Time Factors
  • Viral Proteins / genetics*
  • Viral Proteins / metabolism
  • Whole Genome Sequencing

Substances

  • ORF1ab polyprotein, SARS-CoV-2
  • Polyproteins
  • Spike Glycoprotein, Coronavirus
  • Viral Proteins
  • spike protein, SARS-CoV-2

Supplementary concepts

  • SARS-CoV-2 variants