European context of the diversity and phylogenetic position of SARS-CoV-2 sequences from Polish COVID-19 patients

J Appl Genet. 2021 May;62(2):327-337. doi: 10.1007/s13353-020-00603-2. Epub 2021 Jan 5.

Abstract

This study aimed to provide a comprehensive analysis of SARS-CoV-2 sequence diversity in Poland in the European context. All publicly available whole-genome SARS-CoV-2 sequences from Polish samples (n = 115; GISAID database), including those obtained during coronavirus testing performed in our COVID-19 Lab, were examined. Multiple sequence alignment of Polish isolates, phylogenetic analysis (ML tree), and multidimensional scaling (based on pairwise DNA distances) were complemented by a comparison of coronavirus clade frequency and diversity in a subset of over 5000 European GISAID sequences. Approximately seventy-seven percent of isolates in the European dataset carried frequent and ubiquitously found haplotypes; the remaining haplotype diversity was population-specific and resulted from population-specific mutations, homoplasies, and recombinations. Coronavirus strains circulating in Poland represented the variability found in other European countries. The prevalence of clades circulating in Poland was shifted in favor of GR, both in terms of the diversity (number of distinct haplotypes) and the frequency (number of isolates) of the clade. Polish-specific haplotypes were rare and could be explained by changes affecting common European strains. Analysis of whole viral genomes allowed the detection of several tight clusters of isolates, presumably reflecting local outbreaks. New mutations, homoplasies, and, to a lesser extent, recombinations increase SARS-CoV-2 haplotype diversity, but the majority of these variants do not increase in frequency and remain rare and population-specific. The spectrum of SARS-CoV-2 haplotypes in the Polish dataset reflects many independent transfers from a variety of sources, followed by many local outbreaks. The prevalence of sequences belonging to the GR clade among Polish isolates is consistent with the European trend of increasing GR clade frequency.
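The two clade measures named above (diversity as the number of distinct haplotypes, frequency as the number of isolates) and the pairwise DNA distances can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which distance measure was used, so the simple p-distance (proportion of differing sites) is an assumption, and the isolate names, clade labels, sequences, and function names are all hypothetical toy values. A distance table of this kind is the usual input to multidimensional scaling, which is not shown here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy alignment (not real GISAID data): isolate -> (clade, aligned sequence)
ISOLATES = {
    "iso1": ("GR", "ACGTACGT"),
    "iso2": ("GR", "ACGTACGA"),
    "iso3": ("GR", "ACGTACGA"),
    "iso4": ("G", "ACGAACGT"),
    "iso5": ("S", "TCGTACGT"),
}


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites with a gap ('-') or ambiguous base ('N') in either
    sequence are skipped rather than counted as differences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def pairwise_distances(isolates):
    """Return {(name_a, name_b): p-distance} for every unordered pair of isolates."""
    names = sorted(isolates)
    return {
        (m, n): p_distance(isolates[m][1], isolates[n][1])
        for m, n in combinations(names, 2)
    }


def clade_summary(isolates):
    """Per clade: frequency (isolate count) and diversity (distinct haplotype count)."""
    seqs_by_clade = defaultdict(list)
    for clade, seq in isolates.values():
        seqs_by_clade[clade].append(seq)
    return {
        clade: {"isolates": len(seqs), "haplotypes": len(set(seqs))}
        for clade, seqs in seqs_by_clade.items()
    }


distances = pairwise_distances(ISOLATES)
summary = clade_summary(ISOLATES)
```

In this toy data the GR clade has three isolates but only two distinct haplotypes, which shows why the two measures are reported separately.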

Keywords: Coronavirus; Epidemiology; Haplotypes; Phylogenetics; Population; Whole RNA genome sequencing.

MeSH terms

  • Genetic Variation*
  • Genome, Viral*
  • Haplotypes
  • Humans
  • Mutation
  • Phylogeny*
  • Poland
  • RNA, Viral / genetics
  • SARS-CoV-2 / genetics*

Substances

  • RNA, Viral