Whole genome comparison of Pakistani Corona virus with Chinese and US Strains along with its predictive severity of COVID-19

Gene Rep. 2021 Jun:23:101139. doi: 10.1016/j.genrep.2021.101139. Epub 2021 Apr 15.

Abstract

The 784 SARS-nCoV2 whole-genome sequences initially submitted to the NCBI Virus database were selected for phylogenetic analysis to assess their similarity to two sequenced Pakistani coronavirus strains, accessions MT240479 and MT262993. MT240479 (Gilgit1-Pak) clustered closest to MT184913 (CruiseA-USA), while MT262993 (Manga-Pak) neighbored MT039887 (WI-USA). These strains were then chosen for variant calling analysis, with the reference genome NC_045512 as the out-group, to construct the final cladogram and to estimate evolutionary distances with PAUP software. The two Pakistani strains, each 29,836 bases long, were compared with MT263429 (WI-USA; 29,889 bases) and MT259229 (Wuhan-P.R. China; 29,864 bases). The whole-genome variant calling pipeline revealed 31 variants across the two Pakistani strains collectively: Manga-Pak differed from the USA strain by 2 deletions and 7 SNPs and from the Chinese strain by 2 deletions and 2 SNPs, while Gilgit1-Pak differed from the USA strain by 10 SNPs and from the Chinese strain by 8 SNPs. These variants fall in the ORF1ab, ORF1a and N genes, which play roles in viral replication/translation, host innate immunity and viral capsid formation, respectively. These novel variants may be one of the reasons for the low mortality in Pakistan, with 385 deaths compared with 63,871 in the USA and 4633 in P.R. China by May 01, 2020. However, the functional effects of these variants and their interactions with other viral proteins, as well as variability in the human receptors (ACE2 and NRP1), may also contribute to the atypical COVID-19 statistics in Pakistan and need further confirmatory studies. Moreover, the mutated N and ORF1a proteins of the Pakistani strains were analyzed by 3D structure modeling, which adds another dimension by comparing these alterations at the amino acid level.
In summary, these novel variants are correlated with reduced COVID-19 mortality in Pakistan, although more robust results require wet-lab experimentation. The findings also give insight into the genomic landscape of these indigenous strains, which can inform the development of diagnostic kits, vaccines and therapeutic interventions.
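
The per-strain SNP and deletion counts reported above come from a pairwise comparison of genomes. As a minimal illustration of that kind of tally, and not the authors' actual pipeline (which, judging by the keywords, worked on SAM/BAM alignments), the sketch below classifies differences between two already-aligned sequences. The function name `classify_variants`, the gap character `-`, and the toy sequences are assumptions made for this example only.

```python
def classify_variants(ref_aln, query_aln):
    """Tally SNPs and deletions between two aligned sequences.

    Both inputs must be the same length, with '-' marking alignment gaps.
    Returns (snps, deletions), where
      snps      = list of (ref_position, ref_base, query_base)
      deletions = list of (ref_start_position, length)
    Positions are 1-based coordinates on the reference.
    Insertions relative to the reference are ignored in this sketch.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")

    snps, deletions = [], []
    ref_pos = 0        # 1-based position on the ungapped reference
    run_start = None   # start of the deletion run currently open, if any
    run_len = 0

    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r != "-" and q == "-":
            # Reference base missing in the query: extend or open a deletion.
            if run_start is None:
                run_start, run_len = ref_pos, 0
            run_len += 1
            continue
        if run_start is not None:
            # Any non-deletion column closes the open deletion run.
            deletions.append((run_start, run_len))
            run_start = None
        if r != "-" and q != "-" and r != q:
            snps.append((ref_pos, r, q))

    if run_start is not None:
        deletions.append((run_start, run_len))
    return snps, deletions


# Toy sequences (made up for illustration, not real SARS-CoV-2 data).
snps, deletions = classify_variants("ATGCCGTA", "ATGT--TA")
print(len(snps), "SNP(s);", len(deletions), "deletion(s)")
```

Consecutive gap columns are merged into a single deletion event, which matches how a count such as "2del" is usually read; a real pipeline would additionally filter calls by read depth and base quality.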

Keywords: 3D structural modeling; BAM, binary alignment maps; DNA, deoxyribonucleic acid; NCBI, National Center for Biotechnology Information; Pakistani SARS-nCoV2; Phylogenetic analysis; SAM, sequence alignment maps; SARS-nCoV, severe acute respiratory syndrome novel coronavirus; Variant calling pipeline.