Microsatellite signature analysis of twenty-one virophage genomes of the family Lavidaviridae

Gene. 2023 Jan 30:851:147037. doi: 10.1016/j.gene.2022.147037. Epub 2022 Nov 8.

Abstract

Microsatellites, or Simple Sequence Repeats (SSRs), are short repeated sequence motifs that constitute the most hypervariable regions of genomes. The present study extracts and analyzes the SSRs from the genomes of 21 virophages. Genomic sequences were retrieved from NCBI, and the microsatellite data were extracted with the MISA web server. Phylogenetic analysis was performed using MAFFT and MEGAX following standardized protocols. The virophages have a circular or linear dsDNA genome of ~17-30 kb. The GC content of the genomes ranged from 26.8% (PSAV13) to 51.1% (PSAV12). A total of 3664 SSRs and 488 compound SSRs (cSSRs) were observed, with an average incidence of 174 and 23 per genome, respectively. The total SSR incidence in a genome ranged from 120 (PSAV19) to 264 (PSAV14). The cSSR incidence ranged from 8 (PSAV12) to 47 (PSAV14). Overall, mono-nucleotide repeats were the most frequent microsatellites (1129 SSRs), followed by di-nucleotide (1036 SSRs) and tri-nucleotide repeats (368 SSRs). However, this order does not hold for every individual genome. There are 14, 16 and 17 genomes with no tetra-, penta- and hexa-nucleotide repeats, respectively. Among the mono-nucleotide repeats, 'A' repeats had the highest representation (average ~33 per genome). Among the di-nucleotide repeats, the AT/TA motif had the highest frequency (average ~30), distantly followed by AG/GA and CT/TC (averages of 5.6 and 5.5, respectively). A total of 1946 SSRs (76%) were found in the coding region. All genomes had a higher SSR density in the non-coding region than in the coding region. Fifteen genomes have at least one gene with no SSR. A total of 41 cSSRs were present in at least two virophages. Twelve cSSRs occurred more than once within the same genome.
The heat map of the genomes corroborates the phylogenetic tree, with similar sequences (PSAV2, PSAV5, PSAV6, PSAV17 and PSAV18) positioned together in the phylogenetic analysis, while also highlighting the diversity of the studied sequences. The conservation of cSSRs across multiple virophages highlights their potential as biomarkers.
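
The quantities reported above (GC content, SSR incidence, cSSR incidence) can be illustrated with a minimal Python sketch. It is not the MISA implementation and not the authors' pipeline. The minimum repeat counts in `MIN_REPEATS`, the maximum gap `dmax` used to merge neighbouring SSRs into a compound SSR, and all function names are hypothetical choices made for illustration, since the abstract does not state the parameters used.

```python
# Hypothetical minimum repeat counts per motif length (mono- to hexa-nucleotide).
# These are illustrative assumptions, not the thresholds used in the study.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def gc_percent(seq):
    """Return the percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    k = len(motif)
    return any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        i = 0
        while i + k * n <= len(seq):
            motif = seq[i:i + k]
            if set(motif) <= set("ACGT") and not is_redundant(motif):
                j = i + k
                while seq[j:j + k] == motif:  # extend while the motif repeats
                    j += k
                reps = (j - i) // k
                if reps >= n:
                    hits.append((i, j, motif, reps))
                    i = j  # continue scanning after this SSR
                    continue
            i += 1
    return sorted(hits)


def find_compound_ssrs(hits, dmax=10):
    """Group SSRs separated by at most dmax bases (dmax is an assumed value)."""
    groups, current = [], []
    for hit in sorted(hits):
        if current and hit[0] - max(h[1] for h in current) <= dmax:
            current.append(hit)
        else:
            if len(current) > 1:
                groups.append(current)
            current = [hit]
    if len(current) > 1:
        groups.append(current)
    return groups


if __name__ == "__main__":
    toy = "GC" + "A" * 7 + "CG" + "AT" * 4 + "GC"  # toy sequence, not real data
    ssrs = find_ssrs(toy)
    print(round(gc_percent(toy), 2), ssrs, len(find_compound_ssrs(ssrs)))
```

On the toy sequence, the sketch reports one mono-nucleotide and one di-nucleotide repeat, which merge into a single compound SSR because only two bases separate them. The per-genome averages in the abstract follow from the totals in the same way: 3664 / 21 ≈ 174 SSRs and 488 / 21 ≈ 23 cSSRs per genome.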

Keywords: Distribution; Incidence; Microsatellites; Satellite virus; Virophages.

MeSH terms

  • Genome, Viral
  • Microsatellite Repeats / genetics
  • Phylogeny
  • Virophages* / genetics
  • Viruses* / genetics