Recurrent Potential G-Quadruplex Sequences in Archaeal Genomes

Front Microbiol. 2021 Mar 24:12:647851. doi: 10.3389/fmicb.2021.647851. eCollection 2021.

Abstract

Evolutionary conservation or over-representation of potential G-quadruplex sequences (PQSs) in genomes is usually considered a sign of the functional relevance of these sequences. However, uneven base distribution (GC content) along the genome may result in an apparent abundance of PQSs above the genome average. In addition, a number of other conserved functional signals encoded in GC-rich genomic regions may inadvertently give rise to G-quadruplex-compatible sequences. Here, we analyze the genomes of archaea, focusing our search on repetitive PQS (rPQS) motifs within each organism. The probability that several identical PQSs occur within a relatively short archaeal genome is low; thus, the structure and genomic location of such rPQSs may directly indicate their functionality. We found that the majority of genomes in the Methanomicrobiaceae family of archaea contain multiple copies of interspersed, highly similar PQSs. Short oligonucleotides corresponding to the rPQS formed a G-quadruplex (G4) structure in the presence of potassium ions, as demonstrated by circular dichroism (CD) and enzymatic probing. However, further analysis of the genomic context of the rPQS revealed a 10-12 nt cytosine-rich tract adjacent to the 3'-end of each rPQS. Synthetic DNA fragments that included the C-rich tract tended to fold into alternative structures, such as a hairpin and an antiparallel triplex, that were in equilibrium with the G4 structure depending on the presence of potassium ions in solution. The structural properties of the repetitive sequences found, their location in archaeal genomes, and their possible functions are discussed.

Keywords: DNA; G-quadruplex; Methanomicrobiaceae; archaea; circular dichroism; nuclease probing.
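
The search for repetitive PQSs described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the search criteria, so everything here is an assumption: the pattern is the widely used canonical motif (four runs of at least three guanines separated by loops of 1-7 nt), the function names `find_pqs` and `repeated_pqs` are hypothetical, and only exact duplicates are counted, whereas the study reports highly similar rather than strictly identical copies.

```python
import re
from collections import Counter

# Assumed canonical PQS motif: four G-runs (>= 3 G) separated by 1-7 nt loops.
# This is a common convention, not necessarily the criterion used in the paper.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pqs(seq):
    """Return (strand, start, motif) for non-overlapping PQS hits on both strands.

    For the '-' strand, `start` is an offset into the reverse-complemented
    sequence, not a forward-strand coordinate.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in PQS_PATTERN.finditer(s):
            hits.append((strand, match.start(), match.group()))
    return hits


def repeated_pqs(seq, min_copies=2):
    """Return {motif: count} for PQS strings found at least `min_copies` times."""
    counts = Counter(motif for _, _, motif in find_pqs(seq))
    return {motif: n for motif, n in counts.items() if n >= min_copies}
```

Calling `repeated_pqs(genome_sequence)` on a genome string returns only those PQS strings that recur, which is the kind of candidate list an rPQS analysis would start from. Grouping near-identical motifs would need an additional similarity step that this sketch does not attempt.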