Genome-wide mapping of N4-methylcytosine at single-base resolution by APOBEC3A-mediated deamination sequencing

Chem Sci. 2022 Aug 11;13(34):9960-9972. doi: 10.1039/d2sc02446b. eCollection 2022 Aug 31.

Abstract

N4-methylcytosine (4mC) is a natural DNA modification that occurs in thermophiles and plays important roles in restriction-modification (R-M) systems in bacterial genomes. However, knowledge of the precise locations and sequence contexts of 4mC across whole genomes remains limited. In this study, we developed an APOBEC3A-mediated deamination sequencing (4mC-AMD-seq) method for genome-wide mapping of 4mC at single-base resolution. In 4mC-AMD-seq, cytosine and 5-methylcytosine (5mC) are deaminated by the APOBEC3A (A3A) protein to generate uracil and thymine, respectively, both of which are read as thymine in sequencing, whereas 4mC resists deamination and is therefore read as cytosine. Thus, cytosines read out in sequencing mark the original 4mC sites in the genome. With 4mC-AMD-seq, we achieved genome-wide mapping of 4mC in Deinococcus radiodurans (D. radiodurans). In addition, we confirmed that 4mC, but not 5mC, is the major modification in the D. radiodurans genome. We identified 1586 4mC sites in the D. radiodurans genome, of which 564 were located in the CCGCGG motif. The average methylation levels in the CCGCGG motif and in non-CCGCGG sequences were 70.0% and 22.8%, respectively. We envision that the 4mC-AMD-seq method will facilitate the investigation of 4mC functions, including 4mC-involved R-M systems, in uncharacterized but potentially useful strains.
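
The readout logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. The function names, the pileup layout (position mapped to counts of C and T reads at a reference cytosine), and the `min_depth` and `min_level` cutoffs are hypothetical choices made only for illustration. The sketch assumes that the methylation level at a site is the fraction of reads that still show cytosine after A3A treatment.

```python
def a3a_readout(bases):
    """Return the base read in sequencing after A3A treatment.

    'C' (unmodified cytosine) is deaminated to uracil and '5mC' to thymine,
    so both are read as T; '4mC' resists deamination and is read as C.
    Any other base is passed through unchanged.
    """
    convert = {"C": "T", "5mC": "T", "4mC": "C"}
    return [convert.get(base, base) for base in bases]


def methylation_level(c_reads, t_reads):
    """Fraction of reads retaining C at a reference cytosine position."""
    total = c_reads + t_reads
    if total == 0:
        return None  # no coverage, level undefined
    return c_reads / total


def call_4mc_sites(pileup, min_depth=10, min_level=0.1):
    """Keep positions whose depth and C fraction pass the cutoffs.

    pileup: dict mapping position -> (c_reads, t_reads).
    min_depth and min_level are illustrative assumptions, not values
    taken from the study.
    """
    sites = {}
    for pos, (c_reads, t_reads) in pileup.items():
        if c_reads + t_reads < min_depth:
            continue  # too few reads to estimate a level
        level = methylation_level(c_reads, t_reads)
        if level >= min_level:
            sites[pos] = level
    return sites


if __name__ == "__main__":
    print(a3a_readout(["C", "5mC", "4mC", "A"]))
    toy_pileup = {100: (14, 6), 200: (1, 19), 300: (2, 1)}
    print(call_4mc_sites(toy_pileup))
```

In the toy pileup, only position 100 passes both cutoffs. Position 200 has too few C reads, and position 300 has too little coverage. A real analysis would also need to account for incomplete deamination of unmodified cytosine, which this sketch ignores.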