A new approach to decode DNA methylome and genomic variants simultaneously from double strand bisulfite sequencing

Brief Bioinform. 2021 Nov 5;22(6):bbab201. doi: 10.1093/bib/bbab201.

Abstract

Genetic and epigenetic contributions to various diseases and biological processes are well recognized. However, simultaneously identifying single-nucleotide variants (SNVs) and DNA methylation levels from traditional bisulfite sequencing data remains challenging. Here, we develop double strand bisulfite sequencing (DSBS) for accurate, genome-wide identification of SNVs and DNA methylation at single-base resolution from a single dataset. By locking the Watson and Crick strands together with a hairpin adapter, followed by bisulfite treatment and massively parallel sequencing, DSBS sequences the bisulfite-converted Watson and Crick strands in one paired-end read, eliminating the strand bias of bisulfite sequencing data. Mutual correction of read1 and read2 allows amplification and sequencing errors to be estimated and enables our computational pipeline, DSBS Analyzer (https://github.com/tianguolangzi/DSBS), to accurately identify SNVs and DNA methylation. Additionally, using DSBS, we provide a genome-wide hemimethylation landscape in human cells and reveal that the density of DNA hemimethylation sites in promoter regions and CpG islands is lower than in other genomic regions. This cost-effective approach, which decodes the DNA methylome and genomic variants simultaneously, will facilitate more comprehensive studies of the many diseases and biological processes driven by both genetic and epigenetic variation.
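
The logic behind mutual correction of the two strands can be illustrated with a minimal sketch. It assumes standard bisulfite chemistry (an unmethylated C reads as T, a methylated C stays C) and that both reads are already aligned and expressed in Watson-strand coordinates. The lookup table and function names are hypothetical and are not taken from DSBS Analyzer.

```python
# Illustrative sketch only; NOT the DSBS Analyzer implementation.
# Assumption: both observed bases are given in Watson-strand coordinates,
# i.e. the Crick-strand read has already been complemented onto Watson.

# (Watson observation, Crick observation) -> (inferred Watson base, status)
# For a G call, the status refers to the cytosine on the Crick strand.
DECODE = {
    ("A", "A"): ("A", None),
    ("T", "T"): ("T", None),
    ("C", "C"): ("C", "methylated"),    # Watson C protected from conversion
    ("T", "C"): ("C", "unmethylated"),  # Watson C converted, read as T
    ("G", "G"): ("G", "methylated"),    # Crick C protected from conversion
    ("G", "A"): ("G", "unmethylated"),  # Crick C converted, seen as A
}


def decode_pair(watson_obs, crick_obs):
    """Return (base, methylation status), or None if the two strands
    disagree in a way bisulfite conversion cannot explain (a likely
    amplification or sequencing error)."""
    return DECODE.get((watson_obs.upper(), crick_obs.upper()))


def decode_reads(watson_read, crick_read):
    """Decode two equal-length, aligned reads position by position."""
    if len(watson_read) != len(crick_read):
        raise ValueError("reads must have the same length")
    return [decode_pair(w, c) for w, c in zip(watson_read, crick_read)]


def classify_cpg_dyad(watson_read, crick_read, i):
    """Classify the CpG dyad whose Watson C is at index i.

    Returns 'fully methylated', 'unmethylated', 'hemimethylated',
    or None if positions i and i+1 do not decode to a CpG."""
    if i < 0 or i + 1 >= min(len(watson_read), len(crick_read)):
        return None
    first = decode_pair(watson_read[i], crick_read[i])
    second = decode_pair(watson_read[i + 1], crick_read[i + 1])
    if first is None or second is None:
        return None
    if first[0] != "C" or second[0] != "G":
        return None
    states = {first[1], second[1]}
    if states == {"methylated"}:
        return "fully methylated"
    if states == {"unmethylated"}:
        return "unmethylated"
    return "hemimethylated"
```

In this sketch, a Watson T paired with a Crick C is read as an unmethylated cytosine, whereas a Watson T paired with a Crick T is a true thymine. A single-strand bisulfite read cannot make that distinction on its own, which is why separating C-to-T variants from unmethylated cytosines is difficult in traditional bisulfite data.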

Keywords: CpG context; cytosine modification; epigenomic alteration; genomic mutation; population genomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • CpG Islands
  • DNA Methylation*
  • Epigenesis, Genetic
  • Epigenomics / methods*
  • Genetic Background
  • Genetics, Population
  • Genomics
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA*
  • Software*
  • Sulfites*
  • Whole Genome Sequencing

Substances

  • Sulfites
  • hydrogen sulfite