Strategies for analyzing bisulfite sequencing data

J Biotechnol. 2017 Nov 10:261:105-115. doi: 10.1016/j.jbiotec.2017.08.007. Epub 2017 Aug 16.

Abstract

DNA methylation is one of the main epigenetic modifications in the eukaryotic genome; it has been shown to play a role in cell-type-specific regulation of gene expression, and therefore in cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation across a genome of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing data. We introduce specialized short-read alignment techniques as well as pre- and post-alignment quality-check methods to ensure data quality. Furthermore, we discuss the analysis steps that follow alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of the regions returned by segmentation and differential methylation methods. Finally, we review software packages that implement strategies to deal efficiently with large bisulfite sequencing datasets locally, and we discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis.
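As a rough illustration of the quantity that the post-alignment and differential methylation steps work with, the sketch below computes a per-cytosine methylation level from read counts and a simple difference between two samples. It relies only on the standard bisulfite logic that unmethylated cytosines are read as T while methylated cytosines remain C. The names `CpGCounts`, `methylation_level`, `methylation_difference` and the `min_coverage` threshold of 10 are hypothetical choices for this example; they are not taken from the review or from any of the packages it covers, and a raw difference is not a substitute for the statistical tests the review compares.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CpGCounts:
    """Read counts at one cytosine (hypothetical record type for this sketch)."""
    methylated: int    # reads that still show C after bisulfite treatment
    unmethylated: int  # reads that show T (converted, unmethylated C)

    @property
    def coverage(self) -> int:
        # Total number of reads covering this cytosine.
        return self.methylated + self.unmethylated


def methylation_level(counts: CpGCounts) -> float:
    """Fraction of reads supporting methylation at this site."""
    if counts.coverage == 0:
        raise ValueError("no reads cover this site")
    return counts.methylated / counts.coverage


def methylation_difference(
    a: CpGCounts, b: CpGCounts, min_coverage: int = 10
) -> Optional[float]:
    """Return level(b) - level(a), or None if either sample is under-covered.

    min_coverage is an assumed illustrative threshold, not a recommendation.
    """
    if a.coverage < min_coverage or b.coverage < min_coverage:
        return None
    return methylation_level(b) - methylation_level(a)


if __name__ == "__main__":
    control = CpGCounts(methylated=18, unmethylated=2)
    treated = CpGCounts(methylated=5, unmethylated=15)
    print(methylation_level(control))
    print(methylation_difference(control, treated))
```

In practice a difference like this would be paired with a statistical test and multiple-testing correction, which is where the differential methylation methods discussed in the review come in.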

Keywords: Bisulfite-sequencing; Differential methylation; Galaxy; Methylation; Methylation segmentation.

Publication types

  • Review

MeSH terms

  • Computational Biology
  • DNA Methylation*
  • DNA* / analysis
  • DNA* / chemistry
  • DNA* / genetics
  • Sequence Analysis, DNA / methods*
  • Software*
  • Sulfites / chemistry*

Substances

  • Sulfites
  • DNA
  • hydrogen sulfite