EpiMOLAS: an intuitive web-based framework for genome-wide DNA methylation analysis

BMC Genomics. 2020 Apr 2;21(Suppl 3):163. doi: 10.1186/s12864-019-6404-8.

Abstract

Background: DNA methylation is a crucial epigenomic mechanism in various biological processes. Whole-genome bisulfite sequencing (WGBS) reveals methylated cytosine sites at single-nucleotide resolution. However, WGBS data analysis is usually complicated and challenging.

Results: To alleviate these difficulties, we integrated WGBS data processing and downstream analysis into a two-phase approach. First, we set up the required tools in Galaxy and developed workflows that calculate methylation levels from raw WGBS data and generate a methylation status summary, the mtable. This computational environment is packaged as the Docker container image DocMethyl, which lets users rapidly deploy an executable environment without tedious software installation or library dependency problems. Next, the mtable files are uploaded to the web server EpiMOLAS_web, where they are linked with gene annotation databases to enable rapid data retrieval and analysis.
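As an illustration of the methylation level that the first phase computes, the sketch below uses the conventional WGBS definition: at each cytosine, the number of reads supporting methylation divided by the total number of reads covering the site. It is a minimal sketch under stated assumptions, not the DocMethyl workflow itself. The function names, the tuple layout, and the example counts are hypothetical, and the actual mtable format is not described in this abstract.

```python
from collections import defaultdict


def methylation_level(methylated, unmethylated):
    """Fraction of reads supporting methylation at one cytosine.

    Returns None when no read covers the site, so that uncovered
    sites are not mistaken for unmethylated ones.
    """
    covered = methylated + unmethylated
    return methylated / covered if covered else None


def mean_level_by_context(sites):
    """Average per-site methylation level for each sequence context.

    `sites` is an iterable of (context, methylated, unmethylated)
    tuples; this layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # context -> [sum of levels, site count]
    for context, methylated, unmethylated in sites:
        level = methylation_level(methylated, unmethylated)
        if level is None:
            continue  # skip sites without coverage
        totals[context][0] += level
        totals[context][1] += 1
    return {ctx: level_sum / n for ctx, (level_sum, n) in totals.items()}


# Hypothetical read counts, not data from the study.
example_sites = [("CG", 9, 1), ("CG", 7, 3), ("CHH", 0, 10), ("CHG", 0, 0)]
summary = mean_level_by_context(example_sites)
```

Returning None for uncovered sites keeps them out of the per-context averages; a real pipeline would typically also apply a minimum coverage threshold before reporting a level.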

Conclusion: To our knowledge, the EpiMOLAS framework, consisting of DocMethyl and EpiMOLAS_web, is the first approach to combine containerization technology and a web-based system for WGBS data analysis, from raw data processing to downstream analysis. EpiMOLAS will help users handle their own WGBS data and conduct reproducible analyses of publicly available data, thereby gaining insights into the mechanisms underlying complex biological phenomena. The Galaxy Docker image DocMethyl is available at https://hub.docker.com/r/lsbnb/docmethyl/. EpiMOLAS_web is publicly accessible at http://symbiosis.iis.sinica.edu.tw/epimolas/.

Keywords: DNA methylation data analysis; Docker; Galaxy platform; WGBS pipeline.

MeSH terms

  • Computational Biology / methods*
  • CpG Islands / genetics
  • DNA Methylation / genetics*
  • Genome, Human / genetics*
  • Humans
  • Internet
  • Software
  • Whole Genome Sequencing / methods*