MethRaFo: MeDIP-seq methylation estimate using a Random Forest Regressor

Bioinformatics. 2017 Nov 1;33(21):3477-3479. doi: 10.1093/bioinformatics/btx449.

Abstract

Motivation: Genome-wide DNA methylation profiling is now routinely performed in studies of development, cancer and several other biological processes. Although whole-genome bisulfite sequencing provides high-quality methylation measurements at single-nucleotide resolution, it is relatively costly, so several studies have used alternative profiling methods. One of the most widely used low-cost alternatives is MeDIP-Seq. However, MeDIP-Seq is biased toward CpG-enriched regions, so its results must be corrected to determine accurate methylation levels.

Results: Here we present a method for correcting MeDIP-Seq results based on Random Forest regression. Applying the method to real data from several different tissues (brain, cortex, penis), we show that it achieves an almost 4-fold decrease in run time while increasing accuracy by as much as 20% over prior methods developed for this task.
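
The general idea of correcting a CpG-biased signal with Random Forest regression can be illustrated with a minimal sketch. This is not MethRaFo's code or API: the window-level framing, the two features (read count and CpG count), the simulated bias model and all variable names are assumptions made for illustration, and the data are synthetic. The only library calls used are scikit-learn's RandomForestRegressor and standard NumPy routines.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_windows = 2000

# Synthetic stand-ins (assumptions, not real data):
# CpG count per genomic window and a "true" methylation level in [0, 1],
# playing the role of a bisulfite-sequencing reference.
cpg_count = rng.integers(1, 40, size=n_windows)
true_meth = rng.uniform(0.0, 1.0, size=n_windows)

# Simulated MeDIP-like read count: it grows with methylation AND with
# CpG density, which is the bias the regression has to undo.
medip_reads = rng.poisson(true_meth * cpg_count * 2.0)

# Features: raw enrichment signal plus the covariate that drives the bias.
X = np.column_stack([medip_reads, cpg_count])
y = true_meth

# Train on the first 1500 windows, evaluate on the remaining 500.
train, test = slice(0, 1500), slice(1500, None)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], y[train])
pred = model.predict(X[test])

# Agreement between corrected estimates and the held-out reference levels.
corr = np.corrcoef(pred, y[test])[0, 1]
print(f"Pearson r on held-out windows: {corr:.2f}")
```

Because the model sees the CpG count alongside the read count, it can learn that the same read count implies a different methylation level in CpG-poor and CpG-rich windows. The features and training data used by the published method may differ from this sketch.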

Availability and implementation: MethRaFo is freely available as a Python package (with an R wrapper) at https://github.com/phoenixding/methrafo.

Contact: zivbj@cs.cmu.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • DNA Methylation*
  • Humans
  • Sequence Analysis, DNA / methods
  • Software*