Methrix: an R/Bioconductor package for systematic aggregation and analysis of bisulfite sequencing data

Bioinformatics. 2021 Apr 1;36(22-23):5524-5525. doi: 10.1093/bioinformatics/btaa1048.

Abstract

Motivation: Whole-genome bisulfite sequencing (WGBS) measures DNA methylation at base-pair resolution, resulting in large bedGraph-like coverage files. Current options for processing such files are hindered by discrepancies in file format specification, speed, and memory requirements.

Results: We developed methrix, an R package that provides a toolset for the systematic analysis of large datasets. The core functionality of the package is a comprehensive reader for bedGraph or similar tab-separated text files, which summarizes methylation calls based on annotated reference indices, infers and collapses strands, and handles uncovered reference CpG sites, while allowing a flexible input file format specification. Additional optimized functions for quality control filtering, subsetting and visualization allow user-friendly and effective processing of WGBS results. Easy integration with tools for differentially methylated region (DMR) calling and annotation further eases the analysis of genome-wide methylation data. Overall, methrix enriches established WGBS workflows by bringing together computational efficiency and versatile functionality.
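
The aggregation performed by such a reader can be pictured with a short sketch. The Python below is an illustration only and is not methrix code: the function name aggregate_calls, the tuple layouts and the toy coordinates are all assumptions made for this example. It assumes 1-based positions, a reference list holding the plus-strand cytosine of each CpG, and minus-strand calls reported one base downstream of that cytosine.

```python
from collections import defaultdict


def aggregate_calls(ref_cpgs, calls):
    """Summarize methylation calls onto a reference CpG index (illustrative).

    ref_cpgs: list of (chrom, pos), pos = 1-based plus-strand C of each CpG.
    calls:    iterable of (chrom, pos, n_methylated, n_unmethylated).
    Returns one (chrom, pos, beta, coverage) tuple per reference CpG.
    """
    ref_set = set(ref_cpgs)
    counts = defaultdict(lambda: [0, 0])

    for chrom, pos, n_meth, n_unmeth in calls:
        if (chrom, pos) in ref_set:
            key = (chrom, pos)          # call sits on the plus-strand C
        elif (chrom, pos - 1) in ref_set:
            key = (chrom, pos - 1)      # inferred minus-strand call, collapsed
        else:
            continue                    # not a reference CpG, ignored
        counts[key][0] += n_meth
        counts[key][1] += n_unmeth

    summary = []
    for chrom, pos in ref_cpgs:
        n_meth, n_unmeth = counts.get((chrom, pos), (0, 0))
        coverage = n_meth + n_unmeth
        # Uncovered reference sites are kept, with a missing beta value.
        beta = n_meth / coverage if coverage else None
        summary.append((chrom, pos, beta, coverage))
    return summary


# Toy input: hypothetical coordinates and counts, not real data.
ref = [("chr1", 10), ("chr1", 20), ("chr1", 30)]
calls = [
    ("chr1", 10, 3, 1),   # plus-strand call at the first CpG
    ("chr1", 11, 1, 3),   # minus-strand call at the same CpG
    ("chr1", 30, 0, 5),   # third CpG, fully unmethylated
    ("chr1", 50, 2, 2),   # outside the reference index
]
result = aggregate_calls(ref, calls)
```

In this sketch the two calls at positions 10 and 11 are merged into one CpG record, the uncovered site at position 20 is retained with zero coverage, and the call at position 50 is dropped because it matches no reference CpG.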

Availability and implementation: Methrix is implemented as an R package, is available under the MIT license at https://github.com/CompEpigen/methrix, and can be installed from the Bioconductor repository.

Supplementary information: Supplementary data are available at Bioinformatics online.