Complete deconvolution of DNA methylation signals from complex tissues: a geometric approach

Bioinformatics. 2021 May 23;37(8):1052-1059. doi: 10.1093/bioinformatics/btaa930.

Abstract

Motivation: It is common practice in epigenetics research to profile DNA methylation in tissue samples, which are usually mixtures of different cell types. To properly account for the mixture, estimating cell compositions has been recognized as an important first step. Many methods have been developed for quantifying cell compositions from DNA methylation data, but their applicability is mostly limited by the lack of reference or prior information.
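To make the notion of a mixture concrete, the sketch below simulates bulk methylation under a linear mixing model, a common assumption in the deconvolution literature and not something stated in this abstract. All names and sizes (W, H, Y, n_cpgs and so on) are hypothetical. It then recovers the proportions by ordinary least squares in the easy case where the cell-type profiles are known and the data are noise-free. This is not the Tsisal algorithm, which addresses the harder setting in which no such reference is available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_cpgs, n_cell_types, n_samples = 200, 3, 10

# W: cell-type-specific methylation profiles (beta values in [0, 1]),
# one column per cell type. Assumed known here; unknown in the
# reference-free setting the abstract describes.
W = rng.uniform(0.0, 1.0, size=(n_cpgs, n_cell_types))

# H: cell-type proportions, one column per sample, each summing to 1.
H = rng.dirichlet(np.ones(n_cell_types), size=n_samples).T

# Assumed linear mixing model: each bulk sample is a convex
# combination of the cell-type profiles (no noise added).
Y = W @ H

# With W known and no noise, least squares recovers the proportions.
H_hat, *_ = np.linalg.lstsq(W, Y, rcond=None)

print(np.allclose(H_hat, H))
```

With real data, measurement noise and an unknown W make the problem considerably harder, and the recovered proportions would also need to be constrained to be non-negative and to sum to one.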

Results: We develop Tsisal, a novel complete deconvolution method that accurately estimates cell compositions from DNA methylation data without any prior knowledge of cell types or their proportions. Tsisal is a full pipeline that estimates the number of cell types, estimates cell compositions and identifies cell-type-specific CpG sites. It can also assign cell type labels when a full or partial reference panel is available. Extensive simulation studies and analyses of seven real datasets demonstrate the favorable performance of the proposed method compared with existing deconvolution methods serving a similar purpose.

Availability and implementation: The proposed method Tsisal is implemented as part of the R/Bioconductor package TOAST at https://bioconductor.org/packages/TOAST.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • DNA Methylation*
  • Epigenomics
  • Research Design
  • Software*