peaksat: an R package for ChIP-seq peak saturation analysis

BMC Genomics. 2023 Jan 25;24(1):43. doi: 10.1186/s12864-023-09109-7.

Abstract

Background: Epigenomic profiling assays such as ChIP-seq have been widely used to map the genome-wide enrichment profiles of chromatin-associated proteins and posttranslational histone modifications. Sequencing depth is a key parameter in experimental design and quality control. However, due to variable sequencing depth requirements across experimental conditions, it can be challenging to determine optimal sequencing depth, particularly for projects involving multiple targets or cell types.

Results: We developed the peaksat R package to provide target read depth estimates for epigenomic experiments based on the analysis of peak saturation curves. We applied peaksat to establish the distinctive read depth requirements for ChIP-seq studies of histone modifications in different cell lines. Using peaksat, we were able to estimate the target read depth required per library to obtain high-quality peak calls for downstream analysis. In addition, peaksat was applied to other sequence-enrichment methods including CUT&RUN and ATAC-seq.

Conclusion: peaksat helps researchers make informed decisions about whether their libraries have been sequenced to an adequate depth, and therefore yield a sufficient number of meaningful peaks, and, failing that, how many more reads would be required per library. peaksat is also applicable to other sequencing-based methods whose analysis includes peak calling.
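
The idea of estimating a target read depth from a peak saturation curve can be illustrated with a minimal Python sketch. This is not the peaksat implementation or its API: the function names, the straight-line extrapolation over the last few subsampled points, and the example numbers are all assumptions made for illustration. The sketch assumes peaks have already been called on several subsamples of one library, giving paired values of read depth and peak count.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_target_depth(depths, peak_counts, target_peaks, n_tail=3):
    """Estimate the read depth at which `target_peaks` would be reached.

    depths       -- subsampled read depths, ascending (e.g. millions of reads)
    peak_counts  -- number of peaks called at each depth
    target_peaks -- hypothetical peak-count goal chosen by the analyst
    n_tail       -- number of deepest points used for the linear fit

    Returns the first observed depth that already meets the target, an
    extrapolated depth otherwise, or None if the tail of the curve is flat
    (more reads are not expected to add peaks).
    """
    if len(depths) != len(peak_counts) or len(depths) < 2:
        raise ValueError("need at least two (depth, peak count) pairs")

    # Target already met within the observed range.
    for depth, peaks in zip(depths, peak_counts):
        if peaks >= target_peaks:
            return float(depth)

    # Assumption: extend a straight line fitted to the deepest points.
    slope, intercept = fit_line(depths[-n_tail:], peak_counts[-n_tail:])
    if slope <= 0:
        return None
    return (target_peaks - intercept) / slope


# Made-up saturation curve: depth in millions of reads vs. peaks called.
depths = [5, 10, 15, 20, 25]
peak_counts = [8000, 14000, 18000, 20000, 22000]
estimate = estimate_target_depth(depths, peak_counts, target_peaks=26000)
print(f"Estimated target depth: {estimate:.1f} million reads")
```

Because saturation curves usually flatten as depth increases, a straight-line extrapolation of this kind tends to underestimate the reads required, so the result is better read as a lower bound than as a precise target.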

Keywords: ChIP-Seq; Peak saturation; Read depth estimate.

MeSH terms

  • Chromatin Immunoprecipitation Sequencing* / methods
  • Gene Library
  • High-Throughput Nucleotide Sequencing*
  • Sequence Analysis, DNA / methods