Churros: a Docker-based pipeline for large-scale epigenomic analysis

DNA Res. 2024 Feb 1;31(1):dsad026. doi: 10.1093/dnares/dsad026.

Abstract

The epigenome, which reflects the modifications on chromatin or DNA sequences, provides crucial insight into gene expression regulation and cellular activity. With the continuous accumulation of epigenomic datasets such as chromatin immunoprecipitation followed by sequencing (ChIP-seq) data, there is a great demand for a streamlined pipeline that processes them consistently, especially for large-dataset comparisons involving hundreds of samples. Here, we present Churros, an end-to-end epigenomic analysis pipeline that is independent of the computing environment and optimized for handling large-scale data. We demonstrated the effectiveness of Churros by analyzing large-scale ChIP-seq datasets with the hg38 or Telomere-to-Telomere (T2T) human reference genome. We found that applying T2T to the typical analysis workflow has important impacts on read mapping, quality checks, and peak calling. We also introduced a feature for studying context-specific epigenomic landscapes. Churros will serve as a comprehensive and unified resource for analyzing large-scale epigenomic data.

Keywords: Docker; T2T genome; bioinformatics pipeline; epigenomics analysis; large-scale ChIP-seq.

MeSH terms

  • Chromatin / genetics
  • Chromatin Immunoprecipitation
  • Chromatin Immunoprecipitation Sequencing*
  • Epigenomics*
  • Gene Expression Regulation
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Sequence Analysis, DNA

Substances

  • Chromatin