CTCF: an R/bioconductor data package of human and mouse CTCF binding sites

Bioinform Adv. 2022 Dec 16;2(1):vbac097. doi: 10.1093/bioadv/vbac097. eCollection 2022.

Abstract

Summary: CTCF (CCCTC-binding factor) is an 11-zinc-finger DNA binding protein which regulates much of the eukaryotic genome's 3D structure and function. The diversity of CTCF binding motifs has led to a fragmented landscape of CTCF binding data. We collected position weight matrices of CTCF binding motifs and defined strand-oriented CTCF binding sites in the human and mouse genomes, including the recent Telomere to Telomere and mm39 assemblies. We included selected experimentally determined and predicted CTCF binding sites, such as CTCF-bound cis-regulatory elements from SCREEN ENCODE. We recommend filtering strategies for CTCF binding motifs and demonstrate that liftOver is a viable alternative to convert CTCF coordinates between assemblies. Our comprehensive data resource and usage recommendations can serve to harmonize and strengthen the reproducibility of genomic studies utilizing CTCF binding data.

Availability and implementation: https://bioconductor.org/packages/CTCF. Companion website: https://dozmorovlab.github.io/CTCF/; Code to reproduce the analyses: https://github.com/dozmorovlab/CTCF.dev.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.