An in silico procedure for generating protein-mediated chromatin interaction data and comparison of significant interaction calling methods

PLoS One. 2024 Jan 17;19(1):e0287521. doi: 10.1371/journal.pone.0287521. eCollection 2024.

Abstract

The ability to simulate high-throughput data with high fidelity to real experimental data is fundamental for benchmarking methods that detect true long-range chromatin interactions mediated by a specific protein. Yet such tools are not currently available. To fill this gap, we developed an in silico experimental procedure, ChIA-Sim, which imitates the experimental procedures that produce real ChIA-PET, Hi-ChIP, or PLAC-seq data. We demonstrate the fidelity of ChIA-Sim to real data by using guiding characteristics of several real datasets to generate data with the simulation procedure. We also used ChIA-Sim data to demonstrate how the in silico procedure can be used for benchmarking, by evaluating four methods for significant interaction calling (SIC). In particular, we assessed each method's performance in terms of correct identification of long-range interactions. We further analyzed four experimental datasets from publicly available databases and showed that the trends in the results are consistent with those seen in data generated by ChIA-Sim. This serves as additional evidence that data from ChIA-Sim closely resemble data produced by the experimental protocols it is modeled after.
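
As an illustration of what benchmarking against simulated data can look like, the following minimal sketch scores a set of called interactions against the set of true interactions known from a simulation. It is not taken from the paper: the abstract does not state which performance metrics were used, so precision, recall, and F1 are assumed here. The function names (`normalize`, `score_calls`), the anchor representation, and the exact-match criterion are likewise hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical representation: an anchor is (chromosome, start, end),
# and an interaction is an unordered pair of anchors.
Anchor = Tuple[str, int, int]
Pair = Tuple[Anchor, Anchor]


def normalize(pairs: Iterable[Pair]) -> Set[Pair]:
    """Sort the two anchors of each pair so (a, b) and (b, a) compare equal."""
    return {tuple(sorted(pair)) for pair in pairs}


def score_calls(called: Iterable[Pair], truth: Iterable[Pair]) -> Dict[str, float]:
    """Compare called interactions with the simulated ground truth.

    Assumption: a call counts as correct only if both anchors match a
    true interaction exactly; real benchmarks may allow some tolerance.
    """
    called_set = normalize(called)
    truth_set = normalize(truth)
    true_positives = len(called_set & truth_set)

    precision = true_positives / len(called_set) if called_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example with made-up coordinates (not real data).
a = ("chr1", 1000, 2000)
b = ("chr1", 50000, 51000)
c = ("chr1", 90000, 91000)
truth = [(a, b), (a, c)]
called = [(b, a), (b, c)]  # one true interaction (reversed order), one false
scores = score_calls(called, truth)
```

Because the simulation records which interactions are true, the same scoring could be applied to the output of each SIC method in turn and the resulting values compared.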

MeSH terms

  • Chromatin Immunoprecipitation Sequencing
  • Chromatin*
  • Chromosomes*
  • Computer Simulation
  • Sequence Analysis, DNA / methods

Substances

  • Chromatin