DeepHiC: A generative adversarial network for enhancing Hi-C data resolution

PLoS Comput Biol. 2020 Feb 21;16(2):e1007287. doi: 10.1371/journal.pcbi.1007287. eCollection 2020 Feb.

Abstract

Hi-C is commonly used to study three-dimensional genome organization. However, due to high sequencing costs and technical constraints, the resolution of most Hi-C datasets is coarse, resulting in a loss of information and biological interpretability. Here we develop DeepHiC, a generative adversarial network, to predict high-resolution Hi-C contact maps from low-coverage sequencing data. We demonstrate that DeepHiC is capable of reproducing high-resolution Hi-C data from as few as 1% downsampled reads. Empowered by adversarial training, our method restores fine-grained details similar to those in high-resolution Hi-C matrices, boosts accuracy in chromatin loop identification and TAD detection, and outperforms state-of-the-art methods in prediction accuracy. Finally, applying DeepHiC to Hi-C data from mouse embryonic development facilitates chromatin loop detection. We develop a web-based tool (DeepHiC, http://sysomics.com/deephic) that allows researchers to enhance their own Hi-C data with just a few clicks.
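To make the phrase "1% downsampled reads" concrete, the following is a minimal sketch of how a low-coverage contact map could be simulated from a high-coverage one. It is an illustration only and not the authors' pipeline: the abstract does not say how reads were downsampled, so the approach here (binomial thinning, where each read is kept independently with a fixed probability) is an assumption, and the function name `downsample_contacts` and the toy matrix are hypothetical.

```python
import random


def downsample_contacts(matrix, fraction, seed=0):
    """Simulate low coverage by keeping each read with probability `fraction`.

    `matrix` is a symmetric square list of lists of non-negative integer
    contact counts. This is binomial thinning, an assumed stand-in for
    read-level downsampling; it is not taken from the DeepHiC paper.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(matrix)
    out = [[0] * n for _ in range(n)]
    # Visit only the upper triangle and mirror it, so symmetry is preserved.
    for i in range(n):
        for j in range(i, n):
            kept = sum(1 for _ in range(matrix[i][j]) if rng.random() < fraction)
            out[i][j] = kept
            out[j][i] = kept
    return out


# Hypothetical 3-bin contact map (counts are made up for illustration).
toy = [
    [900, 300, 100],
    [300, 800, 200],
    [100, 200, 700],
]
low = downsample_contacts(toy, fraction=0.01)
```

At a 1% sampling rate most off-diagonal bins in a map like this fall to zero or a handful of counts, which is the kind of sparse input an enhancement model has to work from.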

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin / chemistry
  • Datasets as Topic
  • Genome*
  • Models, Biological*
  • Sequence Analysis / methods

Substances

  • Chromatin

Grants and funding

This work was funded by awards including the National Natural Science Foundation of China (No. 31801112) and the Beijing Nova Program of Science and Technology (No. Z191100001119064), URL: http://www.nsfc.gov.cn and https://mis.kw.beijing.gov.cn, to HC, and the National Natural Science Foundation of China (No. 61873276), URL: http://www.nsfc.gov.cn, to XB. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.