Exploring chromatin conformation and gene co-expression through graph embedding

Bioinformatics. 2020 Dec 30;36(Suppl_2):i700-i708. doi: 10.1093/bioinformatics/btaa803.

Abstract

Motivation: The relationship between gene co-expression and chromatin conformation is of great biological interest. Thanks to high-throughput chromosome conformation capture technologies (Hi-C), researchers are gaining insights on the tri-dimensional organization of the genome. Given the high complexity of Hi-C data and the difficult definition of gene co-expression networks, the development of proper computational tools to investigate such relationship is rapidly gaining the interest of researchers. One of the most fascinating questions in this context is how chromatin topology correlates with gene co-expression and which physical interaction patterns are most predictive of co-expression relationships.

Results: To address these questions, we developed a computational framework for the prediction of co-expression networks from chromatin conformation data. We first define a gene chromatin interaction network where each gene is associated to its physical interaction profile; then, we apply two graph embedding techniques to extract a low-dimensional vector representation of each gene from the interaction network; finally, we train a classifier on gene embedding pairs to predict if they are co-expressed. Both graph embedding techniques outperform previous methods based on manually designed topological features, highlighting the need for more advanced strategies to encode chromatin information. We also establish that the most recent technique, based on random walks, is superior. Overall, our results demonstrate that chromatin conformation and gene regulation share a non-linear relationship and that gene topological embeddings encode relevant information, which could be used also for downstream analysis.

Availability and implementation: The source code for the analysis is available at: https://github.com/marcovarrone/gene-expression-chromatin.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin* / genetics
  • Chromosomes*
  • Genome
  • Molecular Conformation
  • Software

Substances

  • Chromatin