The adapted Activity-By-Contact model for enhancer-gene assignment and its application to single-cell data

Bioinformatics. 2023 Feb 3;39(2):btad062. doi: 10.1093/bioinformatics/btad062.

Abstract

Motivation: Identifying regulatory regions in the genome is of great interest for understanding the epigenomic landscape in cells. One fundamental challenge in this context is to find the target genes whose expression is affected by the regulatory regions. A recent successful method is the Activity-By-Contact (ABC) model, which scores enhancer-gene interactions based on enhancer activity and the contact frequency of an enhancer to its target gene. However, it describes regulatory interactions entirely from a gene's perspective and does not account for all the candidate target genes of an enhancer. In addition, the ABC model requires two types of assays to measure enhancer activity, which limits its applicability. Moreover, no available implementation allows for integration with transcription factor (TF) binding information or for efficient analysis of single-cell data.
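
To make the "gene's perspective" concrete, the following is a minimal sketch of an ABC-style score, assuming the commonly cited formulation in which an enhancer's activity times its contact with a gene is divided by the same product summed over all candidate enhancers of that gene. The abstract does not spell out the formula, so the function name `abc_scores`, the input layout, and the normalization are illustrative assumptions rather than the authors' implementation.

```python
def abc_scores(activity, contact):
    """Sketch of an ABC-style score (assumed formulation, not the STARE code).

    activity: one activity value per enhancer.
    contact:  contact[e][g] is the contact frequency of enhancer e with gene g.
    Returns scores[e][g] = activity[e] * contact[e][g], normalized per gene.
    """
    n_genes = len(contact[0])
    scores = [[0.0] * n_genes for _ in activity]
    for g in range(n_genes):
        # Activity-by-contact product for every candidate enhancer of gene g.
        weighted = [a * row[g] for a, row in zip(activity, contact)]
        total = sum(weighted)
        if total > 0:
            # Normalization only looks at the enhancers of this one gene.
            for e, w in enumerate(weighted):
                scores[e][g] = w / total
    return scores


# Hypothetical toy input: two enhancers, two genes.
example = abc_scores([2.0, 1.0], [[1.0, 0.5], [1.0, 0.5]])
```

In this sketch the scores of each gene sum to 1 across its enhancers, but nothing in the calculation reflects how many other genes an enhancer contacts, which is the limitation the paragraph above describes.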

Results: We demonstrate that the ABC score can yield a higher accuracy by adapting the enhancer activity according to the number of contacts the enhancer has to its candidate target genes and also by considering all annotated transcription start sites of a gene. Further, we show that the model is comparably accurate with only one assay to measure enhancer activity. We combined our generalized ABC model with TF binding information and illustrated an analysis of a single-cell ATAC-seq dataset of the human heart, where we were able to characterize cell type-specific regulatory interactions and predict gene expression based on TF affinities. All executed processing steps are incorporated into our new computational pipeline STARE.
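
One plausible reading of "adapting the enhancer activity according to the number of contacts" is sketched below. It assumes that an enhancer's activity is split among its candidate target genes in proportion to its contact with each gene before the usual per-gene normalization. The abstract does not state the exact formula, so this scaling, the function name `adapted_abc_scores`, and the input layout are assumptions; the handling of multiple transcription start sites and the pipeline's actual interface are not covered.

```python
def adapted_abc_scores(activity, contact):
    """Sketch of an adapted ABC-style score (assumed formulation).

    activity: one activity value per enhancer.
    contact:  contact[e][g] is the contact frequency of enhancer e with gene g.
    Assumption: activity[e] is scaled by contact[e][g] / sum of contact[e]
    before the per-gene normalization.
    """
    n_genes = len(contact[0])
    weighted = []
    for a, row in zip(activity, contact):
        row_sum = sum(row)
        if row_sum > 0:
            # Share of the enhancer's activity attributed to each gene.
            adapted = [a * c / row_sum for c in row]
        else:
            adapted = [0.0] * n_genes
        weighted.append([ad * c for ad, c in zip(adapted, row)])

    scores = [[0.0] * n_genes for _ in activity]
    for g in range(n_genes):
        total = sum(w[g] for w in weighted)
        if total > 0:
            for e in range(len(activity)):
                scores[e][g] = weighted[e][g] / total
    return scores


# Hypothetical toy input: enhancer 0 contacts both genes, enhancer 1 only gene 0.
example = adapted_abc_scores([1.0, 1.0], [[1.0, 1.0], [1.0, 0.0]])
```

With this toy input, an unadapted activity-by-contact score would give both enhancers an equal share of gene 0. Under the assumed adaptation, enhancer 1 receives the larger share because its activity is not divided between two genes.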

Availability and implementation: The software is available at https://github.com/schulzlab/STARE.

Contact: marcel.schulz@em.uni-frankfurt.de.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Regulation*
  • Humans
  • Protein Binding
  • Regulatory Sequences, Nucleic Acid
  • Software
  • Transcription Factors* / metabolism

Substances

  • Transcription Factors