Discovering differential genome sequence activity with interpretable and efficient deep learning

PLoS Comput Biol. 2021 Aug 9;17(8):e1009282. doi: 10.1371/journal.pcbi.1009282. eCollection 2021 Aug.

Abstract

Discovering sequence features that differentially direct cells to alternate fates is key to understanding both cellular development and the consequences of disease-related mutations. We introduce Expected Pattern Effect and Differential Expected Pattern Effect, two black-box methods that can interpret genome regulatory sequences for cell type-specific or condition-specific patterns. We show that these methods identify relevant transcription factor motifs and spacings that are predictive of cell state-specific chromatin accessibility. Finally, we integrate these methods into a framework that is readily accessible to non-experts and can be downloaded as a binary or installed via PyPI or bioconda at https://cgs.csail.mit.edu/deepaccess-package/.
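
The abstract does not give formulas for the two methods, so the following is only a minimal sketch of the general black-box idea, under stated assumptions: a pattern is inserted into a set of background sequences, and the model's outputs with and without the pattern are compared. The function names (`pattern_effect`, `differential_pattern_effect`), the use of a mean difference as the effect score, and the toy model are all hypothetical illustrations, not the authors' definitions or the package's API.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    out = np.zeros((len(seq), 4))
    out[np.arange(len(seq)), idx] = 1.0
    return out

def insert_pattern(seq, pattern, pos):
    """Overwrite seq with pattern starting at pos (length is preserved)."""
    return seq[:pos] + pattern + seq[pos + len(pattern):]

def pattern_effect(model, backgrounds, pattern, pos):
    """Assumed score: mean change in each model output when the pattern
    is inserted into every background sequence. The model is treated as
    a black box mapping a one-hot array to a vector of class scores."""
    base = np.stack([model(one_hot(s)) for s in backgrounds])
    with_pattern = np.stack(
        [model(one_hot(insert_pattern(s, pattern, pos))) for s in backgrounds]
    )
    return (with_pattern - base).mean(axis=0)

def differential_pattern_effect(model, backgrounds, pattern, pos, class_a, class_b):
    """Assumed score: how much more the pattern shifts class_a than class_b."""
    effect = pattern_effect(model, backgrounds, pattern, pos)
    return effect[class_a] - effect[class_b]

def toy_model(x):
    """Hypothetical stand-in for a trained network: output 0 counts
    occurrences of GATA, output 1 ignores the sequence entirely."""
    seq = "".join(ALPHABET[i] for i in x.argmax(axis=1))
    return np.array([float(seq.count("GATA")), 0.0])

backgrounds = ["CCCCCCCCCC", "TTTTTTTTTT"]
effect = pattern_effect(toy_model, backgrounds, "GATA", pos=3)
differential = differential_pattern_effect(
    toy_model, backgrounds, "GATA", pos=3, class_a=0, class_b=1
)
```

In this toy setting the pattern changes only the first output, so the differential score singles out the class that responds to the motif. A real analysis would use a trained model and many sampled background sequences.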

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Deep Learning*
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Neural Networks, Computer
  • Sequence Analysis, DNA / methods
  • Transcription Factors / metabolism

Substances

  • Transcription Factors