Fully interpretable deep learning model of transcriptional control

Yi Liu; Kenneth Barr; John Reinitz

doi:10.1093/bioinformatics/btaa506

Bioinformatics. 2020 Jul 1;36(Suppl_1):i499-i507. doi: 10.1093/bioinformatics/btaa506.

Authors

Yi Liu¹, Kenneth Barr², John Reinitz³

Affiliations

¹ Department of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA.
² Department of Human Genetics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA.
³ Departments of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA.

Abstract

Motivation: The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent works in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Typically, such investigations have taken a 'black box' approach in which the internal structure of the model used is set purely by machine learning considerations with little consideration of representing the internal structure of the biological system by the mathematical structure of the DNN. DNNs have not yet been applied to the detailed modeling of transcriptional control in which mRNA production is controlled by the binding of specific transcription factors to DNA, in part because such models are in part formulated in terms of specific chemical equations that appear different in form from those used in neural networks.

Results: In this paper, we give an example of a DNN which can model the detailed control of transcription in a precise and predictive manner. Its internal structure is fully interpretable and is faithful to the underlying chemistry of transcription factor binding to DNA. We derive our DNN from a systems biology model that was not previously recognized as having a DNN structure. Although we apply our DNN to data from the early embryo of the fruit fly Drosophila, this system serves as a test bed for analysis of much larger datasets obtained by systems biology studies on a genomic scale.
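
Illustrative note (not taken from the paper): one way to see how a chemical binding equation can share its form with a neural network unit is the textbook equilibrium occupancy of a single binding site, K·c / (1 + K·c), which equals a logistic sigmoid applied to ln(c) + ln(K). The sketch below checks that identity numerically. The single-site model and the names occupancy, occupancy_as_unit, conc and k_assoc are assumptions made for illustration only; they are not the authors' model or code.

```python
import math


def occupancy(conc, k_assoc):
    """Equilibrium fractional occupancy of one binding site (textbook form).

    conc    : transcription factor concentration (hypothetical units)
    k_assoc : association constant in inverse units of conc (hypothetical)
    """
    return k_assoc * conc / (1.0 + k_assoc * conc)


def sigmoid(x):
    """Standard logistic activation used in neural networks."""
    return 1.0 / (1.0 + math.exp(-x))


def occupancy_as_unit(conc, k_assoc):
    """The same quantity written as a one-input sigmoid unit.

    The input is log-concentration, the weight is 1 and the bias is ln(K).
    """
    return sigmoid(math.log(conc) + math.log(k_assoc))


if __name__ == "__main__":
    # Compare the chemical form and the sigmoid form at a few concentrations.
    for conc in (0.1, 1.0, 10.0):
        print(conc, occupancy(conc, 2.0), occupancy_as_unit(conc, 2.0))
```

In this reading the bias of the unit carries a chemical meaning (the log of a binding constant), which is the sense of "interpretable" that the sketch is meant to convey; the model in the paper is more detailed than this single-site example.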

Availability and implementation: The implementation and data for the models used in this paper are in a zip file in the supplementary material.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Deep Learning*
Gene Expression Regulation
Genomics
Machine Learning
Neural Networks, Computer

Grants and funding

R01 OD010936/OD/NIH HHS/United States