Interpretable neural architecture search and transfer learning for understanding CRISPR/Cas9 off-target enzymatic reactions

Zijun Zhang; Adam R Lamson; Michael Shelley; Olga Troyanskaya

ArXiv [Preprint]. 2023 Sep 29:arXiv:2305.11917v2.

Authors

Zijun Zhang¹, Adam R Lamson², Michael Shelley²,³, Olga Troyanskaya²,⁴

Affiliations

¹ Division of Artificial Intelligence in Medicine, Cedars-Sinai Medical Center, 116 N. Robertson Blvd, Los Angeles, 90048, CA, USA.
² Center for Computational Biology, Flatiron Institute, 162 5th Ave, New York City, 10010, NY, USA.
³ Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York City, 10012, NY, USA.
⁴ Lewis Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory South Drive, Princeton, 08544, NJ, USA.

PMID: 37808087
PMCID: PMC10557798

Abstract

Finely tuned enzymatic pathways control cellular processes, and their dysregulation can lead to disease. Creating predictive and interpretable models for these pathways is challenging because of the complexity of the pathways and of the cellular and genomic contexts. Here we introduce Elektrum, a deep learning framework that addresses these challenges with data-driven and biophysically interpretable models for determining the kinetics of biochemical systems. First, it uses in vitro kinetic assays to rapidly hypothesize an ensemble of high-quality Kinetically Interpretable Neural Networks (KINNs) that predict reaction rates. It then employs a novel transfer learning step, in which the KINNs are inserted as intermediary layers into deeper convolutional neural networks, fine-tuning the predictions for reaction-dependent in vivo outcomes. Elektrum makes effective use of the limited but clean in vitro data and the complex yet plentiful in vivo data that capture cellular context. We apply Elektrum to predict CRISPR-Cas9 off-target editing probabilities and demonstrate that Elektrum achieves state-of-the-art performance, regularizes neural network architectures, and maintains physical interpretability.
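
The two-stage idea in the abstract (an interpretable rate-predicting component reused as an intermediate layer in a larger model) can be illustrated with a minimal toy sketch. This is not Elektrum's implementation: the additive mismatch-penalty rate model, the logistic head, the scalar `context` feature, and every name and numeric value below are assumptions chosen only for illustration.

```python
import math

# Hypothetical per-position mismatch penalties (log-rate units). In a real
# setting such parameters would be fitted to in vitro kinetic measurements.
MISMATCH_PENALTY = [0.2, 0.4, 0.8, 1.6]
LOG_K_MATCH = 0.0  # assumed log rate for a perfectly matched target


def kinetic_layer(guide, target):
    """Toy interpretable layer: return a log reaction rate for a guide/target pair."""
    if len(guide) != len(target) or len(guide) != len(MISMATCH_PENALTY):
        raise ValueError("sequences must have length %d" % len(MISMATCH_PENALTY))
    # Each mismatched position subtracts its own penalty from the log rate.
    penalty = sum(p for g, t, p in zip(guide, target, MISMATCH_PENALTY) if g != t)
    return LOG_K_MATCH - penalty


def editing_probability(guide, target, context,
                        w_rate=1.5, w_context=0.5, bias=0.0):
    """Toy downstream head: combine the frozen log rate with a context feature.

    In a transfer-learning setup only w_rate, w_context and bias would be
    fine-tuned on in vivo data; kinetic_layer stays fixed.
    """
    log_k = kinetic_layer(guide, target)
    z = w_rate * log_k + w_context * context + bias
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing to a probability


on_target = editing_probability("ACGT", "ACGT", context=1.0)
off_target = editing_probability("ACGT", "ACGA", context=1.0)
print(on_target, off_target)
```

Because the intermediate quantity is a log rate with one penalty per position, the contribution of each mismatch to the final probability stays directly readable, which is the kind of interpretability the abstract refers to.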

Keywords: AutoML; Interpretable neural networks; genome editing; neural architecture search; transfer learning.

Publication types

Preprint

Grants and funding

R01 GM071966/GM/NIGMS NIH HHS/United States