Unified Model to Predict gRNA Efficiency across Diverse Cell Lines and CRISPR-Cas9 Systems

J Chem Inf Model. 2023 Dec 11;63(23):7320-7329. doi: 10.1021/acs.jcim.3c01339. Epub 2023 Nov 20.

Abstract

Computationally predicting the efficiency of a guide RNA (gRNA) from its sequence is crucial to designing the CRISPR-Cas9 system. Currently, machine learning (ML)-based models are widely used for such predictions. However, these ML models often perform unevenly when applied to multiple data sets from diverse sources, which hinders their practical use. To address this issue, we propose a Michaelis-Menten theoretical framework that integrates information from multiple data sets. We demonstrate that the binding free energy can serve as a useful invariant that bridges the data from different experimental setups. Building upon this framework, we develop a new ML model called Uni-deepSG. This model exhibits broad applicability on 27 data sets with different cell types, Cas9 variants, and gRNA designs. Our work confirms the existence of a generalized model for predicting gRNA efficiency and lays the theoretical groundwork necessary to finalize such a model.
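
The abstract does not state the equations behind the framework, so the following is only an illustrative sketch of how a binding free energy could act as a setup-independent quantity under a Michaelis-Menten-style saturation curve. The functional form (efficiency = c / (K + c) with K = exp(ΔG / RT)), the parameter `conc` standing in for a setup-specific effective concentration, and all function names are assumptions made for illustration, not the authors' model.

```python
import math

# Approximate value of RT at 298 K in kcal/mol (assumed temperature).
RT_KCAL_PER_MOL = 0.593


def efficiency(delta_g, conc, rt=RT_KCAL_PER_MOL):
    """Assumed saturation form: conc / (K + conc), with K = exp(delta_g / rt).

    delta_g: hypothetical binding free energy of the gRNA (kcal/mol).
    conc:    hypothetical setup-specific effective concentration,
             in units of a reference concentration.
    """
    k = math.exp(delta_g / rt)
    return conc / (k + conc)


def delta_g_from_efficiency(eff, conc, rt=RT_KCAL_PER_MOL):
    """Invert the assumed saturation curve to recover delta_g."""
    if not 0.0 < eff < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    if conc <= 0.0:
        raise ValueError("conc must be positive")
    return rt * math.log(conc * (1.0 - eff) / eff)


# The same gRNA (same delta_g) measured in two hypothetical setups.
delta_g = -1.0
eff_a = efficiency(delta_g, conc=0.5)
eff_b = efficiency(delta_g, conc=5.0)

# The raw efficiencies differ, but both map back to the same delta_g.
dg_a = delta_g_from_efficiency(eff_a, conc=0.5)
dg_b = delta_g_from_efficiency(eff_b, conc=5.0)
```

In this toy setting the measured efficiencies depend on the experimental setup, while the recovered free energy does not, which is the sense in which an invariant can bridge data sets. How Uni-deepSG actually estimates setup-specific terms is not described in the abstract.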

MeSH terms

  • CRISPR-Cas Systems*
  • Cell Line
  • Gene Editing*
  • Machine Learning
  • RNA, Guide, CRISPR-Cas Systems

Substances

  • RNA, Guide, CRISPR-Cas Systems