Benchmarking deep learning methods for predicting CRISPR/Cas9 sgRNA on- and off-target activities

Brief Bioinform. 2023 Sep 22;24(6):bbad333. doi: 10.1093/bib/bbad333.

Abstract

In silico design of single guide RNA (sgRNA) plays a critical role in the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) system. Continuous efforts aim to design sgRNAs with efficient on-target activity and fewer off-target mutations. In the last 5 years, an increasing number of deep learning-based methods have achieved breakthrough performance in predicting sgRNA on- and off-target activities. Nevertheless, the predictive abilities of these methods have yet to be systematically evaluated. In this review, we conducted a systematic survey of progress in the prediction of on- and off-target editing. We investigated the performance of 10 mainstream deep learning-based on-target predictors using nine public datasets with different sample sizes. We found that in most scenarios, these methods showed greater predictive power on large- and medium-scale datasets than on small-scale datasets. In addition, we performed unbiased experiments to provide an in-depth comparison of eight representative approaches for off-target prediction on 12 publicly available datasets with various imbalance ratios of positive to negative samples. Most methods showed excellent performance on balanced datasets but left much room for improvement on moderately and severely imbalanced datasets. This study provides comprehensive perspectives on CRISPR/Cas9 sgRNA on- and off-target activity prediction and on directions for improving method development.
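The effect of class imbalance on off-target evaluation can be illustrated with a minimal sketch. Everything in it is an assumption for illustration and is not taken from the benchmarked datasets or methods: the sample counts are hypothetical, the "predictor" is a trivial baseline that labels every candidate site as negative, and the helper names `make_labels` and `summarize` are invented here. It shows why a score such as accuracy can look excellent on a severely imbalanced dataset while the predictor finds no true off-target sites.

```python
def make_labels(n_pos, n_neg):
    """Build a hypothetical label list: 1 = off-target site, 0 = non-off-target."""
    return [1] * n_pos + [0] * n_neg


def summarize(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical datasets (assumed sizes, not the paper's data).
balanced = make_labels(n_pos=500, n_neg=500)
severe = make_labels(n_pos=4, n_neg=996)

# Trivial baseline: predict "negative" for every candidate site.
balanced_scores = summarize(balanced, [0] * len(balanced))
severe_scores = summarize(severe, [0] * len(severe))

print("balanced:", balanced_scores)
print("severe:  ", severe_scores)
```

On the severely imbalanced labels the baseline's accuracy is far higher than on the balanced labels, yet its recall is zero in both cases, which is one reason imbalance-aware metrics are commonly preferred when comparing off-target predictors.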

Keywords: CRISPR/Cas9; deep learning; off-target; on-target; sgRNA.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking
  • CRISPR-Cas Systems*
  • Deep Learning*
  • Gene Editing / methods
  • RNA, Guide, CRISPR-Cas Systems

Substances

  • RNA, Guide, CRISPR-Cas Systems