Whole genome analysis of CRISPR Cas9 sgRNA off-target homologies via an efficient computational algorithm

BMC Genomics. 2017 Nov 17;18(Suppl 9):826. doi: 10.1186/s12864-017-4225-1.

Abstract

Background: The beauty and power of the CRISPR Cas9 endonuclease genome editing system lie in its being RNA-programmable: a 20-nt single guide RNA (sgRNA) can guide Cas9 to any complementary genomic locus, where it cleaves double-stranded DNA and allows desired mutations to be introduced. Unfortunately, it has been reported repeatedly that the sgRNA can also guide Cas9 to off-target sites where the DNA sequence is homologous to the sgRNA.

Results: Using the human genome and Streptococcus pyogenes Cas9 (SpCas9) as an example, this article mathematically analyzed the probabilities of off-target homologies of sgRNAs and found that for a large genome such as the human genome, potential off-target homologies are inevitable in sgRNA selection. A highly efficient computational algorithm was developed for whole-genome sgRNA design and off-target homology searches. By means of a dynamically constructed sequence-indexed database and a simplified sequence alignment method, this algorithm achieves very high efficiency while guaranteeing the identification of all existing potential off-target homologies. Using this algorithm, 1,876,775 sgRNAs were designed for the 19,153 human mRNA genes, and only two sgRNAs were found to be free of off-target homology.
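
The abstract does not spell out how the sequence-indexed database or the simplified alignment work, so the following is only a minimal sketch of one well-known way to combine an index with a cheap comparison while still finding every site within a mismatch budget. It assumes an NGG PAM on the forward strand only, an ungapped Hamming-distance comparison, and a pigeonhole split of the 20-nt guide into max_mismatches + 1 chunks (any site with at most that many mismatches must match at least one chunk exactly). The function names and the toy genome are hypothetical and are not taken from the article.

```python
from collections import defaultdict

GUIDE_LEN = 20  # SpCas9 guide length used in the abstract


def protospacers(genome):
    """Yield (position, 20-nt site) for each forward-strand site followed by an NGG PAM."""
    for i in range(len(genome) - GUIDE_LEN - 2):
        pam = genome[i + GUIDE_LEN : i + GUIDE_LEN + 3]
        if pam[1:] == "GG":
            yield i, genome[i : i + GUIDE_LEN]


def chunks(seq, n_chunks):
    """Split seq into n_chunks contiguous pieces, returned as (offset, piece) pairs."""
    size = len(seq) // n_chunks
    bounds = [k * size for k in range(n_chunks)] + [len(seq)]
    return [(bounds[k], seq[bounds[k] : bounds[k + 1]]) for k in range(n_chunks)]


def build_index(genome, max_mismatches):
    """Index every PAM-adjacent site under each of its (offset, chunk) keys."""
    index = defaultdict(list)
    for pos, site in protospacers(genome):
        for offset, chunk in chunks(site, max_mismatches + 1):
            index[(offset, chunk)].append((pos, site))
    return index


def find_homologies(guide, index, max_mismatches):
    """Return sorted (position, mismatch count) pairs for sites within the budget.

    The index must have been built with the same max_mismatches value.
    """
    hits = {}
    for offset, chunk in chunks(guide, max_mismatches + 1):
        for pos, site in index.get((offset, chunk), ()):
            if pos in hits:
                continue
            # Ungapped comparison: count positions where guide and site differ.
            mismatches = sum(a != b for a, b in zip(guide, site))
            if mismatches <= max_mismatches:
                hits[pos] = mismatches
    return sorted(hits.items())


if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"
    variant = "TCGTACGTACATACGTACGT"  # differs from guide at positions 0 and 10
    toy_genome = "TTTT" + guide + "AGG" + "CCCC" + variant + "TGG" + "TT"
    idx = build_index(toy_genome, max_mismatches=3)
    print(find_homologies(guide, idx, max_mismatches=3))
```

A real whole-genome search would also need the reverse strand, a compact encoding of the index, and whatever homology criteria the article actually applies; none of those are modeled here.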

Conclusions: By means of the novel and efficient sgRNA homology search algorithm introduced in this article, genome-wide sgRNA design and off-target analysis were conducted, and the results confirmed the mathematical analysis: for an sgRNA sequence, it is almost impossible to escape potential off-target homologies. Future innovations in CRISPR Cas9 gene editing technology need to focus on how to eliminate Cas9 off-target activity.
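
The abstract does not give the formulas behind its mathematical analysis. As a rough illustration of why homologies become hard to avoid in a large genome, the sketch below uses a deliberately simplified model that is an assumption here, not the article's derivation: bases are independent and uniformly distributed, a candidate site is any 20-nt window followed by NGG on either strand, and the genome length of 3.1e9 bp is an approximate figure supplied for illustration. Under that model the number of mismatches against a fixed guide is binomial, and the chance of at least one homologous site follows from a Poisson approximation.

```python
import math

GUIDE_LEN = 20


def p_within(max_mismatches, length=GUIDE_LEN):
    """Probability that a random site differs from a fixed guide in at most max_mismatches bases.

    Assumes independent, uniformly distributed bases (a simplification).
    """
    return sum(
        math.comb(length, k) * 0.75**k * 0.25 ** (length - k)
        for k in range(max_mismatches + 1)
    )


def candidate_sites(genome_bp):
    """Expected number of NGG-adjacent sites on both strands under the uniform model."""
    return 2 * genome_bp / 16  # P(GG) = 1/16 per position, two strands


def expected_homologies(max_mismatches, genome_bp=3.1e9):
    """Expected count of random sites within the mismatch budget."""
    return candidate_sites(genome_bp) * p_within(max_mismatches)


def prob_at_least_one(max_mismatches, genome_bp=3.1e9):
    """Poisson approximation to the chance that at least one such site exists."""
    return -math.expm1(-expected_homologies(max_mismatches, genome_bp))


if __name__ == "__main__":
    for m in range(6):
        print(m, expected_homologies(m), prob_at_least_one(m))
```

Real genomes are far from uniform and are rich in repeats, so this model serves only to show how quickly the expected count grows as more mismatches are tolerated.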

Keywords: Cas9; Computational algorithm; Crispr; Genome wide; Off-target homology; sgRNA.

MeSH terms

  • Algorithms*
  • CRISPR-Cas Systems*
  • Genome, Human
  • Genomics
  • Humans
  • RNA Editing
  • RNA, Guide, CRISPR-Cas Systems / genetics*
  • Sequence Analysis, RNA / methods*
  • Streptococcus pyogenes / genetics*

Substances

  • RNA, Guide, CRISPR-Cas Systems