Machine-learning approach expands the repertoire of anti-CRISPR protein families

Nat Commun. 2020 Jul 29;11(1):3784. doi: 10.1038/s41467-020-17652-0.

Abstract

CRISPR-Cas systems are adaptive immunity systems of bacteria and archaea that have been harnessed for the development of powerful genome editing and engineering tools. In the incessant host-parasite arms race, viruses have evolved multiple anti-defense mechanisms, including diverse anti-CRISPR proteins (Acrs) that specifically inhibit CRISPR-Cas and therefore have enormous potential for application as modulators of genome editing tools. Most Acrs are small and highly variable proteins, which makes their bioinformatic prediction a formidable task. We present a machine-learning approach for comprehensive Acr prediction. The model shows high predictive power when evaluated on an unseen test set and was employed to predict 2,500 candidate Acr families. Experimental validation of top candidates revealed two previously unknown Acrs (AcrIC9 and AcrIC10), and three other top candidates were coincidentally identified and found to possess anti-CRISPR activity. These results substantially expand the repertoire of predicted Acrs and provide a resource for experimental Acr discovery.
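
The abstract does not describe the features or the model used. As a purely illustrative sketch, and not the authors' method, the Python below shows one common way to turn protein sequences into fixed-length numeric features (sequence length plus amino-acid composition) and to set aside an unseen test set before any model is fitted. The function names `featurize` and `holdout_split`, the choice of features, and the toy sequences are all assumptions introduced here.

```python
import random
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def featurize(seq):
    """Return [length, 20 amino-acid fractions] for one protein sequence.

    Assumed features for illustration only; not the features from the paper.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return [float(n)] + [counts[aa] / n for aa in AMINO_ACIDS]


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle and split items into (train, test).

    The test part is kept aside and never used for fitting.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy data: (sequence, label) pairs, not real Acr sequences.
toy = [
    ("MKTAYIAK", 1),
    ("MSSGGLLV", 0),
    ("MAAKKEEW", 1),
    ("MLLPPQQR", 0),
    ("MTTNNDDF", 1),
]

train, test = holdout_split(toy)
train_features = [featurize(seq) for seq, _ in train]
train_labels = [label for _, label in train]
```

Any classifier could then be fitted on `train_features` and `train_labels` and scored only on the held-out `test` items, which is the general idea behind reporting performance on an unseen test set.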

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Archaea / genetics
  • Archaea / virology
  • Bacteria / genetics
  • Bacteria / virology
  • Bacteriophages / genetics*
  • CRISPR-Associated Protein 9 / antagonists & inhibitors*
  • CRISPR-Associated Protein 9 / genetics
  • CRISPR-Cas Systems / genetics
  • Computational Biology / methods
  • Datasets as Topic
  • Gene Editing / methods
  • Host-Parasite Interactions / genetics
  • Machine Learning*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Viral Proteins / genetics*

Substances

  • Viral Proteins
  • CRISPR-Associated Protein 9