AcaFinder: Genome Mining for Anti-CRISPR-Associated Genes

mSystems. 2022 Dec 20;7(6):e0081722. doi: 10.1128/msystems.00817-22. Epub 2022 Nov 22.

Abstract

Anti-CRISPR (Acr) proteins are encoded by (pro)viruses to inhibit their host's CRISPR-Cas systems. Genes encoding Acr and Aca (Acr-associated) proteins often colocalize to form acr-aca operons. Here, we present AcaFinder as the first Aca genome mining tool. AcaFinder can (i) predict Acas and their associated acr-aca operons using guilt-by-association (GBA); (ii) identify homologs of known Acas using an HMM (hidden Markov model) database; (iii) screen input genomes for potential prophages, CRISPR-Cas systems, and self-targeting spacers (STSs); and (iv) run as a standalone program (https://github.com/boweny920/AcaFinder) or through a web server (http://aca.unl.edu/Aca). AcaFinder was applied to mine over 16,000 prokaryotic and 142,000 gut phage genomes. After multistep filtering, 36 high-confidence new Aca families were identified, three times the number of the 12 known Aca families. Seven new Aca families were from major human gut bacteria (Bacteroidota, Actinobacteria, and Fusobacteria) and their phages, while most known Aca families were from Proteobacteria and Firmicutes. A complex association network between Acrs and Acas was revealed by analyzing their operonic colocalizations. It appears to be very common in evolution for the same aca genes to recombine with different acr genes, and vice versa, forming diverse acr-aca operon combinations.

IMPORTANCE At least four bioinformatics programs have been published for genome mining of Acrs since 2020. In contrast, no bioinformatics tools are available for automated Aca discovery. As self-transcriptional repressors of acr-aca operons, Acas can be viewed as anti-anti-CRISPRs, with great potential for improving CRISPR-Cas technology. Although all 12 known Aca proteins contain a conserved helix-turn-helix (HTH) domain, not all HTH-containing proteins are Acas. However, HTH-containing proteins with adjacent Acr homologs encoded in the same genetic operon are likely Aca proteins. AcaFinder combines this guilt-by-association idea with HMM-based searches for homologs of known Acas in one software package. Applying AcaFinder to screen prokaryotic and gut phage genomes reveals a complex acr-aca operonic colocalization network between different families of Acrs and Acas.
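
The guilt-by-association idea can be illustrated with a minimal Python sketch. It is not AcaFinder's implementation: the `Gene` record, the `split_into_operons` and `candidate_acas` helpers, and the 150 bp intergenic gap cutoff are all assumptions made for illustration. The sketch groups consecutive same-strand genes into putative operons, then reports HTH-containing genes that sit directly next to an Acr homolog within the same operon.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """Hypothetical gene record; field names are illustrative only."""
    gene_id: str
    start: int                    # start coordinate on the contig
    end: int                      # end coordinate (inclusive)
    strand: str                   # "+" or "-"
    is_acr_homolog: bool = False  # e.g. from a prior homology search
    has_hth: bool = False         # e.g. from a prior domain search


def split_into_operons(genes: List[Gene], max_gap: int = 150) -> List[List[Gene]]:
    """Group consecutive same-strand genes separated by at most max_gap bp.

    The 150 bp default is an assumed cutoff, not a value taken from the paper.
    """
    operons: List[List[Gene]] = []
    current: List[Gene] = []
    for gene in sorted(genes, key=lambda g: g.start):
        same_unit = (
            bool(current)
            and gene.strand == current[-1].strand
            and gene.start - current[-1].end <= max_gap
        )
        if same_unit:
            current.append(gene)
        else:
            if current:
                operons.append(current)
            current = [gene]
    if current:
        operons.append(current)
    return operons


def candidate_acas(genes: List[Gene], max_gap: int = 150) -> List[str]:
    """Return IDs of HTH genes adjacent to an Acr homolog in the same operon."""
    hits: List[str] = []
    for operon in split_into_operons(genes, max_gap):
        for i, gene in enumerate(operon):
            if not gene.has_hth or gene.is_acr_homolog:
                continue
            neighbors = operon[max(i - 1, 0):i] + operon[i + 1:i + 2]
            if any(n.is_acr_homolog for n in neighbors):
                hits.append(gene.gene_id)
    return hits


# Toy input: g2 is an HTH gene next to an Acr homolog on the same strand;
# g3 is an HTH gene whose only nearby Acr homolog lies on the opposite strand.
example_genes = [
    Gene("g1", 100, 400, "+", is_acr_homolog=True),
    Gene("g2", 420, 700, "+", has_hth=True),
    Gene("g3", 2000, 2300, "+", has_hth=True),
    Gene("g4", 2350, 2600, "-", is_acr_homolog=True),
]
print(candidate_acas(example_genes))  # ['g2']
```

In this toy example only g2 is reported, which mirrors the point that an HTH domain alone is not sufficient evidence: g3 carries an HTH domain but shares no operon with an Acr homolog.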

Keywords: CRISPR-Cas; anti-CRISPR; bacteriophage; bioinformatics; helix-turn-helix.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / genetics
  • Bacteriophages* / genetics
  • CRISPR-Cas Systems*
  • Humans
  • Operon
  • Prophages / genetics