Comparative Analysis and Phylogenetic Insights of Cas14-Homology Proteins in Bacteria and Archaea

Genes (Basel). 2023 Oct 6;14(10):1911. doi: 10.3390/genes14101911.

Abstract

Type-V-F Cas12f proteins, also known as Cas14, have drawn significant interest among the diverse CRISPR-Cas nucleases because of their compact size. This study analyzes and compares Cas14-homology proteins in prokaryotic genomes through genome mining, sequence comparison, phylogenetic analysis, and array/repeat analysis. We identified a total of 93 Cas14-homology proteins ranging in size from 344 aa to 843 aa. Most of these proteins were found within the Firmicutes group, which contained 37 species, representing 42% of all the Cas14-homology proteins identified. In archaea, the DPANN group had the highest number of species containing Cas14-homology proteins, with three species. The phylogenetic analysis divides the Cas14-homology proteins into three clades: Cas14-A, Cas14-B, and Cas14-U. A domain comparison of the three clades showed extensive similarity in the C-terminal domain (CTD), where the cutting domains are located, suggesting a potentially shared mechanism of action. In contrast, a sequence similarity analysis of all the identified Cas14 sequences indicated a low overall level of similarity (18%) between the protein variants. The analysis of repeats/arrays in the extended nucleotide sequences surrounding the identified Cas14-homology proteins showed that 44 of the 93 mined proteins possessed CRISPR-associated repeats, 20 of which were specific to Cas14. Our study contributes to an increased understanding of Cas14 proteins across prokaryotic genomes. These homologous proteins have potential for future applications in the mining and engineering of Cas14 proteins.
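
The abstract does not say how the overall similarity figure was computed. As a minimal sketch of one common definition, the Python below takes sequences that have already been aligned to equal length and averages the percent identity over all pairs. The function names, the gap handling, and the toy alignment are assumptions made for illustration; they are not the authors' method or data.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap ('-') are skipped;
    a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned: dict[str, str]) -> float:
    """Average percent identity over every unordered pair of sequences."""
    pairs = list(combinations(aligned.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not taken from the study.
toy_alignment = {
    "seq1": "MKV-LA",
    "seq2": "MRVALA",
    "seq3": "MKVAL-",
}

if __name__ == "__main__":
    print(f"{mean_pairwise_identity(toy_alignment):.1f}% mean pairwise identity")
```

Other definitions (for example, normalizing by the shorter sequence or by alignment length including shared gaps) give different percentages, so a figure such as 18% is only comparable across studies that use the same convention.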

Keywords: Cas proteins; Cas14; Cas14 mining; Cas14-homology protein; phylogenetics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea* / genetics
  • Bacteria / genetics
  • CRISPR-Associated Proteins*
  • Phylogeny

Substances

  • CRISPR-Associated Proteins

Grants and funding

This research was funded with grants from the National Natural Science Foundation of China (32271508 and 31671313) and the High-end Talent Support Program of Yangzhou University to Chengyi Song.