A Simple Criterion for Inferring CRISPR Array Direction

Front Microbiol. 2019 Sep 4:10:2054. doi: 10.3389/fmicb.2019.02054. eCollection 2019.

Abstract

Inferring the transcriptional direction (orientation) of a CRISPR array is essential for many applications, including the systematic investigation of non-canonical CRISPR/Cas functions. The standard method, CRISPRDirection (embedded within CRISPRCasFinder), fails to predict the orientation (returning ND, i.e., non-determined, predictions) for ∼37% of the classified CRISPR arrays (>2200 loci); this fraction rises to >70% for the II-B subtype, in which non-canonical functions were first experimentally discovered. An alternative, Potential Orientation (also embedded within CRISPRCasFinder), yields ND predictions much less frequently but might have significantly lower accuracy. We propose a novel, simple criterion in which the CRISPR array direction is assigned according to the direction of its associated cas genes (Cas Orientation). We systematically assess the performance of the three methods (Cas Orientation, CRISPRDirection, and Potential Orientation) across all CRISPR/Cas subtypes, both by a mutual crosscheck of their predictions and by comparing them to the experimental dataset. Interestingly, CRISPRDirection agrees much better with Cas Orientation than with Potential Orientation, even though CRISPRDirection and Potential Orientation are mutually related (Potential Orientation corresponds to one of the six heterogeneous predictors employed by CRISPRDirection) and both are unrelated to Cas Orientation. We find that Cas Orientation has much higher accuracy than Potential Orientation and accuracy comparable to that of CRISPRDirection, while accurately assigning an orientation to ∼95% of the CRISPR arrays that CRISPRDirection leaves non-determined. At the same time, Cas Orientation is simple to employ, requiring only prediction of the direction of the associated protein-coding genes, which is routine for prokaryotes.

Keywords: CRISPR array orientation; CRISPR/Cas; cas gene orientation; large-scale analysis; non-canonical functions.
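
The Cas Orientation criterion and the mutual crosscheck described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the "+"/"-"/"ND" strand encoding, the majority-vote rule for cas genes on mixed strands, and the restriction of the agreement score to loci where both methods return a determined orientation are all assumptions made here for illustration.

```python
from collections import Counter


def cas_orientation(cas_strands):
    """Assign a CRISPR array direction from the strands of its associated cas genes.

    cas_strands: iterable of "+" or "-" (one entry per associated cas gene).
    Returns "+", "-", or "ND" (non-determined).

    Assumption: mixed strands are resolved by majority vote and ties give "ND";
    the abstract does not specify how such cases are handled.
    """
    counts = Counter(s for s in cas_strands if s in ("+", "-"))
    plus, minus = counts["+"], counts["-"]
    if plus == minus:  # also covers the case of no associated cas genes
        return "ND"
    return "+" if plus > minus else "-"


def agreement(pred_a, pred_b):
    """Fraction of loci on which two methods predict the same orientation.

    Assumption: loci where either method returns "ND" are excluded.
    Returns None when no locus has two determined predictions.
    """
    both = [(a, b) for a, b in zip(pred_a, pred_b) if a != "ND" and b != "ND"]
    if not both:
        return None
    return sum(a == b for a, b in both) / len(both)


# Hypothetical toy input: three loci with the strands of their cas genes.
loci = [["+", "+", "+"], ["-", "-", "+"], []]
cas_pred = [cas_orientation(strands) for strands in loci]
```

With per-locus predictions from two methods in hand, `agreement(cas_pred, other_pred)` gives a simple pairwise crosscheck score of the kind the abstract alludes to.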