DBSCAN-SWA: An Integrated Tool for Rapid Prophage Detection and Annotation

Front Genet. 2022 Apr 19:13:885048. doi: 10.3389/fgene.2022.885048. eCollection 2022.

Abstract

As an intracellular form of a bacteriophage in the bacterial host genome, a prophage usually integrates into bacterial DNA with high specificity and contributes to horizontal gene transfer (HGT). With the exponentially increasing number of microbial sequences uncovered in genomic or metagenomics studies, there is a massive demand for a tool that is capable of fast and accurate identification of prophages. Here, we introduce DBSCAN-SWA, a command line software tool developed to predict prophage regions in bacterial genomes. DBSCAN-SWA runs faster than any previous tools. Importantly, it has great detection power based on analysis using 184 manually curated prophages, with a recall of 85% compared with Phage_Finder (63%), VirSorter (74%), and PHASTER (82%) for (Multi-) FASTA sequences. Moreover, DBSCAN-SWA outperforms the existing standalone prophage prediction tools for high-throughput sequencing data based on the analysis of 19,989 contigs of 400 bacterial genomes collected from Human Microbiome Project (HMP) project. DBSCAN-SWA also provides user-friendly result visualizations including a circular prophage viewer and interactive DataTables. DBSCAN-SWA is implemented in Python3 and is available under an open source GPLv2 license from https://github.com/HIT-ImmunologyLab/DBSCAN-SWA/.

Keywords: density-based spatial clustering; phage; phage-host interaction; prophage; sliding window.