What the Phage: a scalable workflow for the identification and analysis of phage sequences

Gigascience. 2022 Nov 18;11:giac110. doi: 10.1093/gigascience/giac110.

Abstract

Phages are among the most abundant and diverse biological entities on Earth. Phage prediction from sequence data is a crucial first step toward understanding their impact on the environment. A variety of bacteriophage prediction tools have been developed over the years; they differ in algorithmic approach, results, and ease of use. We therefore developed "What the Phage" (WtP), an easy-to-use, parallel multitool approach for phage prediction combined with a downstream annotation and classification strategy. WtP supports the user's decision-making process by summarizing the results of the different prediction tools in charts and tables. WtP is reproducible and scales to thousands of datasets through a workflow manager (Nextflow). WtP is freely available under a GPL-3.0 license (https://github.com/replikation/What_the_Phage).
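
The idea of summarizing the output of several prediction tools can be illustrated with a minimal Python sketch. This is not WtP's implementation: the tool names, contig IDs, and the function summarize_predictions are hypothetical, and the sketch only assumes that each tool yields a list of contigs it flags as phage.

```python
from collections import Counter


def summarize_predictions(predictions):
    """Count, per contig, how many tools flagged it as phage.

    `predictions` maps a tool name to the contig IDs that tool reported.
    """
    counts = Counter()
    for contigs in predictions.values():
        # Use a set so a contig reported twice by one tool counts once.
        for contig in set(contigs):
            counts[contig] += 1
    return dict(counts)


# Hypothetical tool names and contig IDs, for illustration only.
predictions = {
    "tool_a": ["contig_1", "contig_2"],
    "tool_b": ["contig_2", "contig_3"],
    "tool_c": ["contig_2"],
}

support = summarize_predictions(predictions)

# Print contigs ordered by how many tools agree on them.
for contig, n in sorted(support.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{contig}\t{n}/{len(predictions)} tools")
```

A table of this kind lets a user see at a glance which contigs are supported by many tools and which by only one, the sort of overview the abstract describes as aiding decision-making.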

Keywords: Docker; Nextflow; easy to use; multitool approach; phage prediction; scalable.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteriophages* / genetics
  • Workflow