Classifying the Unclassified: A Phage Classification Method

Viruses. 2019 Feb 24;11(2):195. doi: 10.3390/v11020195.

Abstract

This work reports ClassiPhage, a method for classifying phage genomes using sequence-derived taxonomic features. ClassiPhage uses a set of phage-specific Hidden Markov Models (HMMs) generated from clusters of related proteins. The method was validated on all publicly available genomes of phages that are known to infect Vibrionaceae. The phages belong to the well-described phage families Myoviridae, Podoviridae, Siphoviridae, and Inoviridae. The achieved classification is consistent with the assignments of the International Committee on Taxonomy of Viruses (ICTV): all tested phages were assigned to the corresponding group of the ICTV database. In addition, 44 of 58 Vibrio phage genomes not yet classified could be assigned to a phage family. The remaining 14 genomes may represent phages of new families or subfamilies. Comparative genomics indicates that the ability of the approach to identify and classify phages is correlated with conserved genomic organization. ClassiPhage classifies phages exclusively based on genome sequence data and can be applied to distinct phage genomes as well as to prophage regions within host genomes. Possible applications include (a) classifying phages from assembled metagenomes; and (b) identifying and classifying integrated prophages and splitting phage families into subfamilies.
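The idea of assigning a genome to a family from hits against family-specific profile HMMs can be illustrated with a minimal sketch. The abstract does not state how ClassiPhage combines hits into a decision, so the voting rule below is an assumption made purely for illustration, as are the profile names, the `PROFILE_FAMILY` table, the `assign_family` function, and the `min_hits` and `min_fraction` thresholds. The input is assumed to be the list of profile names that matched at least one protein of the genome, as could be obtained from an HMM search run beforehand.

```python
from collections import Counter

# Hypothetical table: profile-HMM name -> phage family the profile was built for.
# The names are placeholders, not profiles from the ClassiPhage study.
PROFILE_FAMILY = {
    "myo_hmm_01": "Myoviridae",
    "myo_hmm_02": "Myoviridae",
    "podo_hmm_01": "Podoviridae",
    "sipho_hmm_01": "Siphoviridae",
    "ino_hmm_01": "Inoviridae",
}


def assign_family(hit_profiles, profile_family=PROFILE_FAMILY,
                  min_hits=2, min_fraction=0.6):
    """Return the family supported by most profile hits, or None.

    hit_profiles: names of profile HMMs that matched the genome's proteins.
    min_hits / min_fraction: illustrative thresholds (assumptions); a genome
    stays unassigned when the best family has too few hits or too small a
    share of all hits.
    """
    # Count hits per family, ignoring profile names not in the table.
    counts = Counter(profile_family[p] for p in hit_profiles
                     if p in profile_family)
    if not counts:
        return None
    family, best = counts.most_common(1)[0]
    total = sum(counts.values())
    if best < min_hits or best / total < min_fraction:
        return None
    return family


if __name__ == "__main__":
    print(assign_family(["myo_hmm_01", "myo_hmm_02", "podo_hmm_01"]))
    print(assign_family(["ino_hmm_01"]))
```

In this sketch a genome with scattered or sparse hits is left unassigned rather than forced into a family, which loosely mirrors the abstract's treatment of the 14 genomes that could not be placed.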

Keywords: Hidden Markov Models; Inoviridae; Myoviridae; Podoviridae; Siphoviridae; Vibrionaceae; classification; phages; protein coding sequences; vibriophages.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteriophage Typing*
  • Bacteriophages / classification*
  • Genome, Viral*
  • Genomics
  • Lysogeny
  • Markov Chains
  • Metagenome
  • Phylogeny*
  • Podoviridae / classification
  • Prophages / classification
  • Siphoviridae / classification
  • Virus Integration