Phage hunters: Computational strategies for finding phages in large-scale 'omics datasets

Virus Res. 2018 Jan 15:244:110-115. doi: 10.1016/j.virusres.2017.10.019. Epub 2017 Nov 1.

Abstract

A plethora of tools exist for identifying phage sequences in bacterial genomes, single-cell amplified genomes, and host-associated and environmental metagenomes. Yet because the genetics of phages and their hosts are closely intertwined, distinguishing viral from bacterial signal remains an ongoing challenge. Further, the size, quantity, and fragmentary nature of modern 'omics datasets usher in a new set of computational challenges. Here, we detail the promises and pitfalls of using currently available gene-centric or k-mer-based tools for identifying prophage sequences in genomes, and prophage and viral contigs in metagenomes. Each of these methods contributes a unique piece of the puzzle toward elucidating the intriguing signatures of phage-host coevolution.

Keywords: Bioinformatics; Computational biology; Metagenomics; Phage; Prophage; Virus-host coevolution.

Publication types

  • Review
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Bacteria / genetics
  • Bacteria / virology
  • Bacteriophages / genetics*
  • Bacteriophages / isolation & purification
  • Biological Coevolution
  • Computational Biology / methods*
  • Databases, Genetic
  • Datasets as Topic
  • Genome, Bacterial*
  • Genome, Viral*
  • Metagenomics / methods*
  • Prophages / genetics*
  • Prophages / isolation & purification
  • Sequence Analysis, DNA
  • Sequence Analysis, RNA