PhylOligo: a package to identify contaminant or untargeted organism sequences in genome assemblies

Bioinformatics. 2017 Oct 15;33(20):3283-3285. doi: 10.1093/bioinformatics/btx396.

Abstract

Motivation: Genome sequencing projects sometimes uncover more organisms than expected, especially for complex and/or non-model organisms. Software that can identify a mix of organisms in genome sequence assemblies is therefore useful.

Results: Here we present PhylOligo, a new package including tools to explore, identify and extract organism-specific sequences in a genome assembly using the analysis of their DNA compositional characteristics.
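The abstract does not spell out how compositional characteristics are computed. As a minimal sketch of the general idea, and not PhylOligo's actual code or API, the example below assumes that each contig is summarised by its normalised k-mer (oligonucleotide) frequency profile and that profiles are compared with a Euclidean distance. The function names `kmer_profile` and `euclidean_distance`, the choice of k and the distance measure are all illustrative assumptions.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=4):
    """Return normalised k-mer frequencies of seq over the ACGT alphabet.

    Hypothetical helper for illustration; windows containing other
    characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    # Fixed ordering of all 4**k possible k-mers so profiles are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def euclidean_distance(p, q):
    """Euclidean distance between two profiles of equal length (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy contigs: two AT-rich sequences and one GC-rich sequence.
at_rich_1 = kmer_profile("ATATATATAT", k=2)
at_rich_2 = kmer_profile("TATATATATA", k=2)
gc_rich = kmer_profile("GCGCGCGCGC", k=2)

d_similar = euclidean_distance(at_rich_1, at_rich_2)
d_different = euclidean_distance(at_rich_1, gc_rich)
```

In this toy case the two AT-rich contigs lie much closer to each other than either does to the GC-rich one, which is the kind of signal a composition-based tool could use to group contigs by likely organism of origin.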

Availability and implementation: The tools are written in Python 3 and R, are released under the GPLv3 Licence, and can be found at https://github.com/itsmeludo/Phyloligo/.

Contact: ludovic.mallet@inra.fr.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Bacteria / genetics
  • Eukaryota / genetics
  • Genomics / methods*
  • Sequence Analysis, DNA / methods*
  • Software*