CVTree: A Parallel Alignment-free Phylogeny and Taxonomy Tool Based on Composition Vectors of Genomes

Genomics Proteomics Bioinformatics. 2021 Aug;19(4):662-667. doi: 10.1016/j.gpb.2021.03.006. Epub 2021 Jun 10.

Abstract

Composition Vector Tree (CVTree) is an alignment-free algorithm for inferring phylogenetic relationships from genome sequences. It has been successfully applied to study the phylogeny and taxonomy of viruses, prokaryotes, and fungi based on whole genomes, as well as chloroplast genomes, mitochondrial genomes, and metagenomes. Here we present standalone software for the CVTree algorithm. The software provides an extensible parallel workflow for the CVTree algorithm, and new alignment-free methods were also implemented on top of this workflow. By examining the phylogeny and taxonomy of 13,903 prokaryotes based on 16S rRNA sequences, we show that the CVTree software is an efficient and effective tool for studying phylogeny and taxonomy based on genome sequences. The code of the CVTree software is available at https://github.com/ghzuo/cvtree.
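As a rough illustration of the composition-vector idea named above, the following Python sketch builds normalized k-mer frequency vectors and a pairwise dissimilarity matrix. It is a simplified stand-in and not the CVTree implementation: the function names, the use of raw k-mer frequencies without any background correction, and the cosine-based dissimilarity (1 - C)/2 are all assumptions made for illustration, and the abstract itself does not specify these details.

```python
from collections import Counter
from math import sqrt


def kmer_vector(seq, k):
    """Return normalized k-mer frequencies of seq as a sparse dict (hypothetical helper)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def dissimilarity(u, v):
    """Cosine-based dissimilarity in [0, 1]; assumed form (1 - C) / 2."""
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("sequence shorter than k yields an empty vector")
    # Only k-mers present in both vectors contribute to the dot product.
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


def dissimilarity_matrix(seqs, k):
    """Symmetric matrix of pairwise dissimilarities for a list of sequences."""
    vectors = [kmer_vector(s, k) for s in seqs]
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = dissimilarity(vectors[i], vectors[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    toy = ["ACGTACGTAC", "ACGTACGTTC", "GGGGCCCCGG"]  # made-up sequences
    for row in dissimilarity_matrix(toy, k=3):
        print(["%.3f" % d for d in row])
```

A matrix of this kind is the usual input to a distance-based tree-building step such as neighbor joining. Each entry of the matrix is independent of the others, which is why this kind of computation lends itself to a parallel workflow.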

Keywords: Alignment-free; CVTree; Composition vector; Dissimilarity matrix; Phylogeny; Taxonomy.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genome*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics
  • Software*

Substances

  • RNA, Ribosomal, 16S