PhyloSuite: An integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies

Mol Ecol Resour. 2020 Jan;20(1):348-355. doi: 10.1111/1755-0998.13096. Epub 2019 Nov 6.

Abstract

Multigene and genomic data sets have become commonplace in phylogenetics, but many existing tools are not designed for such data sets, which often makes the analysis time-consuming and tedious. Here, we present PhyloSuite, a user-friendly workflow desktop platform (a cross-platform, open-source, stand-alone Python graphical user interface) dedicated to streamlining molecular sequence data management and evolutionary phylogenetics studies. It uses a plugin-based system that integrates several phylogenetic and bioinformatic tools, thereby streamlining the entire procedure, from data acquisition to phylogenetic tree annotation (in combination with iTOL). It has the following features: (a) a point-and-click and drag-and-drop graphical user interface; (b) a workplace to manage and organize molecular sequence data and results of analyses; (c) GenBank entry extraction and comparative statistics; and (d) a phylogenetic workflow with batch processing capability, comprising sequence alignment (MAFFT and MACSE), alignment optimization (trimAl, HmmCleaner and Gblocks), data set concatenation, selection of the best partitioning scheme and evolutionary model (PartitionFinder and ModelFinder), and phylogenetic inference (MrBayes and IQ-TREE). PhyloSuite is designed for both beginners and experienced researchers: it allows the former to quick-start their way into phylogenetic analysis, and the latter to conduct, store and manage their work in a streamlined way and to spend more time investigating scientific questions instead of transferring files from one software program to another.
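The data set concatenation step named in the abstract can be illustrated with a minimal Python sketch. This is not PhyloSuite's own code: the function names `concatenate_alignments` and `raxml_style_partitions`, the input layout (gene name mapped to taxon mapped to aligned sequence) and the use of `?` for taxa missing from a gene are all assumptions made for illustration. It joins per-gene alignments into one supermatrix and records the 1-based column range of each gene, which is the information a partition-aware model selection step would need.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence (assumed layout)


def concatenate_alignments(
    genes: Dict[str, Alignment], missing: str = "?"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into a supermatrix (hypothetical helper).

    Returns the concatenated sequences per taxon and a list of
    (gene, start, end) column ranges, 1-based and inclusive.
    """
    # Every taxon seen in any gene gets a row in the supermatrix.
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1

    for gene, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa absent from this gene are padded with the missing symbol.
            pieces[taxon].append(aln.get(taxon, missing * length))
        partitions.append((gene, start, start + length - 1))
        start += length

    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


def raxml_style_partitions(partitions: List[Tuple[str, int, int]]) -> List[str]:
    """Format column ranges as RAxML-style partition lines (DNA assumed)."""
    return [f"DNA, {gene} = {start}-{end}" for gene, start, end in partitions]


# Toy input: two genes, three taxa, each taxon missing from at most one gene.
example = {
    "cox1": {"A": "ATG", "B": "AT-"},
    "nad1": {"A": "CC", "C": "CG"},
}
matrix, parts = concatenate_alignments(example)
```

With the toy input above, taxon `B` is padded over the `nad1` columns and taxon `C` over the `cox1` columns, so all rows of the supermatrix have the same length.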

Keywords: GenBank extraction; batch capacity; high efficiency; multicore operation; partition analysis; phylogenetic flowchart.

Publication types

  • Evaluation Study

MeSH terms

  • Computational Biology / instrumentation
  • Computational Biology / methods*
  • Data Management
  • Databases, Nucleic Acid
  • Molecular Sequence Data
  • Phylogeny
  • Software
  • Workflow