PhySpeTree: an automated pipeline for reconstructing phylogenetic species trees

BMC Evol Biol. 2019 Dec 2;19(1):219. doi: 10.1186/s12862-019-1541-x.

Abstract

Background: Phylogenetic species trees are widely used to infer evolutionary relationships. Existing software and algorithms focus mainly on phylogenetic inference itself. Less attention has been paid to the intermediate steps, such as processing extremely large sequence datasets and preparing the configuration files that connect multiple software tools. When the number of species is large, these intermediate steps become a bottleneck that can seriously reduce the efficiency of tree building.

Results: Here, we present an easy-to-use pipeline named PhySpeTree to facilitate the reconstruction of species trees across bacterial, archaeal, and eukaryotic organisms. Users need only input the abbreviations of species names; PhySpeTree then prepares the complex configuration files required by the different software tools, automatically downloads genomic data, cleans sequences, and builds trees. Through advanced options, users can also adjust how critical steps such as sequence alignment and tree construction are performed. PhySpeTree provides two parallel pipelines, one based on concatenated highly conserved proteins and the other on small subunit ribosomal RNA sequences. Accessory modules, such as those for inserting new species, generating visualization configurations, and combining trees, are distributed along with PhySpeTree.
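
The concatenation step behind the first pipeline can be illustrated with a minimal sketch. This is not PhySpeTree's own code: the function name `concatenate_alignments`, the gap-filling rule for species missing a protein, and the example abbreviations and sequences are all assumptions made for illustration. It assumes each conserved protein has already been aligned separately, and joins the per-protein alignments into one long row per species.

```python
from typing import Dict, List


def concatenate_alignments(
    alignments: List[Dict[str, str]], gap: str = "-"
) -> Dict[str, str]:
    """Join per-protein alignments into one concatenated row per species.

    Hypothetical helper for illustration only. Each alignment maps a
    species abbreviation to its aligned sequence. A species absent from
    an alignment is padded with gap characters (an assumed convention).
    """
    # Collect every species seen in any alignment, in a stable order.
    species = sorted({name for aln in alignments for name in aln})
    parts: Dict[str, List[str]] = {name: [] for name in species}

    for aln in alignments:
        # All sequences in one alignment must share the same width.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        width = lengths.pop()
        for name in species:
            parts[name].append(aln.get(name, gap * width))

    return {name: "".join(chunks) for name, chunks in parts.items()}


# Illustrative input: two toy protein alignments, three species abbreviations.
toy_alignments = [
    {"eco": "MKV-", "hsa": "MRVL"},
    {"eco": "AAG", "sce": "A-G"},
]
supermatrix = concatenate_alignments(toy_alignments)
```

Every row in the result has the same length (the sum of the individual alignment widths), which is the shape that tree-building software generally expects from a concatenated alignment.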

Conclusions: Together with its accessory modules, PhySpeTree significantly simplifies tree reconstruction. PhySpeTree is implemented in Python and runs on modern operating systems (Linux, macOS, and Windows). The source code is freely available with detailed documentation (https://github.com/yangfangs/physpetools).

Keywords: Automatic construction; Pipeline; Species tree.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Archaea / classification
  • Archaea / genetics
  • Bacteria / classification
  • Bacteria / genetics
  • Base Sequence
  • Biological Evolution
  • Eukaryota / classification
  • Eukaryota / genetics
  • Genomics
  • Phylogeny*
  • RNA, Ribosomal / genetics
  • Sequence Alignment
  • Software*

Substances

  • RNA, Ribosomal