Phylobook: a tool for display, clade annotation and extraction of sequences from molecular phylogenies

Biotechniques. 2024 May 3. doi: 10.2144/btn-2023-0056. Online ahead of print.

Abstract

As the volume of sequence data from variable pathogens increases, analyzing, annotating and extracting specific taxa for study become more difficult. To meet these challenges for datasets with hundreds to thousands of taxa, 'Phylobook' was developed. Starting with a sequence alignment file, Phylobook generates and displays phylogenetic trees adjacent to highlighter plots showing the position of mutations, and allows the user to identify lineages and recombinants, and to annotate and export selected subsets of sequences for downstream analysis. Accurate lineage assignment, which is difficult to automate, is aided by annotations created by different clustering methods. Phylobook combines web-based display with automated clustering and manual editing, allowing expert assessment and correction of lineage assignments and extraction of sequences for downstream analysis.
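
The two ideas at the core of this workflow, a highlighter plot and lineage-based extraction, can be illustrated with a minimal sketch. This is not Phylobook's implementation: the function names `mismatch_positions` and `extract_by_lineage`, the toy alignment, and the lineage labels are all hypothetical, and the first sequence is assumed to serve as the master sequence against which mutations are marked. Gap handling is ignored for brevity.

```python
from typing import Dict, List, Tuple


def mismatch_positions(master: str, seq: str) -> List[Tuple[int, str]]:
    """Return (1-based position, base) for each site where seq differs from master.

    These are the marks a highlighter plot would draw along one sequence row.
    """
    if len(master) != len(seq):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, b) for i, (a, b) in enumerate(zip(master, seq)) if a != b]


def extract_by_lineage(
    alignment: Dict[str, str], lineage_of: Dict[str, str], lineage: str
) -> Dict[str, str]:
    """Return the subset of aligned sequences annotated with the given lineage."""
    return {
        name: seq
        for name, seq in alignment.items()
        if lineage_of.get(name) == lineage
    }


# Hypothetical toy alignment; real datasets hold hundreds to thousands of taxa.
alignment = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTACGA",
    "seq3": "ATGTACGA",
}

# Assumption: the first sequence acts as the master for comparison.
master = alignment["seq1"]
highlighter = {
    name: mismatch_positions(master, seq) for name, seq in alignment.items()
}

# Hypothetical lineage annotations, e.g. from clustering plus manual correction.
lineage_of = {"seq1": "A", "seq2": "A", "seq3": "B"}
subset = extract_by_lineage(alignment, lineage_of, "A")
```

Here `highlighter` maps each sequence name to the mutation marks for its row, and `subset` holds only the sequences assigned to lineage "A", ready to be written out for downstream analysis.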

Keywords: highlighter plot; lineage assignment; phylogeny; sequence annotation; sequence extraction.