Plastaumatic: Automating plastome assembly and annotation

Front Plant Sci. 2022 Nov 3:13:1011948. doi: 10.3389/fpls.2022.1011948. eCollection 2022.

Abstract

Plastome sequence data are most often extracted from plant whole genome sequencing data and need to be assembled and annotated separately from the nuclear genome sequence. In projects comprising multiple genomes, processing the plastomes individually is labour-intensive because it requires many steps and software tools. This study developed Plastaumatic, an automated pipeline for both assembly and annotation of plastomes, designed so that the researcher can load whole genome sequence data with minimal manual input and consequently obtain a shorter overall runtime. The current pipeline comprises trimming of adaptor and low-quality sequences using fastp, de novo plastome assembly using NOVOPlasty, standardization and quality checking of the assembled genomes through a custom script utilizing BLAST+ and SAMtools, annotation of the assembled genomes using AnnoPlast, and finally generation of the files required for NCBI GenBank submission. The pipeline is demonstrated with 12 potato accessions and three soybean accessions.

Keywords: chloroplast genome; organellar genome; plant; plastome; sequence assembly.
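
The order of stages named in the abstract can be illustrated with a minimal dry-run sketch. It only builds command lines and does not execute anything. It is not the Plastaumatic implementation: the function name, file names, directory layout, and the choice of `samtools faidx` and `blastn` calls for the standardization step are assumptions made for illustration, and the invocations for AnnoPlast and the GenBank file generation are left empty because the abstract does not describe them.

```python
from pathlib import Path
from typing import List, Optional, Tuple

Step = Tuple[str, Optional[List[str]]]


def build_pipeline_steps(sample: str, r1: str, r2: str,
                         reference: str, outdir: str) -> List[Step]:
    """Return (stage name, command) pairs in the order given in the abstract.

    Hypothetical helper: all output paths below are illustrative only.
    """
    out = Path(outdir) / sample
    trimmed_r1 = str(out / "trimmed_R1.fastq.gz")
    trimmed_r2 = str(out / "trimmed_R2.fastq.gz")
    # Assumed: a NOVOPlasty config file prepared separately for this sample.
    config = str(out / "novoplasty_config.txt")
    # Assumed name for the assembled plastome FASTA.
    assembly = str(out / "assembly.fasta")

    return [
        # Adaptor and low-quality trimming of paired-end reads.
        ("trim", ["fastp", "-i", r1, "-I", r2,
                  "-o", trimmed_r1, "-O", trimmed_r2]),
        # De novo plastome assembly driven by the config file.
        ("assemble", ["perl", "NOVOPlasty.pl", "-c", config]),
        # Assumed pieces of the standardization/quality check:
        # index the assembly, then align it to a reference plastome.
        ("index", ["samtools", "faidx", assembly]),
        ("compare", ["blastn", "-query", assembly,
                     "-subject", reference, "-outfmt", "6"]),
        # Invocations not given in the abstract, so left unspecified.
        ("annotate", None),
        ("genbank_files", None),
    ]


if __name__ == "__main__":
    steps = build_pipeline_steps("sample1", "sample1_R1.fastq.gz",
                                 "sample1_R2.fastq.gz", "reference.fasta", "results")
    for name, cmd in steps:
        print(name, "->", " ".join(cmd) if cmd else "(not specified in the abstract)")
```

Keeping each stage as a named command list makes it easy to loop over many accessions, which is the multi-genome use case the abstract describes.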