GAEP: a comprehensive genome assembly evaluating pipeline

J Genet Genomics. 2023 May 26:S1673-8527(23)00119-4. doi: 10.1016/j.jgg.2023.05.009. Online ahead of print.

Abstract

With the rapid development of sequencing technologies, especially the maturation of third-generation sequencing, the number and quality of published genome assemblies have increased significantly. The emergence of these high-quality genomes has raised the requirements for genome evaluation. Although numerous computational methods have been developed to evaluate assembly quality from various perspectives, the choice of which methods to apply is often arbitrary, which makes it inconvenient to compare assembly quality fairly. To address this issue, we have developed the Genome Assembly Evaluating Pipeline (GAEP), a comprehensive pipeline that evaluates genome quality from multiple perspectives, including continuity, completeness, and correctness. Additionally, GAEP includes new functions for detecting misassemblies and evaluating assembly redundancy, which perform well in our testing. GAEP is publicly available at https://github.com/zy-optimistic/GAEP under the GPL-3.0 license. With GAEP, users can quickly obtain accurate and reliable evaluation results, facilitating the comparison and selection of high-quality genome assemblies.

Keywords: Assembly evaluation pipeline; Assembly metrics; Assembly quality; Assembly redundancy; Genome assembly; Misassembly breakpoint; Misassembly detection.