CoreDetector: a flexible and efficient program for core-genome alignment of evolutionary diverse genomes

Bioinformatics. 2023 Nov 1;39(11):btad628. doi: 10.1093/bioinformatics/btad628.

Abstract

Motivation: Whole genome alignment of eukaryote species remains an important method for the determination of sequence and structural variations and can also be used to ascertain the representative non-redundant core-genome sequence of a population. Many whole genome alignment tools were first developed for the more mature analysis of prokaryote species with few current tools containing the functionality to process larger genomes of eukaryotes as well as genomes of more divergent species. In addition, the functionality of these tools becomes computationally prohibitive due to the significant compute resources needed to handle larger genomes.

Results: In this research, we present CoreDetector, an easy-to-use general-purpose program that can align the core-genome sequences for a range of genome sizes and divergence levels. To illustrate the flexibility of CoreDetector, we conducted alignments of a large set of closely related fungal pathogen and hexaploid wheat cultivar genomes as well as more divergent fly and rodent species genomes. In all cases, compared to existing multiple genome alignment tools, CoreDetector exhibited improved flexibility, efficiency, and competitive accuracy in tested cases.

Availability and implementation: CoreDetector was developed in the cross platform, and easily deployable, Java language. A packaged pipeline is readily executable in a bash terminal without any external need for Perl or Python environments. Installation, example data, and usage instructions for CoreDetector are freely available from https://github.com/mfruzan/CoreDetector.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genome
  • Genomics* / methods
  • Sequence Alignment
  • Software*