Compare_Genomes: A Comparative Genomics Workflow to Streamline the Analysis of Evolutionary Divergence Across Eukaryotic Genomes

Curr Protoc. 2023 Aug;3(8):e876. doi: 10.1002/cpz1.876.

Abstract

The advent of cost-effective genome assembly is enabling deep comparative genomics, in which the genomes of multiple species are compared to address fundamental evolutionary questions. However, comparative genomics analyses frequently deploy multiple, often purpose-built frameworks, which limits their transferability and replicability. Here, we present compare_genomes, a transferable and extensible comparative genomics workflow that streamlines the identification of orthologous families within and across eukaryotic genomes and tests for the presence of several mechanisms of evolution (gene family expansion or contraction, and substitution rates within protein-coding sequences). The workflow runs on Linux and is written in Nextflow; it calls established genomics and phylogenetics tools to streamline the analysis and visualization of eukaryotic genome divergence. It is freely available at https://github.com/jeffersonfparil/compare_genomes and is distributed under the GNU General Public License version 3 (GPLv3).

© 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.

Basic Protocol: Comparative genomics with Nextflow and Conda.
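
To make the idea of gene family expansion or contraction concrete, the following minimal Python sketch labels an orthologous family by comparing its gene count in one focal species against the median count in the other species. This is a toy heuristic under stated assumptions, not the statistical test performed by compare_genomes, which delegates such analyses to established tools. The species names, family names, counts, the `classify_family` function, and the two-fold threshold are all hypothetical.

```python
from statistics import median

# Hypothetical gene counts per orthologous family and species (illustrative only).
counts = {
    "family_A": {"sp1": 12, "sp2": 3, "sp3": 4, "sp4": 3},
    "family_B": {"sp1": 1, "sp2": 4, "sp3": 5, "sp4": 4},
    "family_C": {"sp1": 5, "sp2": 5, "sp3": 6, "sp4": 4},
}


def classify_family(per_species, focal, fold=2.0):
    """Label a family as expanded, contracted, or stable in the focal species.

    The baseline is the median gene count across all non-focal species.
    The fold threshold is an arbitrary assumption for illustration.
    """
    others = [n for species, n in per_species.items() if species != focal]
    baseline = median(others)
    focal_count = per_species[focal]
    if baseline == 0:
        # No copies elsewhere: any copy in the focal species counts as a gain.
        return "expanded" if focal_count > 0 else "stable"
    ratio = focal_count / baseline
    if ratio >= fold:
        return "expanded"
    if ratio <= 1.0 / fold:
        return "contracted"
    return "stable"


# Classify every family relative to the hypothetical focal species "sp1".
labels = {name: classify_family(row, focal="sp1") for name, row in counts.items()}
for name, label in labels.items():
    print(name, label)
```

A real analysis would account for the species phylogeny and model gene gain and loss along its branches rather than apply a fixed fold threshold; the sketch only illustrates what "expansion" and "contraction" mean at the level of per-species gene counts.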

Keywords: bioinformatics; comparative genomics; data science; eukaryotic genomes; workflow management.

MeSH terms

  • Biological Evolution
  • Eukaryota* / classification
  • Eukaryota* / genetics
  • Genomics* / methods
  • Software*
  • Workflow