Synteny Identifies Reliable Orthologs for Phylogenomics and Comparative Genomics of the Brassicaceae

Genome Biol Evol. 2023 Mar 3;15(3):evad034. doi: 10.1093/gbe/evad034.

Abstract

Large genomic data sets are becoming the new normal in phylogenetic research, but the identification of true orthologous genes and the exclusion of problematic paralogs is still challenging when applying commonly used sequencing methods such as target enrichment. Here, we compared conventional ortholog detection using OrthoFinder with ortholog detection through genomic synteny in a data set of 11 representative diploid Brassicaceae whole-genome sequences spanning the entire phylogenetic space. Then, we evaluated the resulting gene sets regarding gene number, functional annotation, and gene and species tree resolution. Finally, we used the syntenic gene sets for comparative genomics and ancestral genome analysis. The use of synteny resulted in considerably more orthologs and also allowed us to reliably identify paralogs. Surprisingly, we did not detect notable differences between species trees reconstructed from syntenic orthologs when compared with other gene sets, including the Angiosperms353 set and a Brassicaceae-specific target enrichment gene set. However, the synteny data set comprised a multitude of gene functions, strongly suggesting that this method of marker selection for phylogenomics is suitable for studies that value downstream gene function analysis, gene interaction, and network studies. Finally, we present the first ancestral genome reconstruction for the Core Brassicaceae which predating the Brassicaceae lineage diversification ∼25 million years ago.

Keywords: Angiosperms353; OrthoFinder; ancestral genome; ortholog; paralog; phylogenomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brassicaceae* / genetics
  • Genome
  • Genomics / methods
  • Phylogeny
  • Synteny