Expression Pattern Similarities Support the Prediction of Orthologs Retaining Common Functions after Gene Duplication Events

Plant Physiol. 2016 Aug;171(4):2343-57. doi: 10.1104/pp.15.01207. Epub 2016 Jun 14.

Abstract

The identification of functionally equivalent, orthologous genes (functional orthologs) across genomes is necessary for accurate transfer of experimental knowledge from well-characterized organisms to others. Such identification frequently relies on automated, coding sequence-based approaches such as OrthoMCL, Inparanoid, and KOG, which usually work well for one-to-one homologous states. However, this strategy does not work reliably for plants because of extensive gene and genome duplication. Frequently, multiple orthologous genes are predicted in the other genome for a single query gene, and it is not clear a priori from sequence comparison and similarity which one preserves the ancestral function. We studied 11 organ-dependent and stress-induced gene expression patterns of 286 Arabidopsis lyrata duplicated gene groups and compared them with those of the respective Arabidopsis (Arabidopsis thaliana) genes to predict putative expressologs and nonexpressologs based on gene expression similarity. Promoter sequence divergence, used as an additional tool to substantiate functional orthology, only partially overlapped with the expressolog classification. By cloning eight A. lyrata homologs and using them to complement the four respective Arabidopsis loss-of-function mutants, we showed experimentally that predicted expressologs are indeed functional orthologs, while nonexpressologs or nonfunctionalized orthologs are not. Our study demonstrates that even a small set of gene expression data, in addition to sequence homology, is instrumental in assigning functional orthologs when multiple orthologs are present.
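The expression-based comparison described above can be illustrated with a minimal sketch. It assumes that expression similarity is scored as a Pearson correlation across the 11 conditions and that the best-scoring co-ortholog above a cutoff is labeled the putative expressolog. The abstract does not state the actual similarity measure or any cutoff, so the function names, the 0.7 threshold, the gene labels, and the profile values are all hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        # A flat profile carries no pattern information; treat as uncorrelated.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def classify_co_orthologs(query_profile, candidates, threshold=0.7):
    """Score each candidate co-ortholog against the query gene's profile.

    The best-scoring candidate is labeled "expressolog" if its correlation
    reaches the (assumed, arbitrary) threshold; all others are labeled
    "nonexpressolog". Returns (scores, labels) as two dicts keyed by name.
    """
    scores = {name: pearson(query_profile, profile)
              for name, profile in candidates.items()}
    best = max(scores, key=scores.get)
    labels = {
        name: "expressolog" if name == best and r >= threshold
        else "nonexpressolog"
        for name, r in scores.items()
    }
    return scores, labels


# Synthetic profiles over 11 conditions (illustrative values, not real data).
thaliana_gene = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
lyrata_copies = {
    "lyrata_copy_A": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22],  # same pattern
    "lyrata_copy_B": [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],       # inverted pattern
}

scores, labels = classify_co_orthologs(thaliana_gene, lyrata_copies)
for name in lyrata_copies:
    print(name, round(scores[name], 3), labels[name])
```

In this toy case the copy whose profile tracks the A. thaliana gene is labeled the putative expressolog, even though sequence similarity alone would not separate the two copies. A real analysis would need normalized expression data and a justified similarity measure and cutoff.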

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis / physiology
  • Arabidopsis Proteins / genetics
  • Arabidopsis Proteins / metabolism*
  • Gene Duplication*
  • Gene Expression Profiling
  • Genes, Duplicate / genetics
  • Mutation
  • Oligonucleotide Array Sequence Analysis
  • Organ Specificity
  • Plant Leaves / genetics
  • Plant Leaves / physiology
  • Plant Roots / genetics
  • Plant Roots / physiology
  • Promoter Regions, Genetic / genetics
  • Seedlings / genetics
  • Seedlings / physiology
  • Sequence Homology
  • Stress, Physiological

Substances

  • Arabidopsis Proteins