Homoeologous gene expression and co-expression network analyses and evolutionary inference in allopolyploids

Brief Bioinform. 2021 Mar 22;22(2):1819-1835. doi: 10.1093/bib/bbaa035.

Abstract

Polyploidy is widespread throughout eukaryotes. Because duplicated genomes coexist in the same nucleus, polyploids pose unique challenges for estimating gene expression levels, and accurate estimates are essential for understanding the extensive and varied transcriptomic responses that accompany polyploidy. Although previous studies have explored the bioinformatics of polyploid transcriptomic profiling, the causes and consequences of inaccurate quantification of transcripts from duplicated gene copies have not been addressed. Using transcriptomic data from the cotton genus (Gossypium) as an example, we present an analytical workflow to evaluate a variety of bioinformatic method choices at different stages of RNA-seq analysis, from homoeolog expression quantification to the downstream analyses used to infer key phenomena of polyploid expression evolution. In general, EAGLE-RC and GSNAP-PolyCat outperform the other quantification pipelines tested, and the expression datasets derived from them best represent the expected homoeolog expression and co-expression divergence. The performance of co-expression network analysis was affected less by the choice of homoeolog quantification method than by the choice of network construction method, with weighted networks outperforming binary networks. By examining the extent and consequences of homoeolog read ambiguity, we illuminate potential artifacts that may distort our understanding of duplicate gene expression, including an overestimation of homoeolog co-regulation and the incorrect inference of subgenome asymmetry in network topology. Taken together, our work points to a set of reasonable practices that we hope are broadly applicable to the evolutionary exploration of polyploids.
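
The distinction between weighted and binary co-expression networks can be illustrated with a minimal sketch. This is not the workflow used in the study: the toy expression matrix, the hard threshold of 0.8, the soft-thresholding power of 6 and all function names are assumptions chosen only for illustration. A binary network keeps an edge only when the absolute correlation passes a cutoff, whereas a weighted network retains a continuous connection strength for every gene pair.

```python
import numpy as np


def correlation_matrix(expr):
    """Pearson correlation between genes; expr has shape (genes, samples)."""
    return np.corrcoef(expr)


def binary_adjacency(corr, threshold=0.8):
    """Hard thresholding: edge = 1 if |r| >= threshold (assumed cutoff), else 0."""
    adj = (np.abs(corr) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj


def weighted_adjacency(corr, power=6):
    """Soft thresholding: edge weight = |r| ** power (assumed power)."""
    adj = np.abs(corr) ** power
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


# Hypothetical toy data: 4 genes measured in 12 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(4, 12))
# Make gene 1 closely track gene 0, as a strongly co-expressed pair would.
expr[1] = expr[0] + 0.1 * rng.normal(size=12)

corr = correlation_matrix(expr)
binary = binary_adjacency(corr)
weighted = weighted_adjacency(corr)

print("binary adjacency:\n", binary)
print("weighted adjacency:\n", np.round(weighted, 3))
```

In the binary matrix, every gene pair below the cutoff is treated identically as unconnected, while the weighted matrix preserves graded differences among those pairs.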

Keywords: RNA-seq; allopolyploid; co-expression gene network; differential expression; homoeolog-specific read partitioning.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Datasets as Topic
  • Evolution, Molecular*
  • Gene Expression Regulation, Plant*
  • Genes, Plant
  • Gossypium / genetics
  • Polyploidy*
  • RNA, Messenger / genetics
  • Sequence Analysis, RNA / methods

Substances

  • RNA, Messenger