Mining transcriptomic data to study the origins and evolution of a plant allopolyploid complex

PeerJ. 2014 May 20;2:e391. doi: 10.7717/peerj.391. eCollection 2014.

Abstract

Allopolyploidy combines two progenitor genomes in the same nucleus. It is a common speciation process, especially in plants. Deciphering the origins of polyploid species is a complex problem due to, among other things, extinct progenitors, multiple origins, gene flow between different polyploid populations, and loss of parental contributions through gene or chromosome loss. Among the perennial species of Glycine, the plant genus that includes the cultivated soybean (G. max), are eight allopolyploid species, three of which are studied here. Previous crossing studies and molecular systematic results from two nuclear gene sequences led to hypotheses of origin for these species from among extant diploid species. We use several phylogenetic and population genomics approaches to clarify the origins of the genomes of three of these allopolyploid species using single nucleotide polymorphism data and a guided transcriptome assembly. The results support the hypothesis that all three polyploid species are fixed hybrids combining the genomes of the two putative parents hypothesized on the basis of previous work. Based on mapping to the soybean reference genome, there appear to be no large regions for which one homoeologous contribution is missing. Phylogenetic analyses of 27 selected transcripts using a coalescent approach are also consistent with multiple origins for these allopolyploid species, and suggest that these origins occurred within the last several hundred thousand years.

Keywords: NGS; Phylogenetics; Polyploidy; Population genomics.

Grants and funding

We received longstanding support from the US National Science Foundation for our research on Glycine, most recently awards 0822258 and 0939423. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.