Peripheral structures in unlabelled trees and the accumulation of subgenomes in the evolution of polyploids

J Theor Biol. 2022 Jan 7:532:110924. doi: 10.1016/j.jtbi.2021.110924. Epub 2021 Oct 7.

Abstract

Many angiosperms have undergone some series of polyploidization events over the course of their evolutionary history. In these genomes, especially those resulting from multiple autopolyploidization, it may be relatively easy to recognize all the ξ sets of n homeologous chromosomes, but it is much harder, if not impossible, to partition these chromosomes into n subgenomes, each representing one distinct genomic component of ξ chromosomes making up the original polyploid. Thus, if we wish to infer the polyploidization history of the genome, we could make use of all the gene trees inferred from the genes in one set of homeologous chromosomes to construct a consensus tree, but there is no evident way of combining the trees from the ξ different sets, because we have no labelling of the chromosomes that is known to be consistent across these sets. We suggest here that lacking a consistent leaf-labelling, the topological structure of the trees may display sufficient resemblance so that a higher level consensus could be revealing of evolutionary history. This would be especially true of the peripheral structures of the tree, likely representing events that occurred more recently and have thus been less obscured by subsequent evolutionary processes. Here, we present a statistical test to assess whether the subgenomes in a polyploid genome could have been added one at a time. The null hypothesis is that the accumulation of chromosomes follows a stochastic process in which transition from one generation to the next is through randomly choosing an edge, and then subdividing this edge in order to link the new internal vertex to a new external vertex. We analyze the probability distributions of a number of peripheral tree substructures, namely leaf- or terminal-pairs, triples and quadruples, arising from this stochastic process, in terms of some exact recurrences. We propose some conjectures regarding the asymptotic behaviours of these distributions. Applying our analysis to a sugarcane genome, we demonstrate that it is unlikely that the accumulation of subgenomes has occurred one at a time.
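
The null process described in the abstract (choose an edge uniformly at random, subdivide it, and attach a new external vertex to the new internal vertex) can be illustrated with a minimal Python sketch. It is not the authors' code and does not reproduce their test or the sugarcane analysis. Several things here are assumptions made for illustration: the function names `grow_tree`, `count_terminal_pairs` and `exact_pair_distribution` are hypothetical; trees are taken to be unrooted and binary, starting from a single edge; and a "terminal pair" is taken to mean two leaves attached to the same internal vertex. The transition probabilities in `exact_pair_distribution` come from a simple count of edge types under that assumed definition, and may differ in form from the exact recurrences analysed in the paper, which also cover triples and quadruples.

```python
import random
from fractions import Fraction


def grow_tree(n_leaves, rng):
    """Grow an unrooted binary tree with n_leaves leaves by repeatedly
    subdividing a uniformly chosen edge and hanging a new leaf on it."""
    if n_leaves < 2:
        raise ValueError("need at least two leaves")
    edges = [(0, 1)]      # assumed starting point: two leaves joined by one edge
    leaves = {0, 1}
    next_id = 2
    for _ in range(n_leaves - 2):
        i = rng.randrange(len(edges))    # uniform choice among current edges
        u, v = edges[i]
        w, x = next_id, next_id + 1      # w: new internal vertex, x: new leaf
        next_id += 2
        edges[i] = (u, w)                # edge (u, v) becomes (u, w) and (w, v)
        edges.append((w, v))
        edges.append((w, x))             # attach the new external vertex
        leaves.add(x)
    return edges, leaves


def count_terminal_pairs(edges, leaves):
    """Count internal vertices adjacent to exactly two leaves
    (the assumed definition of a terminal pair; meaningful for >= 4 leaves)."""
    leaf_neighbours = {}
    for u, v in edges:
        if u in leaves and v not in leaves:
            leaf_neighbours[v] = leaf_neighbours.get(v, 0) + 1
        elif v in leaves and u not in leaves:
            leaf_neighbours[u] = leaf_neighbours.get(u, 0) + 1
    return sum(1 for k in leaf_neighbours.values() if k == 2)


def exact_pair_distribution(n_leaves):
    """Exact distribution of the number of terminal pairs under the process.

    A tree with n leaves and c terminal pairs has 2n - 3 edges. Only the
    n - 2c pendant edges whose leaf is not in a pair create a new pair when
    subdivided; every other choice leaves the count unchanged."""
    if n_leaves < 4:
        raise ValueError("defined here for at least four leaves")
    dist = {2: Fraction(1)}   # the only 4-leaf shape has two terminal pairs
    for n in range(4, n_leaves):
        nxt = {}
        for c, p in dist.items():
            up = Fraction(n - 2 * c, 2 * n - 3)
            if up > 0:
                nxt[c + 1] = nxt.get(c + 1, 0) + p * up
            nxt[c] = nxt.get(c, 0) + p * (1 - up)
        dist = nxt
    return dist
```

Comparing a histogram of `count_terminal_pairs` over many simulated trees with `exact_pair_distribution` for the same number of leaves gives a quick consistency check between the simulation and the edge-count argument. An observed pair count far in the tail of that distribution would be the kind of evidence against one-at-a-time accumulation that the abstract describes.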

Keywords: Comparative genomics; Peripheral structures; Plant phylogeny; Polyploidization; Recurrence; Unlabelled trees.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Magnoliopsida*
  • Phylogeny
  • Polyploidy*