Distinguishing terminal monophyletic groups from reticulate taxa: performance of phenetic, tree-based, and network procedures

Syst Biol. 2007 Apr;56(2):302-20. doi: 10.1080/10635150701324225.

Abstract

Hybridization is a well-documented, natural phenomenon that is common at low taxonomic levels in the higher plants and other groups. In spite of the obvious potential for gene flow via hybridization to cause reticulation in an evolutionary tree, analytical methods based on a strictly bifurcating model of evolution have frequently been applied to data sets containing taxa known to hybridize in nature. Using simulated data, we evaluated the relative performance of phenetic, tree-based, and network approaches for distinguishing between taxa with known reticulate history and taxa that were true terminal monophyletic groups. In all methods examined, type I error (the erroneous rejection of the null hypothesis that a taxon of interest is not monophyletic) was likely during the early stages of introgressive hybridization. We used the gradual erosion of type I error with continued gene flow as a metric for assessing relative performance. Bifurcating tree-based methods performed poorly, with highly supported, incorrect topologies appearing during some phases of the simulation. Based on our model, we estimate that many thousands of gene flow events may be required in natural systems before reticulate taxa will be reliably detected using tree-based methods of phylogeny reconstruction. We conclude that the use of standard bifurcating tree-based methods to identify terminal monophyletic groups for the purposes of defining or delimiting phylogenetic species, or for prioritizing populations for conservation purposes, is difficult to justify when gene flow between sampled taxa is possible. As an alternative, we explored the use of two network methods. Minimum spanning networks performed worse than most tree-based methods and did not yield topologies that were easily interpretable as phylogenies. The performance of NeighborNet was comparable to that of parsimony bootstrap analysis. NeighborNet and reverse successive weighting were capable of identifying an ephemeral signature of reticulate evolution during the early stages of introgression by revealing conflicting phylogenetic signal. However, when gene flow was topologically complex, the conflicting phylogenetic signal revealed by these methods resulted in a high probability of type II error (inferring that a monophyletic taxon has a reticulate history). Lastly, we present a novel application of an existing nonparametric clustering procedure that, when used against a density landscape derived from principal coordinate data, showed superior performance to the tree-based and network procedures tested.
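
The two error types defined above are easy to confuse, because the null hypothesis is that a taxon is not monophyletic. The sketch below tallies both error rates under that framing. It is only an illustration: the function name `error_rates`, the labels, and the toy data are assumptions made here and do not come from the study.

```python
# Illustrative sketch only; the function name, labels, and toy data are
# hypothetical and are not taken from the paper.
# Null hypothesis (as stated in the abstract): the taxon of interest is
# NOT monophyletic, i.e. it has a reticulate history.

def error_rates(truth, inferred):
    """Return (type_I_rate, type_II_rate) for paired labels.

    truth, inferred: sequences of "reticulate" or "monophyletic".
    Type I:  a truly reticulate taxon is inferred to be monophyletic
             (the null hypothesis is wrongly rejected).
    Type II: a truly monophyletic taxon is inferred to be reticulate
             (the null hypothesis is wrongly retained).
    """
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must have the same length")

    pairs = list(zip(truth, inferred))
    n_reticulate = sum(1 for t, _ in pairs if t == "reticulate")
    n_monophyletic = sum(1 for t, _ in pairs if t == "monophyletic")

    type_i = sum(1 for t, i in pairs
                 if t == "reticulate" and i == "monophyletic")
    type_ii = sum(1 for t, i in pairs
                  if t == "monophyletic" and i == "reticulate")

    # Guard against division by zero when a class is absent.
    rate_i = type_i / n_reticulate if n_reticulate else 0.0
    rate_ii = type_ii / n_monophyletic if n_monophyletic else 0.0
    return rate_i, rate_ii


# Toy data: four truly reticulate taxa and two truly monophyletic taxa.
truth = ["reticulate", "reticulate", "reticulate", "reticulate",
         "monophyletic", "monophyletic"]
inferred = ["monophyletic", "monophyletic", "reticulate", "reticulate",
            "monophyletic", "reticulate"]

type_i_rate, type_ii_rate = error_rates(truth, inferred)
print(type_i_rate, type_ii_rate)
```

In a simulation of the kind described, the type I rate would be recomputed as gene flow accumulates, so that its decline over time can be compared across methods.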

MeSH terms

  • Classification / methods
  • Computer Simulation
  • Evolution, Molecular
  • Gene Flow*
  • Hybridization, Genetic
  • Models, Genetic
  • Phylogeny*