Supermatrices, supertrees and serendipitous scaffolding: inferring a well-resolved, genus-level phylogeny of Styphelioideae (Ericaceae) despite missing data

Mol Phylogenet Evol. 2012 Jan;62(1):146-58. doi: 10.1016/j.ympev.2011.09.011. Epub 2011 Sep 22.

Abstract

For the predominantly Southern Hemisphere plant group Styphelioideae (Ericaceae), published sequence datasets of five markers are now available for all except one of the 38 recognised genera. However, several markers are highly incomplete, so missing data are problematic for producing a genus-level phylogeny. We explore the relative utility of supertree and supermatrix approaches for addressing this challenge, and examine the effects of missing data on tree topology and resolution. Although the supertree approach returned a more conservative hypothesis, the supermatrix and supertree analyses concurred overall in the topologies they returned. Using multiple genes and a dataset of variably complete taxa, we found improved support for the monophyly and position of the tribes and for genus-level relationships. However, there was mixed support for the placement of the tribes Richeeae and Cosmelieae, with Richeeae appearing one node basal to Cosmelieae or vice versa. It is probable that this will only be resolved through further sequencing. Our study supports previous findings that the amount of data is more critical than the completeness of the dataset in estimating well-resolved trees. Our results suggest that a "serendipitous" scaffolding approach that includes a mixture of well and poorly sequenced taxa can lead to robust phylogenetic hypotheses.
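
As an illustration of the supermatrix idea discussed above, the following minimal Python sketch concatenates per-marker alignments into a single matrix and pads any taxon that lacks a marker with a missing-data symbol. It is not the authors' pipeline: the function names (`build_supermatrix`, `completeness`), the marker and taxon labels, and the toy sequences are all hypothetical, and the use of `?` for missing data is an assumption based on common alignment conventions.

```python
from typing import Dict


def build_supermatrix(markers: Dict[str, Dict[str, str]],
                      missing: str = "?") -> Dict[str, str]:
    """Concatenate aligned markers; pad taxa absent from a marker with `missing`."""
    # Union of all taxa seen in any marker (hypothetical labels in the demo).
    taxa = sorted({taxon for aln in markers.values() for taxon in aln})
    matrix = {taxon: "" for taxon in taxa}
    # Process markers in sorted name order so the column layout is reproducible.
    for name in sorted(markers):
        aln = markers[name]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"marker {name!r} is empty or not aligned")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon not sequenced for this marker gets a block of missing symbols.
            matrix[taxon] += aln.get(taxon, missing * width)
    return matrix


def completeness(matrix: Dict[str, str], missing: str = "?") -> Dict[str, float]:
    """Fraction of non-missing columns for each taxon in the supermatrix."""
    return {taxon: 1 - seq.count(missing) / len(seq)
            for taxon, seq in matrix.items()}


if __name__ == "__main__":
    # Invented toy data: taxon2 has no sequence for markerB.
    toy = {
        "markerA": {"taxon1": "ACGT", "taxon2": "ACGA", "taxon3": "ACGG"},
        "markerB": {"taxon1": "TTGCA", "taxon3": "TTGCC"},
    }
    supermatrix = build_supermatrix(toy)
    for taxon, seq in supermatrix.items():
        print(taxon, seq, round(completeness(supermatrix)[taxon], 2))
```

In this sketch a poorly sequenced taxon still enters the analysis, anchored by whichever markers it shares with better-sampled taxa, which is the sense in which a mixture of well and poorly sequenced taxa can act as a scaffold.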

MeSH terms

  • Bayes Theorem
  • DNA, Ribosomal Spacer / genetics
  • Ericaceae / classification*
  • Ericaceae / genetics*
  • Likelihood Functions
  • Models, Genetic
  • Multilocus Sequence Typing
  • Phylogeny*
  • Plant Proteins / genetics
  • RNA, Ribosomal, 18S / genetics
  • Sequence Alignment

Substances

  • DNA, Ribosomal Spacer
  • Plant Proteins
  • RNA, Ribosomal, 18S