Collecting reliable clades using the Greedy Strict Consensus Merger

PeerJ. 2016 Jun 28;4:e2172. doi: 10.7717/peerj.2172. eCollection 2016.

Abstract

Supertree methods combine a set of phylogenetic trees into a single supertree. Similar to supermatrix methods, these methods provide a way to reconstruct larger parts of the Tree of Life, potentially evading the computational complexity of phylogenetic inference methods such as maximum likelihood. The supertree problem can be formalized in different ways to cope with contradictory information in the input. Many supertree methods have been developed. Some of them solve NP-hard optimization problems, like the well-known Matrix Representation with Parsimony, while others, such as FlipCut, have polynomial worst-case running time but work in a greedy fashion. Both can profit from a set of clades that are already known to be part of the supertree. The Superfine approach shows how the Greedy Strict Consensus Merger (GSCM) can be used as preprocessing to find these clades. We introduce different scoring functions for the GSCM, a randomization, and a combination of both, so that the GSCM finds more clades. This, in turn, improves the resolution of the GSCM supertree. We find that these modifications increase the number of true positive clades by 18% compared to the currently used Overlap scoring.

Keywords: Consensus; Divide and Conquer; FlipCut; Phylogeny; Supermatrix; Supertree.

Grants and funding

Markus Fleischauer is supported by Deutsche Forschungsgemeinschaft, project BO 1910/12. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.