Finding a most parsimonious or likely tree in a network with respect to an alignment

J Math Biol. 2019 Jan;78(1-2):527-547. doi: 10.1007/s00285-018-1282-2. Epub 2018 Aug 19.

Abstract

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A), even when the level of N (the maximum number of reticulation nodes within a biconnected component) is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed, the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony, in which it is shown that if A consists of a single column the problem is fixed-parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problems can likely be solved well in practice.
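
To make the parsimony problem concrete, the following is a minimal brute-force sketch, not the method of the paper. It assumes a binary rooted network given as a parent-to-children map, enumerates every displayed tree by keeping exactly one incoming edge per reticulation node, and scores each tree against the alignment with Fitch's small-parsimony algorithm. All names (`displayed_trees`, `fitch_column`, `most_parsimonious_displayed_tree`) and the toy network and alignment are hypothetical illustrations. The enumeration visits up to 2^k trees for k reticulations, so it is only usable on small inputs.

```python
from itertools import product

def displayed_trees(children):
    """Yield one child-map per way of keeping a single parent edge per reticulation."""
    parents = {}
    for u, kids in children.items():
        for v in kids:
            parents.setdefault(v, []).append(u)
    retics = [v for v, ps in parents.items() if len(ps) > 1]
    for choice in product(*(parents[r] for r in retics)):
        keep = dict(zip(retics, choice))  # reticulation -> the parent it keeps
        yield {u: [v for v in kids if keep.get(v, u) == u]
               for u, kids in children.items()}

def fitch_column(tree, root, states):
    """Fitch parsimony cost of one alignment column; states maps taxon -> state."""
    def visit(u):
        if u in states:                      # labelled leaf
            return {states[u]}, 0
        live = [r for r in (visit(v) for v in tree.get(u, [])) if r is not None]
        if not live:                         # dead end created by a deleted edge
            return None
        s, cost = live[0]                    # a single live child passes through
        for t, c in live[1:]:                # binary network: at most one more
            cost += c
            if s & t:
                s = s & t
            else:
                s, cost = s | t, cost + 1
        return s, cost
    return visit(root)[1]

def parsimony_score(tree, root, alignment):
    """Sum the Fitch cost over all columns of the alignment."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_column(tree, root, {x: seq[i] for x, seq in alignment.items()})
        for i in range(length)
    )

def most_parsimonious_displayed_tree(children, root, alignment):
    """Exhaustively pick a displayed tree with minimum parsimony score."""
    scored = ((parsimony_score(t, root, alignment), t) for t in displayed_trees(children))
    return min(scored, key=lambda pair: pair[0])

# Toy level-1 network on taxa a, b, c, d; node "r" is the only reticulation.
network = {
    "root": ["p", "q"],
    "p": ["a", "r"],
    "q": ["s", "d"],
    "s": ["r", "c"],
    "r": ["b"],
}
# Toy alignment with 2 states and 3 columns.
alignment = {"a": "001", "b": "110", "c": "111", "d": "000"}

score, tree = most_parsimonious_displayed_tree(network, "root", alignment)
print(score, tree)
```

On this toy input the two displayed trees correspond to the groupings ab|cd and bc|ad; the first two columns favour the latter, so the search keeps the edge from `s` to `r`. The hardness results in the abstract indicate that, in the worst case, no polynomial-time algorithm for this search is expected even for level-1 networks.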

Keywords: APX-hardness; Maximum likelihood; Maximum parsimony; NP-hardness; Phylogenetic network; Phylogenetic tree.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology
  • Evolution, Molecular
  • Genetic Speciation
  • Humans
  • Mathematical Concepts
  • Models, Genetic*
  • Phylogeny*