Finding a most parsimonious or likely tree in a network with respect to an alignment

Steven Kelk; Fabio Pardi; Celine Scornavacca; Leo van Iersel

doi:10.1007/s00285-018-1282-2

J Math Biol. 2019 Jan;78(1-2):527-547. doi: 10.1007/s00285-018-1282-2. Epub 2018 Aug 19.

Authors

Steven Kelk¹, Fabio Pardi², Celine Scornavacca³, Leo van Iersel⁴

Affiliations

¹ Department of Data Science and Knowledge Engineering (DKE), Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands. steven.kelk@maastrichtuniversity.nl.
² LIRMM, Université de Montpellier, CNRS, Montpellier, France.
³ Institut des Sciences de l'Evolution, CNRS, IRD, EPHE, Institut de Biologie Computationnelle (IBC), Université de Montpellier, 34095, Montpellier Cedex 5, France.
⁴ Delft Institute of Applied Mathematics, Delft University of Technology, Van Mourik Broekmanweg 6, 2628 XE, Delft, The Netherlands.

Abstract

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A), even when the level of N (the maximum number of reticulation nodes within a biconnected component) is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed, the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony, in which it is shown that if A consists of a single column the problem is fixed-parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and the likelihood problem can likely be solved well in practice.
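
To make the parsimony problem concrete, the following Python sketch enumerates every tree displayed by a small network and scores each one against an alignment with Fitch's algorithm. It is a brute-force illustration only, not the construction or any algorithm from the paper. The function names (`displayed_trees`, `fitch_column`, `most_parsimonious_tree`), the node names and the toy alignment are all assumptions introduced here. The sketch further assumes a binary network whose reticulations each have two distinct parents, and an alignment without gaps. It tries one tree per choice of parent at each reticulation, so its running time grows as 2^r in the number r of reticulations.

```python
from itertools import product


def displayed_trees(children):
    """Yield one child-list dict per choice of a parent for each reticulation."""
    parents = {}
    for u, kids in children.items():
        for v in kids:
            parents.setdefault(v, []).append(u)
    retics = [v for v, ps in parents.items() if len(ps) > 1]
    for choice in product(*(parents[r] for r in retics)):
        kept = dict(zip(retics, choice))
        # Keep edge (u, v) unless v is a reticulation whose chosen parent is not u.
        yield {u: [v for v in kids if kept.get(v, u) == u]
               for u, kids in children.items()}


def fitch_column(tree, root, states):
    """Fitch parsimony score of one column; `states` maps taxon -> character."""
    cost = 0

    def visit(u):
        nonlocal cost
        if u in states:                      # labelled leaf
            return {states[u]}
        sets = [s for s in map(visit, tree.get(u, [])) if s is not None]
        if not sets:                         # dangling node with no taxa below
            return None
        result = sets[0]                     # a node with one child passes it through
        for s in sets[1:]:
            common = result & s
            if common:
                result = common
            else:
                result = result | s
                cost += 1
        return result

    visit(root)
    return cost


def most_parsimonious_tree(children, root, alignment):
    """Return (score, tree) minimising the summed Fitch score over all columns."""
    n_cols = len(next(iter(alignment.values())))
    best = None
    for tree in displayed_trees(children):
        score = sum(
            fitch_column(tree, root, {x: seq[i] for x, seq in alignment.items()})
            for i in range(n_cols)
        )
        if best is None or score < best[0]:
            best = (score, tree)
    return best


# Hypothetical level-1 network on taxa a, b, c, d; "r" is the only reticulation.
network = {
    "root": ["x", "y"],
    "x": ["a", "x2"],
    "x2": ["b", "r"],
    "y": ["r", "d"],
    "r": ["c"],
}
alignment = {"a": "000", "b": "001", "c": "111", "d": "110"}

score, tree = most_parsimonious_tree(network, "root", alignment)
print(score, tree["y"])
```

On this toy input the network displays two trees. Attaching c below `x2` gives a score of 5, while attaching it below `y` gives a score of 4, so the second tree is returned. Nodes left with a single child after a switching are not suppressed, which does not change the Fitch score.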

Keywords: APX-hardness; Maximum likelihood; Maximum parsimony; NP-hardness; Phylogenetic network; Phylogenetic tree.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Animals
Computational Biology
Evolution, Molecular
Genetic Speciation
Humans
Mathematical Concepts
Models, Genetic*
Phylogeny*
