NCHB: A method for constructing rooted phylogenetic networks from rooted triplets based on height function and binarization

J Theor Biol. 2020 Mar 21:489:110144. doi: 10.1016/j.jtbi.2019.110144. Epub 2020 Jan 3.

Abstract

Phylogenetics is the field that studies and models the evolutionary history of currently living species. Rooted phylogenetic networks are an important approach for modeling non-tree-like events between currently living species. Rooted triplets are one type of input for constructing rooted phylogenetic networks. Constructing an optimal rooted phylogenetic network that contains all given rooted triplets is an NP-hard problem. To address this challenge efficiently, this paper introduces a novel heuristic method called NCHB. NCHB produces an optimal rooted phylogenetic network that covers all given rooted triplets. The optimality criteria NCHB uses in building a rooted phylogenetic network are minimizing the number of reticulation nodes and minimizing the level of the final network. NCHB makes innovative use of two concepts: the height function and the binarization of a network. To study its performance, NCHB is compared with three state-of-the-art algorithms, LEV1ATHAN, SIMPLISTIC, and TripNet, in two scenarios. In the first scenario, triplet sets are generated under biological presumptions and NCHB is compared with SIMPLISTIC and TripNet. The results show that NCHB outperforms TripNet and SIMPLISTIC according to the optimality criteria. In the second scenario, we designed software for generating level-k networks. All triplets consistent with each generated network are then obtained and used as input for NCHB, LEV1ATHAN, SIMPLISTIC, and TripNet. LEV1ATHAN is applicable only to level-1 networks, while the other algorithms can also construct higher-level networks. The results show that the NCHB and LEV1ATHAN outputs are almost the same when restricted to level-1 networks. The results also show that NCHB outperforms TripNet and SIMPLISTIC. Moreover, NCHB outputs are very close to the generated networks (which are optimal) with respect to the criteria.
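As an illustration of two notions the abstract relies on, the following minimal Python sketch enumerates the rooted triplets consistent with a small rooted tree and counts the reticulation nodes of a small rooted network. It is not the NCHB algorithm and does not reproduce the paper's software. The nested-tuple tree encoding, the edge-list network encoding, and the function names (leaves, rooted_triplets, reticulation_count) are assumptions made only for this example.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf labels below a node (a leaf is a string, an internal node a tuple)."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def rooted_triplets(tree):
    """Return all rooted triplets ab|c consistent with a rooted tree.

    Each triplet is encoded (hypothetically) as (frozenset({a, b}), c):
    a and b lie below one child of some node, and c lies below a different child.
    """
    if isinstance(tree, str):
        return set()
    result = set()
    child_leaves = [leaves(child) for child in tree]
    for i, inside in enumerate(child_leaves):
        for j, outside in enumerate(child_leaves):
            if i == j:
                continue
            for a, b in combinations(inside, 2):
                for c in outside:
                    result.add((frozenset((a, b)), c))
    # Triplets whose three leaves all lie below a single child are found recursively.
    for child in tree:
        result |= rooted_triplets(child)
    return result


def reticulation_count(edges):
    """Count reticulation nodes (in-degree >= 2) in a network given as (parent, child) edges."""
    indegree = {}
    for _parent, child in edges:
        indegree[child] = indegree.get(child, 0) + 1
    return sum(1 for degree in indegree.values() if degree >= 2)


# Example tree ((a,b),(c,d)) and a small network with one reticulation node "h".
example_tree = (("a", "b"), ("c", "d"))
example_triplets = rooted_triplets(example_tree)
example_network = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]
example_reticulations = reticulation_count(example_network)
```

For the example tree, the consistent triplets are ab|c, ab|d, cd|a, and cd|b. In the example network, only node h has two parents, so the reticulation count is 1. Computing the level of a network would additionally require its biconnected components, which this sketch omits.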

Keywords: Bioinformatics; Consistency; Density; Height function; Living species; NP-hard; Reticulation node; Rooted phylogenetic network; Rooted triplet.

MeSH terms

  • Algorithms*
  • Biological Evolution
  • Heuristics
  • Models, Genetic
  • Phylogeny
  • Software*