The Generalized Robinson-Foulds Distance for Phylogenetic Trees

J Comput Biol. 2021 Dec;28(12):1181-1195. doi: 10.1089/cmb.2021.0342. Epub 2021 Oct 29.

Abstract

The Robinson-Foulds (RF) distance is one of the most widely used metrics for comparing phylogenetic trees. It is intuitive, with a natural interpretation in terms of common splits, and it can be computed in linear time. However, it has a very low resolution, and it may become trivial for phylogenetic trees with overlapping taxa, that is, phylogenetic trees that share some but not all of their leaf labels. In this article, we study the properties of the Generalized Robinson-Foulds (GRF) distance, a recently proposed metric for comparing any structures that can be described by multisets of multisets of labels. We apply it to rooted phylogenetic trees with overlapping taxa, which are described by sets of clusters, that is, by sets of sets of labels. We show that the GRF distance has a very high resolution, that it can also be computed in linear time, and that it is not (uniformly) equivalent to the RF distance.
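
To make the cluster description concrete, the following is a minimal Python sketch of the classical RF distance on rooted trees, not of the GRF distance, whose definition is not given in this abstract. It rests on several assumptions that do not come from the article: trees are encoded as nested tuples of leaf labels, every node (leaves and root included) contributes one cluster, and the distance is taken as the unnormalized size of the symmetric difference of the two cluster sets. The names `clusters` and `rf_distance` are hypothetical, and the sketch is not tuned for linear running time.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed encoding: a leaf is a label string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the set of clusters of a rooted tree.

    The cluster of a node is the set of leaf labels below it; every node,
    including leaves and the root, contributes one cluster.
    """
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            # Cluster of an internal node = union of its children's clusters.
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def rf_distance(t1: Tree, t2: Tree) -> int:
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(t1) ^ clusters(t2))


# Two trees on the same taxa {a, b, c, d}.
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")

# Two trees with overlapping taxa: they share a, b, c but not d or e.
with_d = (("a", "b"), ("c", "d"))
with_e = (("a", "b"), ("c", "e"))

print(rf_distance(balanced, caterpillar))
print(rf_distance(with_d, with_e))
```

In the second pair, no cluster containing `d` or `e` can ever match a cluster of the other tree, and the root clusters necessarily differ. Under this convention, a mismatch in leaf labels alone therefore contributes to the distance, independently of how similar the two trees are on their shared taxa.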

Keywords: Robinson-Foulds distance; metrics; phylogenetic tree.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Classification / methods*
  • Computational Biology / methods*
  • Models, Genetic
  • Phylogeny