Recovering a tree from the lengths of subtrees spanned by a randomly chosen sequence of leaves

Adv Appl Math. 2018 May:96:39-75. doi: 10.1016/j.aam.2018.01.001. Epub 2018 Feb 28.

Abstract

Given an edge-weighted tree T with n leaves, sample the leaves uniformly at random without replacement and let W_k, 2 ≤ k ≤ n, be the length of the subtree spanned by the first k leaves. We consider the question, "Can T be identified (up to isomorphism) by the joint probability distribution of the random vector (W_2, …, W_n)?" We show that if T is known a priori to belong to one of various families of edge-weighted trees, then the answer is, "Yes." These families include the edge-weighted trees with edge-weights in general position, the ultrametric edge-weighted trees, and certain families with equal weights on all edges such as (k + 1)-valent and rooted k-ary trees for k ≥ 2 and caterpillars.
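To make the definition of the vector (W_2, …, W_n) concrete, the following is a minimal Python sketch of how one realization could be sampled. It is an illustration only, not a method taken from the paper: the adjacency-dictionary representation, the helper names `build_tree`, `spanned_length`, and `sample_lengths`, and the small example tree are all assumptions introduced here. The sketch relies on the observation that, once the tree is rooted at a chosen leaf, an edge belongs to the spanned subtree exactly when the side away from the root contains at least one chosen leaf.

```python
import random


def build_tree(edges):
    """Build an adjacency dict {u: {v: weight}} from (u, v, weight) triples.
    (Hypothetical helper; the representation is an assumption of this sketch.)"""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def spanned_length(adj, chosen):
    """Total weight of the minimal subtree connecting the vertices in `chosen`."""
    chosen = set(chosen)
    root = next(iter(chosen))
    # Iterative DFS from the root; every vertex appears after its parent in `order`.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if u not in parent:
                parent[u] = v
                stack.append(u)
    # count[v] = number of chosen vertices in the subtree hanging below v.
    count = {v: (1 if v in chosen else 0) for v in order}
    total = 0.0
    for v in reversed(order):  # descendants are processed before ancestors
        p = parent[v]
        if p is not None:
            if count[v] > 0:  # the edge (p, v) separates the root from a chosen vertex
                total += adj[v][p]
            count[p] += count[v]
    return total


def sample_lengths(adj, rng=random):
    """Return one realization [W_2, ..., W_n] for a uniform random ordering of the leaves."""
    leaves = [v for v in adj if len(adj[v]) == 1]
    ordering = rng.sample(leaves, len(leaves))  # uniform, without replacement
    return [spanned_length(adj, ordering[:k]) for k in range(2, len(leaves) + 1)]


# Hypothetical example tree: leaves a, b hang off x; leaves c, d hang off y.
example = build_tree([
    ("a", "x", 1), ("b", "x", 2),
    ("x", "y", 5),
    ("c", "y", 3), ("d", "y", 4),
])
```

Calling `sample_lengths(example)` repeatedly gives draws from the joint distribution of (W_2, W_3, W_4) for this tree. The last coordinate always equals the total length of the tree, since all n leaves together span every edge.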

Keywords: graph isomorphism; phylogenetic diversity; random tree; tree reconstruction.