An optimal algorithm for computing all subtree repeats in trees

Philos Trans A Math Phys Eng Sci. 2014 Apr 21;372(2016):20130140. doi: 10.1098/rsta.2013.0140. Print 2014 May 28.

Abstract

Given a labelled tree T, our goal is to group repeating subtrees of T into equivalence classes with respect to their topologies and node labels. We present an explicit, simple and time-optimal algorithm for solving this problem for unrooted unordered labelled trees and show that the running time of our method is linear in the size of T. By unordered, we mean that the order of the adjacent nodes (children/neighbours) of any node of T is irrelevant. An unrooted tree T has no node designated as the root and can also be referred to as an undirected tree. We show how the presented algorithm can easily be modified to operate on trees that do not satisfy some or all of the aforementioned assumptions on the tree structure; for instance, how it can be applied to rooted, ordered or unlabelled trees.
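To make the problem statement concrete, the following is a minimal sketch of grouping repeated subtrees into equivalence classes for the simpler rooted, unordered, labelled case. It is not the algorithm presented in the paper: it assigns each node a class identifier from its label and the sorted class identifiers of its children, and the sorting step makes it O(n log n) in the worst case rather than linear. The function name `subtree_repeat_classes` and the input representation (a `children` dictionary, a `labels` dictionary and a `root` node) are assumptions made for illustration.

```python
from collections import defaultdict


def subtree_repeat_classes(children, labels, root):
    """Group nodes of a rooted, unordered, labelled tree by subtree class.

    Illustrative sketch only (hypothetical interface, not the paper's method).
    children: dict mapping a node to a list of its child nodes
    labels:   dict mapping every node to its label
    Returns a list of groups; each group lists nodes whose subtrees repeat.
    """
    # Visit order in which every node appears after its parent.
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    class_of = {}       # node -> integer class identifier
    signature_ids = {}  # (label, sorted child classes) -> class identifier

    # Reversed order processes children before their parents.
    for node in reversed(order):
        child_classes = sorted(class_of[c] for c in children.get(node, []))
        signature = (labels[node], tuple(child_classes))
        # Sorting makes the signature independent of child order.
        class_of[node] = signature_ids.setdefault(signature, len(signature_ids))

    groups = defaultdict(list)
    for node, class_id in class_of.items():
        groups[class_id].append(node)

    # Keep only classes that occur more than once, i.e. actual repeats.
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

For example, in a tree whose root has two children labelled `b`, each with one `c` leaf and one `d` leaf listed in different orders, the two `b` subtrees fall into the same class because child order is ignored.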

Keywords: subtree repeats; tree data structures; unrooted unordered labelled trees.