The Art Gallery Guardian

Pattern in Labeled Ordered Rooted Trees


Problem1

Let TT be a rooted ordered labeled tree. Find all the vertices where all it’s subtrees are equal.

Let T(v)T(v) to denote the subtree rooted at vv. The two trees are equal if they have the same shape and the same label.

1 Reduce to a string problem

This problem is interesting because one solution can demonstrate the technique of linearize the ordered tree to a string, and apply string algorithms.

First, we replace every edge in the tree with two directed edges uvuv and vuvu, where uu is closer to the root than vv. We label uvuv with (( and vuvu with )). This will be the new tree we work with.

Let s(T)s(T) be a string defined by concatenating the labels on the path by traverse the tree with an euler tour by following the edges in a DFS like manner starting from the root of TT.

Note s(T)s(T) would be a balanced set of parenthesis when ll maps vertices to the empty string. Indeed, there is a bijection between unlabeled ordered trees and balanced parenthesis. It’s not hard to see this generalizes to the labeled setting.

Theorem2

If s(T)=s(T)s(T)=s(T'), then T=TT=T'.

A run in a string is a maximal string of the form anba^nb, where bb is a prefix of aa and n2n\geq 2. The runs theorem states there are at most O(n)O(n) runs, and all of them can be found in O(n)O(n) time[1]. Let anba^nb be a run in a string, then we call the ana^n part the complete repetitions.

Define the vertices with at least 22 child and all it’s subtrees are equal as good vertices. It’s easy to s(T(v))s(T(v)) for some good vertex vv is going to be a complete repetition!

Now if we found a run, it’s easy to check if it actually correspond to a good vertex in O(1)O(1) time once we did a O(n)O(n) time preprocessing.

This allows us to solve the problem in O(n)O(n) time.

Alternatively, there is a paper with a linear time algorithm to find all subtree repeats inside a tree[2].

Reference

[1] M. Crochemore, L. Ilie, Computing longest previous factor in linear time and applications, Information Processing Letters. 106 (2008) 75–80.

[2] T. Flouri, K. Kobert, S. Pissis, A. Stamatakis, An optimal algorithm for computing all subtree repeats in trees, in: T. Lecroq, L. Mouchard (Eds.), Combinatorial Algorithms, Springer Berlin Heidelberg, 2013: pp. 269–282 10.1007/978-3-642-45278-9_23.

Posted by Chao Xu on .
Tags: algorithm.