The Art Gallery Guardian

Applications of finger trees

1 Finger trees

Consider a data structure $S$ that maintains a sequence of elements $a_1,\ldots,a_n$ from a monoid $(M,\cdot)$. It supports four operations. All time bounds are amortized, not worst case.

  1. $\mathrm{split}(S,p)$, where $p:M\to \{0,1\}$ is a monotone predicate on $M$, i.e. $p(ab)\geq p(a)$. It splits the sequence at the position $t$ such that $p(\prod_{i=1}^j a_i) = 0$ for all $j < t$ and $p(\prod_{i=1}^j a_i) = 1$ for all $j\geq t$. This takes $O(\log \min(t,n-t))$ time.

  2. $\mathrm{concat}(S,T)$, concatenate the sequence represented by $S$ with the sequence represented by $T$. If the lengths of the sequences are $n$ and $m$, respectively, then this operation takes $O(\log \min(n,m))$ time.

  3. $\mathrm{product}(S)$, return $\prod_{i=1}^n a_i$ in $O(1)$ time.

  4. $\mathrm{map}(S,f)$, apply $f$ to every entry in the sequence, so $S$ then represents $f(a_1),\ldots,f(a_n)$. This operation takes $O(1)$ time.

There is one more operation, $\mathrm{Empty}()$, which produces an empty sequence.
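To make the interface concrete, here is a naive list-backed sketch in Python. The names `Seq`, `op`, and `unit` are illustrative, not from any real library; split and concat here take linear time, so this only pins down the semantics, not the finger tree's logarithmic bounds.

```python
class Seq:
    """Naive list-backed model of the monoid-sequence interface.

    The monoid operation `op` and its identity `unit` are passed in
    explicitly instead of being baked into the structure.
    """

    def __init__(self, items=()):
        self.items = list(items)

    @staticmethod
    def empty():
        return Seq()

    def product(self, op, unit):
        # product(S): fold the whole sequence with the monoid operation
        total = unit
        for a in self.items:
            total = op(total, a)
        return total

    def split(self, p, op, unit):
        # split(S, p): p is monotone, so once p(prefix product) holds,
        # it holds for every longer prefix; cut at the first such spot.
        prefix = unit
        for t, a in enumerate(self.items):
            prefix = op(prefix, a)
            if p(prefix):
                return Seq(self.items[:t]), Seq(self.items[t:])
        return Seq(self.items), Seq()

    def concat(self, other):
        # concat(S, T)
        return Seq(self.items + other.items)

    def map(self, f):
        # map(S, f); a real finger tree would only record a lazy tag here
        return Seq(f(a) for a in self.items)
```

For example, over the min monoid, `Seq([3, 1, 4]).product(min, float('inf'))` evaluates to `1`.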

We first make a few assumptions: all monoid operations take $O(1)$ time; every map we consider can be evaluated in $O(1)$ time; and we can compute the composition $h = f\circ g$ in $O(1)$ time. Note this means we compute (a representation of) the function $h$ itself, and we can use this representation to evaluate $h(x)$ in $O(1)$ time.

Otherwise, every "$O(T)$ time" should be read as "$O(T)$ monoid operations, function evaluations, function compositions, and constant-time operations".

Such a data structure exists. The first three operations are supported by the finger tree [1]. It is not hard to add the last operation: the idea is to tag each node with a function $f$ (initially $f$ is just the identity), meaning "apply $f$ to everything below!". The tag propagates only when lower-level nodes need to be accessed, so its cost can be charged to the other operations. Many other binary search trees can probably implement these operations too. For example, if the dynamic optimality conjecture for splay trees is true, it would be best to use a splay tree for the implementation. However, that's beside the point. The point is to have an abstract data structure over a monoid sequence. The abstraction hides all the mess inside the data structure.

2 Extending its power

A finger tree can be extended to access elements by index: take the Cartesian product of the monoid with $(\mathbb{N},+)$, and we get a new monoid. So we can just assume our data structure has the following extensions:

  1. $\mathrm{splitAt}(S,i)$, split the sequence into two sequences $A$ and $B$ at the $i$th index in $O(\log \min(i,n-i))$ time.

  2. $\mathrm{insertAt}(S,i,x)$, insert element $x$ at the $i$th position.

  3. $\mathrm{delete}(S,i)$, delete the element at the $i$th position.
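The reduction behind $\mathrm{splitAt}$ can be seen directly: pair every element with the integer $1$, so the prefix product in the $(\mathbb{N},+)$ component is the prefix length, and "prefix length $> i$" is a monotone predicate. A linear-time Python sketch of this view (a finger tree performs the same search in $O(\log \min(i,n-i))$ time; `split_at` is an illustrative name):

```python
def split_at(items, i):
    # Each element contributes 1 to the (N, +) component, so the prefix
    # product of that component is just the prefix length.  The predicate
    # "length > i" is monotone, which is all that split(S, p) requires.
    prefix_len = 0
    for t, _ in enumerate(items):
        prefix_len += 1          # the (N, +) component of the prefix product
        if prefix_len > i:       # monotone predicate p
            return items[:t], items[t:]
    return items, []
```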

Another interesting extension is allowing one to $\mathrm{reverse}(S)$ the sequence in $O(\log n)$ time. This becomes much more complicated.

3 Applications

There are some common applications of finger trees.

  1. Stacks, queues, and deques with the product operation, all in amortized constant time. As a special case, this solves the min stack problem.

  2. Random access sequences, which are related to the rope data structure.

  3. Ordered sequences (sorted lists).

  4. Priority queues.

  5. Interval trees, segment trees.
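As a concrete illustration of the first application, a min stack falls out directly: the sequence lives over the monoid $(\mathbb{R}\cup\{\infty\}, \min)$ and $\mathrm{product}(S)$ answers the minimum query. A naive Python sketch that caches the prefix products explicitly (a finger tree does this caching internally):

```python
import math

class MinStack:
    # Stack over the monoid (R ∪ {∞}, min); product(S) is the minimum.
    # We cache the "product" (running minimum) of every prefix, so
    # push, pop, and minimum all take constant time.
    def __init__(self):
        self._items = []
        self._prefix_min = []  # prefix products under min

    def push(self, x):
        best = min(x, self._prefix_min[-1]) if self._prefix_min else x
        self._items.append(x)
        self._prefix_min.append(best)

    def pop(self):
        self._prefix_min.pop()
        return self._items.pop()

    def minimum(self):
        # the identity of the monoid when the stack is empty
        return self._prefix_min[-1] if self._prefix_min else math.inf
```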

Most of the above have been described in [1], but we will discuss a few specific applications that are usually not mentioned by others.

3.1 Merging sorted lists in the optimal time bound


Let $S$ and $T$ be two ordered sequences of length $n$ and $m$, respectively, with $n\leq m$, both represented by finger trees. Computing the finger tree representation of $S\cup T$ takes $O(n\log \frac{m}{n})$ time.

Split $T$ into pieces one by one: for each $i$, split the remaining part of $T$ along the $i$th element of $S$, then concatenate all the pieces together with the elements of $S$. The splitting takes $\sum_{i=1}^n \log t_i$ time, where $t_i$ is the size of the $i$th piece. By concavity, the time for splitting is $\sum_{i=1}^n \log t_i \leq n \log \frac{m}{n}$, and the concatenation time is similar.
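A linear-time Python sketch of this merge, with `bisect_left` standing in for the finger tree's predicate split (the list slicing here is exactly the copying a real concat avoids):

```python
from bisect import bisect_left

def merge_sorted(S, T):
    # Merge two sorted lists by repeatedly splitting the longer one
    # along elements of the shorter one, as in the text.
    if len(S) > len(T):
        S, T = T, S
    result = []
    for s in S:
        cut = bisect_left(T, s)      # split T along s
        result.extend(T[:cut])       # concat the piece below s
        result.append(s)
        T = T[cut:]                  # continue with the remainder
    result.extend(T)                 # the final piece
    return result
```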

3.2 Solving Klee’s measure problem

This section shows why the map operation is crucial: it lets us do range updates.

The motivation came from the following question: can the finger tree substitute for the popular competitive programming data structure, the segment tree (a special case of the real segment tree)? This is not possible, and not just because of large hidden constants. The abstract data structure defined in this article cannot perform the following common operation on a sequence of integers: increment the $i$th number by $i$. This operation makes little sense if we allow inserts and deletes, but many uses of the segment tree do not involve inserts and deletes.

Fortunately, Klee’s measure problem, the problem that prompted the invention of the segment tree, can be solved with a finger tree. We show how to solve it in 1D. It is well known how to use the 1D solution for the $n$-dimensional version.

We produce a data structure such that each operation takes $O(\log n)$ amortized time, where $n$ is the number of times we have called insert.

We maintain a collection of intervals with a data structure $D$ that supports three operations.

  1. $\mathrm{insert}(D,a,b)$, insert the interval $(a,b)$ into the collection.
  2. $\mathrm{delete}(D,a,b)$, delete an interval $(a,b)$ from the collection; we assume it is always an interval inserted before.
  3. $\mathrm{measure}(D)$, find the measure of the union of all intervals in the collection.

The bound can be sharpened so that $n$ is the number of intervals currently inside the data structure: we just do a global rebuild whenever the number of intervals has doubled or halved since the last rebuild. Also, $\mathrm{measure}()$ actually takes only constant time.

The finger tree stores a sequence of elementary intervals that partition the space, together with how many copies of each interval exist. We represent each element as $((l,r),c)$, where $(l,r)$ is the elementary interval with left and right boundaries $l$ and $r$, and $c$ is the number of copies.

To insert an interval, consider its left endpoint $l$. The data structure finds the elementary interval containing $l$ and splits it into two elementary intervals. We do the same with the right endpoint. This takes $O(\log n)$ time and increases the number of elementary intervals by at most $2$. Define the function $\mathrm{increment}$ by $\mathrm{increment}(((a,b),c))=((a,b),c+1)$. For each insertion of interval $(a,b)$, $\mathrm{increment}$ is applied to all elementary intervals in the range. For deletion, we apply $\mathrm{decrement} = \mathrm{increment}^{-1}$ instead.

The monoid product is simply $\prod_{i=1}^n ((l_i, r_i), c_i) = \sum_{i=1}^n \min(c_i,1)(r_i-l_i)$, and this is exactly what $\mathrm{measure}(D)$ should output.
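Putting the pieces together, here is a naive Python sketch of the 1D structure (`KleeMeasure` is an illustrative name). Every step here is linear rather than logarithmic; the point is the bookkeeping with elementary intervals, counts, and the increment/decrement range map, which matches what the finger tree would maintain.

```python
class KleeMeasure:
    def __init__(self, lo=float('-inf'), hi=float('inf')):
        # elementary intervals [l, r, c] partitioning the line,
        # where c counts how many inserted intervals cover (l, r)
        self.cells = [[lo, hi, 0]]

    def _split_point(self, x):
        # split the elementary interval strictly containing x at x
        for i, (l, r, c) in enumerate(self.cells):
            if l < x < r:
                self.cells[i:i + 1] = [[l, x, c], [x, r, c]]
                return

    def insert(self, a, b):
        self._update(a, b, +1)

    def delete(self, a, b):
        self._update(a, b, -1)

    def _update(self, a, b, delta):
        self._split_point(a)
        self._split_point(b)
        for cell in self.cells:
            if a <= cell[0] and cell[1] <= b:
                cell[2] += delta     # the increment/decrement map

    def measure(self):
        # the monoid product: sum of min(c, 1) * (r - l)
        return sum(r - l for l, r, c in self.cells if c > 0)
```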

3.3 The local ranking sequence

The ranking sequence of a sequence of distinct numbers $a_1,\ldots,a_n$ is defined as $b_1,\ldots,b_n$, where $a_i$ is the $b_i$th smallest element. Given a sequence of distinct integers $a_1,\ldots,a_n$, we want to design a data structure that answers the query $Q(i,j)$: return the ranking sequence of $a_i,\ldots,a_j$.

This can be solved by storing sorted subsequences. We just have to return the sorted sequence of $a_i,\ldots,a_j$ during the query, and then it becomes simple to compute the ranking sequence. This makes the running time of $Q(i,j)$ equal to $O(j-i)$. There are a few other variants, but it seems quite hard to adapt a finger tree for them [2].
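Concretely, given the sorted copy of the window, the ranking sequence is one linear pass. In this sketch `sorted()` stands in for the stored sorted subsequence, so only `ranking_sequence` reflects the claimed $O(j-i)$ step; both function names are illustrative.

```python
def ranking_sequence(window, sorted_window):
    # rank[v] = position of v in sorted order (1-indexed); the numbers
    # are distinct, so the dict lookup is unambiguous
    rank = {v: k + 1 for k, v in enumerate(sorted_window)}
    return [rank[v] for v in window]

def query(a, i, j):
    # Q(i, j), 1-indexed and inclusive as in the text; sorting here
    # stands in for retrieving the stored sorted subsequence.
    window = a[i - 1:j]
    return ranking_sequence(window, sorted(window))
```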


[1] R. Hinze, R. Paterson, Finger trees: A simple general-purpose data structure, J. Funct. Program. 16 (2006) 197–217, doi:10.1017/S0956796805005769.

[2] C.-J. Chang, K.-M. Chao, Efficient algorithms for local ranking, Information Processing Letters 112 (2012) 517–522, doi:10.1016/j.ipl.2012.03.011.

Posted by Chao Xu on .
Tags: Data Structure, Algorithm.