The Art Gallery Guardian

Speed up incremental computation with two stacks

1 Introduction

:: Problem Consider a knapsack of capacity CC and a empty sequence of objects. One can update the sequence by add or delete objects from either end of the sequence. Construct a data structure, such that we can output the maximum possible value of the knapsack after every update if we pack the knapsack using the objects in the sequence. ::: This is a generalization of the online knapsack problem, which is an generalization of a hard offline knapsack problem used by Quora.

A naive algorithm just recompute the knapsack each time using a dynamic programming algorithm. This would take O(mnC)O(mnC) time, where mm is the total number of updates, nn is the maximum number of objects in the sequence. The rest of the article describe the idea of decomposable function, which lead to an solution that solves this problem in O(mC)O(mC) time.

2 Decomposable function


A function f:i=1XiYf:\cup_{i=1}^\infty X^i \to Y is called (,,)(\triangleright,\bowtie,\triangleleft)-decomposable, if for all 1kn1 \leq k \leq n, f(x1,,xn)=(x1xkid)(idxk+1xn) f(x_1,\ldots,x_n)=(x_1 \triangleright \ldots \triangleright x_k \triangleright id_{\triangleright}) \bowtie (id_{\triangleleft} \triangleleft x_{k+1}\triangleleft \ldots \triangleleft x_n) where

1. $\triangleright:X\times Y_{\triangleright}\to Y_{\triangleright}$ is right associative.
2. $\triangleleft:Y_{\triangleleft} \times X\to Y_{\triangleleft}$ is left associative. 
3. $\bowtie:Y_{\triangleright}\times Y_{\triangleleft}\to Y$.

Let ff be a (,,)(\triangleright,\bowtie,\triangleleft)-decomposable function. Let SS be a finite sequence of elements, such that SS is in the domain of ff. Dynamically output f(S)f(S) after every deque operation on SS, such that we call ,\triangleright,\bowtie and \triangleleft amortized constant number of times per operation.

We will use i=1nxi\bigtriangleright_{i=1}^n x_i to mean x1xnidx_1 \triangleright \ldots \triangleright x_n \triangleright id_{\triangleright}, and i=1nxi\bigtriangleleft_{i=1}^n x_i to mean idx1xnid_{\triangleright} \triangleleft x_1 \triangleleft \ldots \triangleleft x_n.

ff is called decomposable if there exist (,,)(\triangleright,\bowtie,\triangleleft) such that it’s decomposable.

The intuition is the function ff can be decomposed into solving two pieces of problems. Each piece of the problem has a incremental nature.

DC(x1,,xn)D_C(x_1,\ldots,x_n), the maximum value possible given objects x1,,xnx_1,\ldots,x_n and a knapsack of capacity CC is a decomposable function. If x1xnx_1\triangleright \ldots \triangleright x_n produces the last row of the common dynamic programming algorithm for the knapsack, then we can let \bowtie to be an operation that combine the last rows to output a value.

3 Implement the data structure

Intuitively, \triangleright and \triangleleft produces two stacks. \bowtie combines the information on the stack to produce the solution to ff.

3.1 \triangleleft, a stack


We have a stack of elements SS, dynamically maintain foldlfoldl \triangleleft in O(1)O(1) \triangleleft per operation.

Say S=x1,,xnS=x_1,\ldots,x_n. We store i=1kxi\bigtriangleleft_{i=1}^k x_i for all 1kn1\leq k\leq n, and whenever there is an insertion, we compute i=1n+1xi=i=1nxixn+1\bigtriangleleft_{i=1}^{n+1} x_i = \bigtriangleleft_{i=1}^n x_i \triangleleft x_{n+1} in one monoid operation. Deletion can be handled by discarding a value. In the worst case, \triangleleft gets called only once.

But this requires us to store O(n)O(n) values after the fold. We can decrease the extra space to store only O(n)O(\sqrt{n}) prefix sums.

Let S(k)=i=1kxiS(k) = \bigtriangleleft_{i=1}^k x_i. Originally we store S(k)S(k) for all 1kn1 \leq k\leq n. We show how we store only when kk is a perfect square and around n\sqrt{n} other elements. If k2n<(k+1)2k^2\leq n<(k+1)^2, we make sure we store S(i)S(i) for all ii between k2k^2 and nn, and do not store any non perfect square ii smaller than (k1)2(k-1)^2(this means we actively clean them up as nn grows large). Assume we are deleting, we can delete around n\sqrt{n} elements before we hit a perfect square, in that case we would need to recompute the sums from the previous perfect square. It’s not hard to see some amortized analysis, the extra space can be made into O(n)O(\sqrt{n}).

Actually, there is a entire range of space/time trade-offs possible. O(1/ϵ)O(1/\epsilon) amortized time per operation and O(nϵ)O(n^{\epsilon}) extra space [1].

3.2 \bowtie, combine two stacks to simulate a deque

First, let’s consider we are allowed to add on both side of the sequence, but deletion is only at the beginning of the sequence.

The idea is to build this through two stacks. This is a very common problem, and we can see the solutions here. One thing to remember is that the left stack uses \triangleright, the right stack uses \triangleleft.

We can try to simulate the deque with 3 stacks(where nn deque operation maps to 9n9n stack operations) [2], but that is just an interesting exercise. We just need to use our two stack set up as in the queue and occasionally rebuild everything. When our front stack become empty and we remove an element in front of our sequence, we just rebuild the structure with two stacks of the same size. There are ways to do the rebuilding without use extra space. The case on the other side is handled symmetrically. Therefore we still maintain O(1)O(1) amortized monoid time per operation and O(n)O(\sqrt{n}) extra space. Finally, we combine the result through \bowtie.


There exist worst case constant time simulation of deque using a few stacks [2]. Thus it is conceivable to make everything from amortized to worst case, with obvious increase in space.

4 Examples

I have the code sample for the entire framework along with the examples. There are only 66 functions.

emptyDecomposableDequeue initialize the data structure with the corresponding functions. pushFront, pushBack, popFront and popBack manipulates the sequences. measure returns the result after apply our decomposable function to the sequence.

4.1 Knapsack

Our common dynamic programming formulation is to sequentially compute tables DSD_S, such that DS[C]D_S[C] contains the maximum value picking objects from SS with knapsack capacity CC.

Usually, we order the objects as x1,,xnx_1,\ldots,x_n, and Si={x1,,xi}S_i=\{x_1,\ldots,x_i\}. We compute table DSi+1D_{S_{i+1}} from DSiD_{S_i} in O(C)O(C) time. This should be the operation for \triangleright and \triangleleft. To get the particular entry DST[C]D_{S\cup T}[C], O(C)O(C) time suffice if we are given table for DSD_S and DTD_T. This would be the \bowtie operation.

This implies if the sequence size is at most nn, then for any sequence of mm operations, we can dynamically compute all ff in O(mC)O(mC) time using only O(nC)O(\sqrt{n}C) space. Notice once we convert the general algorithm to work in this special case, it is essentially the solution by Lei Huang. In fact, this general data structure is inspired by his solution.

4.2 Maximum subarray sum

The common dynamic programming problem of finding the maximum sum in an array(or, here we consider as a sequence) with both negative and positive numbers. One can compute it in linear time with Kadane’s algorithm. This is a decomposable function, too.

Let i=jkxi=(a,b,c,d)\bigtriangleright_{i=j}^k x_i=(a,b,c,d), where aa is the maximum sum using the beginning, bb is the sum of all elements from the beginning, cc is the maximum using the current element and dd is the maximum seen so far. \triangleleft is defined similarly. One can see that (a,b,c,d)(a,b,c,d)=max(a+a,d,d)(a,b,c,d')\bowtie (a',b',c',d') = \max(a+a',d,d').

4.3 Dynamic sum of a sequence of elements


Let x1,,xnx_1,\ldots,x_n be a finite sequence of elements from a monoid (M,+)(M,+). Dynamically maintain the sum of all element in the sequence if we can add and delete element in both end of the sequence.

A simpler version of this problem is asked in CS theory, where we are allowed to add in one end and delete in another.

Let ff be the sum of elements in the sequence, then ff is (+,+,+)(+,+,+)-decomposable. There is a more general statement:


ff is a homomorphism, then it is decomposable.

Of course, there is already a data structure support this–a finger tree [3]. Because of the monoid structure, we can even allow concatenations.


[1] S. Swamy, J. Savage, Space-time tradeoffs for linear recursion, Mathematical Systems Theory. 16 (1983) 9–27 10.1007/BF01744566.

[2] H. Petersen, Stacks versus deques, in: J. Wang (Ed.), Computing and Combinatorics, Springer Berlin Heidelberg, 2001: pp. 218–227 10.1007/3-540-44679-6_24.

[3] R. Hinze, R. Paterson, Finger trees: A simple general-purpose data structure, J. Funct. Program. 16 (2006) 197–217 10.1017/S0956796805005769.

Posted by Chao Xu on .
Tags: algorithmic toolkit.