The Art Gallery Guardian

List the smallest kk subset sums


Problem1

Given a set of positive reals {x1,,xn}\set{x_1,\ldots,x_n} where x1<x2<<xnx_1<x_2<\ldots<x_n, find the smallest kk subset sums.

We can assume nkn\leq k, because we do not have to read xjx_j if j>kj>k.

Torsten Gross and Nils Blüthgen posted a O(k2)O(k^2) time solution on arXiv.

We show a O(klogk)O(k\log k) time algorithm, which is optimal if we want to output the numbers in order.

We list the sums one by one by maintaining a priority queue of sums. We start with the empty set. Assume that we added the sum induced by I[n]I\subset [n] (that is, iIxi\sum_{i\in I} x_i) into the output, let j=1+maxIj=1+\max I. Now we can consider two possibilities by extending the current solution: the sum induced by I{j}I\cup \set{j} or the sum induced by I{k}I\cup \set{k} where k>jk>j. We will add both possibilities to the queue so that one can inspect them later. We can avoid storing the sets, only the values are required.

Here is a python implementation.

def first_k_subset_sums(x,k):
    n = len(x)
    h = []
    output = [0] # need to account for the empty set
    heapq.heappush(h,(x[0],0))
    while h and len(output)<k:
        (u,b) = heapq.heappop(h)
        output.append(u)
        if b+1<n:
            heapq.heappush(h,(u+x[b+1],b+1))
            heapq.heappush(h,((u-x[b])+x[b+1],b+1))
    return output

If we want to output the sets themselves, not just the values, does the running time change? If a set II is in the output, then all subsets of II must also be in the output. Hence the largest set we can ever output has size O(logk)O(\log k). Therefore the total output length is at most O(klogk)O(k\log k).

This is also a lower bound. Consider when xi=2ix_i=2^i, then we will output all subsets of {x1,,xlogk}\set{x_1,\ldots,x_{\log k}}, and we know that i=1logki(logki)=Ω(logk)\sum_{i=1}^{\log k} i{\log k\choose i} = \Omega(\log k).

If we don’t have to list the smallest kk subset sum values in order, then O(k)O(k) is possible, see this mathoverflow answer by David Eppstein.

If we are interested in the smallest kk distinct subset sum. I don’t know of any algorithm that performs better than O(nk)O(nk), even if we know that n=Ω(k)n=\Omega(k).

Acknowledgements

I would like to thank Tana Wattanawaroon for helpful discussions and taking an interest in this problem.

Posted by Chao Xu on .
Tags: algorithm.