The Art Gallery Guardian

Subset sum through balancing


This is a note for Pisinger’s balancing algorithm for subset sum [1]. Let S\mathcal{S} be the set of all subset sums of SS. The subset sum problem, the input is SNS\subset \N, and we are interested in checking if tSt\in \mathcal{S}.

We define a variation of the subset sum problem. The balanced subset sum problem. In this problem, we are given a vector vv of integers(does not have to be positive). We let M=vM=\|v\|_\infty. We are interested in find a subset that sums to t[M]t\in [M].

Theorem1

Each subset sum problem on nn elements can be reduced to a balanced subset sum problem in nn elements in O(n)O(n) time.

Proof

Consider the input to the subset sum problem SS and tt. Greedily find a subset of elements SS', such that adding any other element will exceed tt. Let S1=t\|S'\|_1=t'. Now, we negate all the elements in SS', and ask for balanced subset sum with input set S(SS)-S' \cup (S\setminus S') and target number ttt-t'.

We partition SS into A=[M..0]SA = [-M..0]\cap S and B=SAB=S\setminus A. We also define Ai={a1,,ai}A_i = \set{a_1,\ldots,a_i} and Bi={b1,,bi}B_i=\set{b_1,\ldots,b_i}.

A set is balanced by the following recursive definition. Let SS be a set.

  • S=S=\emptyset is balanced.
  • S1>t\|S\|_1> t, then S{a}S\cup \set{a} is balanced, where aAa\in A.
  • S1t\|S\|_1\leq t, then S{b}S\cup \set{b} is balanced, where bBb\in B.

Consider a set TT, such that (i,j,k)T(i,j,k)\in T if and only if kk is a subset sum of AiBjA_i\cup B_j. Certainly, we are interested if (A,B,t)(|A|,|B|,t) is in TT. However, the state space is already O(n2M)O(n^2M), which is no better than the standard dynamic programming algorithm.

There is a nice dominance relation. If (i,j,k)T(i,j,k)\in T, then for (i,j)(i,j)(i',j')\geq (i,j), we have (i,j,k)T(i',j',k)\in T. We can ask for each kk, what are all the minimal (i,j)(i,j) pairs where (i,j,k)T(i,j,k)\in T. Such value will be g(j,k)g(j,k). Formally, g(j,k)=min{i(i,j,k)T}g(j,k) = \min \set{i | (i,j,k)\in T}, one can see that g(j,k)g(j+1,k)g(j,k) \geq g(j+1,k). Also, we know the solution corresponding to g(j,k)g(j,k) must contain ag(j,k)a_{g(j,k)} as an element.

One can get a recurrence relation for gg as below.

g(j,k)=min{g(j1,k)g(j1,kbj)if kbjtiif kai>t and i>g(j,kai) g(j,k)= \min \begin{cases} g(j-1,k)\\ g(j-1,k-b_j) & \text{if }k-b_j\leq t\\ i & \text{if }k-a_i > t \text{ and } i>g(j,k-a_i) \end{cases}

Fix a kk and jj, let ii to be as small as possible, such that there is AiAiA_i'\subset A_i and BjBjB_j'\subset B_j such that AiBjA_i'\cup B_j' is balanced and sums to kk. Note that aiAia_i\in A_i'.

We obtained AiBjA_i'\cup B_j' by inserting an element in BB or AA to another balanced set. If the inserted element is in BB, but not bjb_j, then we know i=g(j1,k)i=g(j-1,k). If it is bjb_j, then i=g(j1,kbj)i=g(j-1,k-b_j). If the last inserted is aia_i, then g(j,k)=ig(j,k)=i. Note we observe in this case, g(j,kai)<ig(j,k-a_i)<i. A direct dynamic programming implementation seems to imply a O(n2M)O(n^2M) time algorithm, since there does not seem to be a quick way to obtain ii.

On the other hand, if we think bottom up instead of top down, we can obtain a better result. Below is the algorithm.

The algorithm

The value D[j,k]D[j,k] eventually equals g(j,k)g(j,k). Note we can argue the running time is O(nM)O(nM), since for each fixed kk, the final for loop can ran at most nn times. It is frustrating that the DP algorithm cannot be inferred directly from the recurrence relation. Indeed, we mainly obtained this through the fact that we can prune the search space if we start bottom up, which is unclear from the recurrence relation.

References

[1] D. Pisinger, Linear time algorithms for knapsack problems with bounded weights, Journal of Algorithms. 33 (1999) 1–14 10.1006/jagm.1999.1034.

Posted by Chao Xu on .
Tags: algorithms, subset sum.