The Art Gallery Guardian

Sum over products of weighted subset of certain size


Consider a commutative semiring (R,+,)(R,+,\cdot). 0\mathbb{0} is the identity for (R,+)(R,+), and 1\mathbb{1} is the identity for (R,)(R,\cdot). Let f,g:VRf,g:V\to R, w:VNw:V\to \mathbb{N} and ZNZ\subset \mathbb{N}. It is common that we are interested in computing expressions of the following form.

SV,xSw(x)ZxSf(x)xV\Sg(x)\displaystyle \sum_{S\subset V, \sum_{x\in S} w(x) \in Z} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x)

Examples:

  1. If w(x)=1w(x)=1 for all xx, and f(x)f(x) be the probability that event xx occurs, g=1fg=1-f, we find the probability that the number of event occurs tt times, where tZt\in Z. In probability, this is computing the Poisson distribution.

  2. If (R,+,)=(N,+,)(R,+,\cdot) = (\N,+,\cdot), f=g=1f=g=1, for all xx and w(x)=xw(x)=x and VNV\subset \N and Z={t}Z=\{t\}, then we find the number of subsets that have element sum tt.

  3. If (R,+,)=(N,max,+)(R,+,\cdot) = (\N,\max,+), VNV\subset \N, g=0g=0 and Z={0,,W}Z=\{0,\ldots,W\}, then this solves the knapsack problem with knapsack size WW, value ff and cost ww.

  4. An actual application inspired this post: An automated test suite that runs nn subtests, and it is allowed to rerun a subtest if it fails the first time. A subtest passes if first run passes or the rerun passes. The test is successful if all the subtests passes and the number of total reruns is at most kk. Assume probability of passing is independent for each subtest. One want to estimate the probability of a successful test given the probability a run passes for a specific subtest.

Let maxZ=k\max Z = k and V=n|V| = n. The naive algorithm runs in O(n2n)O(n2^n) time (assuming semiring operation takes O(1)O(1) time). There is a common transformation that turns this problem that sum over all subsets to a problem that sums over ZZ. So it runs in O(nk)O(nk) time.

Let V={v1,,vn}V=\{v_1,\ldots,v_n\} and Vj={v1,,vj}V_j = \{v_1,\ldots,v_j\}. Define D(i,j)=SVj,xSw(x)=ixSf(x)xV\Sg(x)\displaystyle D(i,j) = \sum_{S\subset V_j, \sum_{x\in S} w(x) = i} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x) .

Certainly, SV,xSw(x)ZxSf(x)xV\Sg(x)=iZD(i,n)\displaystyle \sum_{S\subset V, \sum_{x\in S} w(x) \in Z} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x) = \sum_{i\in Z} D(i,n)

We only incur a O(k)O(k) number of semiring operations once we compute all D(i,n)D(i,n) for 0ik0\leq i\leq k.

Let [P][P] be the Iverson bracket notation, namely

[P]={1if P is true;0otherwise.\displaystyle [P] = \begin{cases} \mathbb{1} & \text{if } P \text{ is true;}\\ \mathbb{0} & \text{otherwise.} \end{cases}

Theorem1
  1. D(i,0)=[i0]D(i,0) = [i \neq 0]

  2. For j1j\geq 1, D(i,j)=[iw(vj)]f(vj)D(iw(vj),j1)+g(vj)D(i,j1)D(i,j) = [i\geq w(v_j)] f(v_j)D(i-w(v_j),j-1) + g(v_j) D(i,j-1).

Proof

The base case can be verified easily, we show part of a inductive step.

f(vj)D(iw(vj),j1)+g(vj)D(i,j1)=f(vj)SVj1,xSw(x)=iw(vj)xSf(x)xV\Sg(x)+g(vj)SVj1,xSw(x)=i)xSf(x)xV\Sg(x)=vjSVj,xSw(x)=ixSf(x)xV\Sg(x)+vj∉SVj,xSw(x)=ixSf(x)xV\Sg(x)=SVj,xSw(x)=ixSf(x)xV\Sg(x)\displaystyle \begin{aligned} f(v_j)D(i-w(v_j),j-1) + g(v_j)D(i,j-1) &= f(v_j) \sum_{S\subset V_{j-1}, \sum_{x\in S} w(x) = i-w(v_j)} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x) + g(v_j) \sum_{S\subset V_{j-1}, \sum_{x\in S} w(x) = i)} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x) \\ &= \sum_{v_j\in S\subset V_j, \sum_{x\in S} w(x) = i} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x) + \sum_{v_j\not\in S\subset V_j, \sum_{x\in S} w(x) = i} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x)\\ &= \sum_{S\subset V_j, \sum_{x\in S} w(x) = i} \prod_{x\in S} f(x) \prod_{x\in V\backslash S} g(x) \end{aligned}

Posted by Chao Xu on .
Tags: algorithm.