Sum over products of weighted subsets of a certain size

Consider a commutative semiring $(R, +, \cdot)$, where $0$ is the identity for $(R, +)$ and $1$ is the identity for $(R, \cdot)$. Let $V$ be a finite set, $f, g : V \to R$, $w : V \to N$ and $Z \subset N$. We are often interested in computing expressions of the following form.

$\sum_{S \subset V, \sum_{x \in S} w(x) \in Z} \prod_{x \in S} f(x) \prod_{x \in V \setminus S} g(x)$

Examples:

If $w(x) = 1$ for all $x$, $f(x)$ is the probability that event $x$ occurs, and $g = 1 - f$, then we find the probability that the number of events that occur is some $t \in Z$ (assuming the events are independent). In probability, this amounts to evaluating the Poisson binomial distribution.
If $(R, +, \cdot) = (N, +, \cdot)$, $V \subset N$, $f = g = 1$, $w(x) = x$ for all $x$, and $Z = \{t\}$, then we find the number of subsets of $V$ whose elements sum to $t$.
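If $(R, +, \cdot) = (N \cup \{-\infty\}, \max, +)$, where $-\infty$ is adjoined as the identity for $\max$, $V \subset N$, $g = 0$ (the number $0$, which is the multiplicative identity of this semiring) and $Z = \{0, \dots, W\}$, then this solves the knapsack problem with knapsack size $W$, value $f$ and cost $w$.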
An actual application inspired this post: an automated test suite runs $n$ subtests and is allowed to rerun a subtest if it fails the first time. A subtest passes if either its first run or its rerun passes. The test is successful if all the subtests pass and the total number of reruns is at most $k$. Assume the probability of passing is independent for each subtest. One wants to estimate the probability of a successful test, given the probability that a run passes for each specific subtest.
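
One way to cast the test suite example in this framework is sketched below. It assumes that every run of subtest $x$, including the rerun, passes independently with probability $p_x$; the symbol $p_x$ is introduced here for illustration. Let $S$ be the set of subtests that needed a rerun and still passed. Then we can take

$w(x) = 1, \quad f(x) = (1 - p_x) p_x, \quad g(x) = p_x, \quad Z = \{0, \dots, k\}$

Here $f(x)$ is the probability that $x$ fails its first run and passes the rerun, and $g(x)$ is the probability that $x$ passes on its first run. Under these assumptions, the sum is the probability that all subtests pass using at most $k$ reruns.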

Let $\max Z = k$ and $|V| = n$. The naive algorithm enumerates all subsets and runs in $O(n 2^{n})$ time (assuming each semiring operation takes $O(1)$ time). A common transformation turns this sum over all subsets into a sum over $Z$, by grouping the subsets according to their total weight. The resulting algorithm runs in $O(nk)$ time.
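
For reference, the naive algorithm can be sketched as a direct enumeration of all subsets. This is only an illustration, assuming ordinary Python numbers as the semiring; the function name `brute_force` is made up for this sketch.

```python
from itertools import combinations


def brute_force(V, f, g, w, Z):
    """Evaluate the sum by enumerating every subset S of V.

    Assumes ordinary number arithmetic as the semiring.
    Takes O(n 2^n) arithmetic operations for n = len(V).
    """
    V = list(V)
    Z = set(Z)
    n = len(V)
    total = 0
    for r in range(n + 1):
        # Enumerate subsets by index so each element is decided exactly once.
        for idx in combinations(range(n), r):
            if sum(w(V[i]) for i in idx) not in Z:
                continue
            chosen = set(idx)
            term = 1
            for i in range(n):
                term *= f(V[i]) if i in chosen else g(V[i])
            total += term
    return total


# Number of subsets of {1, 2, 3, 4} with element sum 5: {1, 4} and {2, 3}.
print(brute_force([1, 2, 3, 4], lambda x: 1, lambda x: 1, lambda x: x, {5}))  # → 2
```

Such a routine is mostly useful as a correctness check for faster methods on small inputs.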

Let $V = \{v_{1}, \dots, v_{n}\}$ and $V_{j} = \{v_{1}, \dots, v_{j}\}$. Define $D(i, j) = \sum_{S \subset V_{j}, \sum_{x \in S} w(x) = i} \prod_{x \in S} f(x) \prod_{x \in V_{j} \setminus S} g(x)$.

Certainly, $\sum_{S \subset V, \sum_{x \in S} w(x) \in Z} \prod_{x \in S} f(x) \prod_{x \in V \setminus S} g(x) = \sum_{i \in Z} D(i, n)$

Once all $D(i, n)$ for $0 \leq i \leq k$ are computed, the final sum costs only $O(k)$ additional semiring operations.

Let $[P]$ be the Iverson bracket notation, namely

$[P] = \begin{cases} 1 & \text{if } P \text{ is true;} \\ 0 & \text{otherwise.} \end{cases}$

Theorem 1

$D(i, 0) = [i = 0]$
For $j \geq 1$ , $D (i, j) = [i \geq w (v_{j})] f (v_{j}) D (i - w (v_{j}), j - 1) + g (v_{j}) D (i, j - 1)$ .
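
The recurrence translates into a short dynamic program. The following is a minimal sketch, not a definitive implementation: the function name `weighted_subset_sum` and the way the semiring is passed in (`add`, `mul`, `zero`, `one`) are choices made for this illustration. It assumes $Z$ is non-empty and the weights are non-negative integers.

```python
import operator


def weighted_subset_sum(V, f, g, w, Z,
                        add=operator.add, mul=operator.mul, zero=0, one=1):
    """Sum D(i, n) over i in Z, using O(nk) semiring operations, k = max(Z).

    The semiring is given by (add, mul, zero, one); the defaults are
    ordinary number arithmetic.
    """
    Z = sorted(set(Z))
    k = Z[-1]
    # Base case: D(i, 0) = [i = 0].
    D = [one] + [zero] * k
    for v in V:
        wv, fv, gv = w(v), f(v), g(v)
        new = []
        for i in range(k + 1):
            val = mul(gv, D[i])          # subsets that do not contain v
            if i >= wv:                  # the Iverson bracket [i >= w(v)]
                val = add(mul(fv, D[i - wv]), val)
            new.append(val)
        D = new
    total = zero
    for i in Z:
        total = add(total, D[i])
    return total


# Number of subsets of {1, 2, 3, 4} with element sum 5.
print(weighted_subset_sum([1, 2, 3, 4], lambda x: 1, lambda x: 1,
                          lambda x: x, {5}))  # → 2

# Probability that exactly one of two independent fair events occurs.
p = {"a": 0.5, "b": 0.5}
print(weighted_subset_sum(p, lambda x: p[x], lambda x: 1 - p[x],
                          lambda x: 1, {1}))  # → 0.5

# Knapsack in the (max, +) semiring; items are hypothetical (value, cost) pairs.
items = {"a": (60, 1), "b": (100, 2), "c": (120, 3)}
print(weighted_subset_sum(items, lambda x: items[x][0], lambda x: 0,
                          lambda x: items[x][1], range(6),
                          add=max, mul=operator.add,
                          zero=float("-inf"), one=0))  # → 220
```

Only the previous column $D(\cdot, j - 1)$ is kept, so the sketch uses $O(k)$ space. For the knapsack case it takes $-\infty$ as the additive identity, so that infeasible weights never contribute to a maximum.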

Proof

The base case is easy to verify. We show the inductive step in the case $i \geq w(v_{j})$, where the Iverson bracket equals $1$.

$\begin{aligned}
& f(v_{j}) D(i - w(v_{j}), j - 1) + g(v_{j}) D(i, j - 1) \\
&= f(v_{j}) \sum_{S \subset V_{j - 1}, \sum_{x \in S} w(x) = i - w(v_{j})} \prod_{x \in S} f(x) \prod_{x \in V_{j - 1} \setminus S} g(x) + g(v_{j}) \sum_{S \subset V_{j - 1}, \sum_{x \in S} w(x) = i} \prod_{x \in S} f(x) \prod_{x \in V_{j - 1} \setminus S} g(x) \\
&= \sum_{v_{j} \in S \subset V_{j}, \sum_{x \in S} w(x) = i} \prod_{x \in S} f(x) \prod_{x \in V_{j} \setminus S} g(x) + \sum_{v_{j} \notin S \subset V_{j}, \sum_{x \in S} w(x) = i} \prod_{x \in S} f(x) \prod_{x \in V_{j} \setminus S} g(x) \\
&= \sum_{S \subset V_{j}, \sum_{x \in S} w(x) = i} \prod_{x \in S} f(x) \prod_{x \in V_{j} \setminus S} g(x)
\end{aligned}$

Posted by Chao Xu on 2014-08-11.

Tags: algorithm.