The Art Gallery Guardian

Represent an element in a free monoid with minimum weight

Consider a rank kk free monoid (M,)(M,\cdot) with free generators GG. Sometimes there are ways to express them by writing a little less than write the whole string of generators. We can group some generators by powers. For example, aababababaaaa=a(ab)4a4aababababaaaa = a(ab)^4a^4.


Find the shortest way to write down an element in a free monoid.

There are problems on how long are the parentheses, exponents etc. Therefore we generalize it to allow weight to those operations.

Formally. For any free monoid MM with free generators GG, we can construct another free monoid (M,)(M^*,\cdot),

  1. aG    Atom(a)Ma\in G \implies Atom(a)\in M^*.
  2. aMa\in M^*, nNn\in\N, then Power(a,n)MPower(a,n) \in M^*.

Consider a homomorphism w:MNw:M^*\to \N. Such that for all nn, it satisfy the following criteria:

  1. w(a)w(b)    w(Power(a,n))w(Power(b,n))w(a)\leq w(b) \implies w(Power(a,n))\leq w(Power(b,n)),
  2. w(a)w(Power(a,1))w(a)\leq w(Power(a,1)).

ww is a weight function.

Let f:MMf:M^*\to M, such that

  • f(ab)=f(a)f(b)f(ab) = f(a)f(b),
  • f(Atom(a))=af(Atom(a)) = a,
  • f(Power(a,n))=anf(Power(a,n)) = a^n.

Given aMa\in M, we want to find aMa'\in M^*, such that f(a)=af(a') = a and w(a)w(a') is minimized.

The input is a1ana_1\ldots a_n.

Let D(i,j)D(i,j) represent the minimum weight representation for aiaja_i\ldots a_j. Let P(i,j)P(i,j) represent the set of all possible Power(x,k)Power(x,k), such that f(Power(x,k))=aiajf(Power(x,k)) = a_i\ldots a_j for some k1k\neq 1.

D(i,i)=aiD(i,j)=min(P(i,j){D(i,k)+D(k+1,j)ikj1})\begin{aligned} D(i,i) &= a_i\\ D(i,j) &= \min(P(i,j)\cup \{ D(i,k)+D(k+1,j)| i\leq k\leq j-1\}) \end{aligned}

Here min\min return any of the expressions that achieves the minimum weight. This allows a O(n3)O(n^3) algorithm if one uses suffix tree for finding P(i,j)P(i,j). One can naively try all possible Power(x,k)Power(x,k) instead, where knk|n.

Here is an Haskell code for it. It is designed to show the algorithm instead of been efficient. This has real life usage to compress regular expressions.

Posted by Chao Xu on .
Tags: Haskell, monoid.