# Represent an element in a free monoid with minimum weight

Consider a rank k free monoid (M,\cdot) with free generators G. Sometimes there are ways to express them by writing a little less than write the whole string of generators. We can group some generators by powers. For example, aababababaaaa = a(ab)^4a^4.

Find the shortest way to write down an element in a free monoid.

There are problems on how long are the parentheses, exponents etc. Therefore we generalize it to allow weight to those operations.

Formally. For any free monoid M with free generators G, we can construct another free monoid (M^*,\cdot),

- a\in G \implies Atom(a)\in M^*.
- a\in M^*, n\in\N, then Power(a,n) \in M^*.

Consider a homomorphism w:M^*\to \N. Such that for all n, it satisfy the following criteria:

- w(a)\leq w(b) \implies w(Power(a,n))\leq w(Power(b,n)),
- w(a)\leq w(Power(a,1)).

w is a weight function.

Let f:M^*\to M, such that

- f(ab) = f(a)f(b),
- f(Atom(a)) = a,
- f(Power(a,n)) = a^n.

Given a\in M, we want to find a'\in M^*, such that f(a') = a and w(a') is minimized.

The input is a_1\ldots a_n.

Let D(i,j) represent the minimum weight representation for a_i\ldots a_j. Let P(i,j) represent the set of all possible Power(x,k), such that f(Power(x,k)) = a_i\ldots a_j for some k\neq 1.

\displaystyle \begin{aligned} D(i,i) &= a_i\\ D(i,j) &= \min(P(i,j)\cup \{ D(i,k)+D(k+1,j)| i\leq k\leq j-1\}) \end{aligned}

Here \min return any of the expressions that achieves the minimum weight. This allows a O(n^3) algorithm if one uses suffix tree for finding P(i,j). One can naively try all possible Power(x,k) instead, where k|n.

Here is an Haskell code for it. It is designed to show the algorithm instead of been efficient. This has real life usage to compress regular expressions.