Word break with cost
Given a set of strings , a cost function and a string . Find elements such that , and is minimized.
This problem is a generalization of the word break problem on leetcode. Many algorithms you see online assumes that string in has constant length, checking the hash table takes time, and obtain an time algorithm. It is not as easy. Here we show an algorithm that considers the strings in have arbitrary length.
Consider the following graph , where , and there is an edge from and , if , and the label of the edge is the string , and the cost is . Let be number of substrings in matches some element in . The graph has edges. Note . Indeed, the sum of the length of the labels of all outgoing edges cannot be more than , and the length of each label is different. Hence each vertex can have at most outgoing edges. The graph is a DAG, so we can find the shortest path from to in linear time with respect to the number of edges. This shows if we can compute the graph in time, then we solve the problem in time.
We can build the Aho–Corasick automaton for in time. It can be used to find all substrings of that matches something in by traversing the automaton once. The running time is the total number of substrings matched, which is . Hence building the graph takes time. is clearly no more than , where . Also, it is also clear . Indeed, there can be at most edges start from , since each edge has a label of different length, and sum of those length labels is no larger than .
If we only want to know if there exists a solution, then there is a time algorithm . The algorithm is close to optimal assuming the algorithm is combinatorial and the alphabet can be arbitrarily large.
Can we obtain similar running time for the word break with cost problem? There are evidence against it. If the alphabet is unary, this problem is equivalent to the unbounded knapsack problem, which likely does not have an algorithm with running time for any  and can be as large as . Of course, this does not mean there might not be a time algorithm, since the reduction involved in the paper might not hold when we require .
 K. Bringmann, A. Grønlund, K.G. Larsen, A dichotomy for regular expression membership testing, in: 2017 Ieee 58th Annual Symposium on Foundations of Computer Science (Focs), 2017: pp. 307–318 10.1109/FOCS.2017.36.
 M. Cygan, M. Mucha, K. Węgrzycki, M. W, On problems equivalent to (min,+)-convolution, ACM Trans. Algorithms. 15 (2019) 14:1–14:25 10.1145/3293465.